XploRe Help : agglom

Group:	Cluster analysis
See also:	tree kmeans

Function:		agglom
Description:		performs hierarchical cluster analysis.

Usage:	`cagg = agglom (d, method, no{, opt})`
Input:
	`d`	n x 1 vector or l x l matrix of distances
	`method`	string, one of: "WARD", "SINGLE", "COMPLETE", "MEAN_LINK", "MEDIAN_LINK", "AVERAGE", "CENTROID" or "LANCE".
	`no`	scalar, number of clusters
	`opt`	optional argument for some methods - see note below
Output:
	`cagg.p`	l x 1 matrix with partition numbers (1,2,...)
	`cagg.t`	p x 2 matrix with the dendrogram for no clusters
	`cagg.g`	p x 2 matrix with the dendrogram for all l clusters
	`cagg.pd`	l x 1 matrix with partition numbers (1,2,...)
	`cagg.d`	no x (no-1)/2 matrix with distances between the cluster centers

Note:

The options (linkage strategies) for method are: WARD, SINGLE, COMPLETE, MEAN_LINK, MEDIAN_LINK, AVERAGE, CENTROID, LANCE.

The optional parameter opt is either a scalar parameter (beta) for the LANCE method or an (l x 1) vector of weights for the methods AVERAGE, CENTROID and WARD. For all other methods, specifying an optional parameter results in an error.
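
As a minimal sketch of how opt might be passed, the calls below assume that d is already an l x l distance matrix, such as the one built in the Example section. The beta value -0.25 and the unit weights are illustrative assumptions, not documented defaults, and the one-argument form matrix(n) is assumed to return an n x 1 vector of ones.

```
; assumption: d is an l x l distance matrix (see the Example section)
; LANCE takes a scalar beta as opt; -0.25 is only an illustrative value
t1 = agglom(d, "LANCE", 3, -0.25)
t1.p
; WARD, AVERAGE and CENTROID take an (l x 1) weight vector as opt
; assumption: matrix(rows(d)) gives an l x 1 vector of ones (equal weights)
w = matrix(rows(d))
t2 = agglom(d, "WARD", 3, w)
t2.p
```

According to the note above, passing a fourth argument with SINGLE, COMPLETE, MEAN_LINK or MEDIAN_LINK should produce an error.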

Example:

proc()=main()
  ; load the swiss banknote data
  x=read("bank2")
  ; compute the euclidean distance between banknotes
  i=0
  d=0.*matrix(rows(x),rows(x))
  while(i.<cols(x))
    i = i+1
    d = d+(x[,i] - x[,i]')^2
  endo
  d = sqrt(d)
  ; use the WARD method to cluster the data
  t = agglom(d, "WARD", 3)
  t.p
endp
;
main()

Result:

//gives the partition of the data into 3 clusters
Contents of p
[  1,]        1
[  2,]        1
[  3,]        1
[  4,]        1
[  5,]        1
[  6,]        1
[  7,]        1
...
[ 98,]        1
[ 99,]        1
[100,]        1
[101,]        2
[102,]        2
[103,]        2
[104,]        2
[105,]        3
[106,]        2
[107,]        2
...
[194,]        2
[195,]        3
[196,]        2
[197,]        2
[198,]        2
[199,]        2
[200,]        2