
Group: Cluster analysis
See also: tree agglom

Function: kmeans
Description: performs cluster analysis, i.e. computes a partition of n row points into K clusters.

Link:
Usage: cm = kmeans (x, b, it, {w, {m}})
Input:
x n x p data matrix
b n x 1 matrix: initial partition (for example, randomly generated cluster numbers 1, 2, ..., K)
it maximal number of iterations
w p x 1 matrix of weights of the column points (optional)
m n x 1 matrix of weights (masses) of the row points (optional; can only be given together with w)
Output:
cm.g n x 1 matrix containing the final partition, which gives a (local) minimum of the sum of within-cluster variances
cm.c K x p matrix of means (centroids) of the K clusters
cm.v K x p matrix of within-cluster variances divided by the weight (mass) of the clusters
cm.s K x 1 matrix of the weights (masses) of the clusters
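
The inputs and outputs listed above can be illustrated with a minimal Python/NumPy sketch. It is not the XploRe implementation: the function name kmeans_sketch is hypothetical, and it assumes a Lloyd-style iteration, a squared Euclidean distance whose columns are weighted by w, centroids computed as mass-weighted means using m, and v computed as the mass-weighted sum of squared deviations divided by the cluster mass. The number of clusters K is taken to be the largest label in b.

```python
import numpy as np

def kmeans_sketch(x, b, it, w=None, m=None):
    """Illustrative k-means started from an initial partition b (labels 1..K).

    Returns (g, c, v, s): final partition, centroids, within-cluster
    variances, and cluster masses. All weighting conventions are assumptions.
    """
    x = np.asarray(x, dtype=float)
    n, p = x.shape
    g = np.asarray(b, dtype=int).ravel().copy()
    K = int(g.max())  # assumption: K is the largest label in b
    w = np.ones(p) if w is None else np.asarray(w, dtype=float).ravel()
    m = np.ones(n) if m is None else np.asarray(m, dtype=float).ravel()

    c = np.zeros((K, p))
    for _ in range(it):
        # Mass-weighted centroid per cluster; an empty cluster keeps its
        # previous centroid (the origin if it was empty from the start).
        for k in range(K):
            idx = g == k + 1
            if idx.any():
                c[k] = np.average(x[idx], axis=0, weights=m[idx])
        # Column-weighted squared distance of every point to every centroid.
        d = (((x[:, None, :] - c[None, :, :]) ** 2) * w).sum(axis=2)
        new_g = d.argmin(axis=1) + 1
        if np.array_equal(new_g, g):
            break  # partition is stable
        g = new_g

    # Summary statistics for the final partition.
    s = np.zeros(K)
    v = np.zeros((K, p))
    for k in range(K):
        idx = g == k + 1
        s[k] = m[idx].sum()
        if s[k] > 0:
            c[k] = np.average(x[idx], axis=0, weights=m[idx])
            v[k] = np.average((x[idx] - c[k]) ** 2, axis=0, weights=m[idx])
    return g, c, v, s

# Usage: three shifted copies of the same normal sample, random start partition.
rng = np.random.default_rng(0)
x0 = rng.standard_normal((100, 4))
x = np.vstack([x0 - [2, 1, 3, 0], x0 + [1, 1, 3, 1], x0 + [0, 0, 1, 5]])
b = rng.integers(1, 4, size=x.shape[0])
g, c, v, s = kmeans_sketch(x, b, 100)
print(c.shape, v.shape, s)
```

Because the iteration only ever lowers the within-cluster sum of squares, the result depends on the initial partition b and is in general a local optimum.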

Note:

Example:
; set the seed of the random generator
randomize(0)
; generate some data
x  = normal(100, 4)
; generate first cluster
x1 = x - #(2,1,3,0)'
; generate second cluster
x2 = x + #(1,1,3,1)'
; generate third cluster
x3 = x + #(0,0,1,5)'
; make a data set with 3 clusters
x  = x1|x2|x3
; generate a random partition with 3 clusters
b  = ceil(uniform(rows(x)).*3)
; apply k-means clustering to the data
{g, c, v, s} = kmeans(x, b, 100)
; show the start partition and the final partition
b~g

Result:
Shows the start partition (first column) and the final partition (second column) of the data into 3 clusters.

Contents of _tmp
[  1,]        1        2
[  2,]        3        2
[  3,]        1        2
[  4,]        3        2
[  5,]        3        2
[  6,]        2        2
[  7,]        3        2
[  8,]        2        2
[  9,]        2        2
[ 10,]        3        2
[ 11,]        1        2
[ 12,]        2        2
[ 13,]        2        2
[ 14,]        3        2
[ 15,]        2        2
...
[286,]        2        1
[287,]        1        1
[288,]        1        1
[289,]        3        1
[290,]        2        1
[291,]        2        1
[292,]        3        1
[293,]        2        1
[294,]        3        1
[295,]        3        1
[296,]        3        1
[297,]        1        1
[298,]        2        1
[299,]        1        1
[300,]        2        1
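
A two-column listing of 300 rows is hard to scan, so one way to summarize it is a K x K cross-tabulation of start labels against final labels. The Python/NumPy sketch below is only an illustration: the helper name crosstab is hypothetical and is not an XploRe function. It is applied to the first five rows of the listing above.

```python
import numpy as np

def crosstab(b, g, K):
    """Count points by start cluster (row) and final cluster (column)."""
    b = np.asarray(b, dtype=int).ravel()
    g = np.asarray(g, dtype=int).ravel()
    table = np.zeros((K, K), dtype=int)
    # Labels run from 1 to K, array indices from 0 to K-1.
    np.add.at(table, (b - 1, g - 1), 1)
    return table

# First five rows of the listing: start labels 1, 3, 1, 3, 3; final label 2.
print(crosstab([1, 3, 1, 3, 3], [2, 2, 2, 2, 2], K=3))
```

A random start partition spreads its counts over each row of the table, while each column total gives the size of a final cluster.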



(C) MD*TECH Method and Data Technologies, 05.02.2006