Statistics Toolbox
Syntax
T = clusterdata(X, cutoff)
T = clusterdata(X,'param1',val1,'param2',val2,...)
Description
T = clusterdata(X, cutoff) uses the pdist, linkage, and cluster functions to construct clusters from data X. X is an m-by-n matrix, treated as m observations of n variables. cutoff is a threshold for cutting the hierarchical tree generated by linkage into clusters. When 0 < cutoff < 2, clusterdata forms clusters when inconsistent values are greater than cutoff (see the inconsistent function). When cutoff is an integer and cutoff >= 2, clusterdata interprets cutoff as the maximum number of clusters to keep in the hierarchical tree generated by linkage. The output T is a vector of size m containing a cluster number for each observation.
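The two interpretations of cutoff described above can be illustrated with a minimal sketch. The sample data and the variable names T1 and T2 are assumptions made for illustration and are not part of the original page.

```matlab
% Hypothetical sample data: 30 observations of 3 variables,
% built as three shifted groups of 10 points each.
X = [rand(10,3); rand(10,3)+1.2; rand(10,3)+2.5];

% 0 < cutoff < 2: cutoff is treated as a threshold on inconsistent values.
T1 = clusterdata(X, 1.0);

% Integer cutoff >= 2: cutoff is treated as the maximum number of clusters.
T2 = clusterdata(X, 3);

% Both outputs hold one cluster number per observation (row of X).
nObservations = numel(T1);
nClusters     = numel(unique(T2));
```

With the threshold form, the number of clusters depends on the data; with the integer form, it is capped at the value given.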
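T = clusterdata(X,cutoff) is the same as calling pdist on X, passing the result to linkage, and cutting the resulting tree with cluster at cutoff.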
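As a rough sketch of that equivalent sequence: the snippet below assumes Euclidean distance and single linkage, which are not stated in this text, and it uses the 'maxclust' form of cluster for clarity. The sample data is hypothetical.

```matlab
% Hypothetical sample data: 30 observations of 3 variables.
X = [rand(10,3); rand(10,3)+1.2; rand(10,3)+2.5];

% Step 1: pairwise distances between observations
% (Euclidean metric is an assumption, not taken from the text above).
Y = pdist(X, 'euclidean');

% Step 2: hierarchical cluster tree
% (single linkage is an assumption, not taken from the text above).
Z = linkage(Y, 'single');

% Step 3: cut the tree, keeping at most 3 clusters.
T = cluster(Z, 'maxclust', 3);
```

Calling the three functions separately is useful when the intermediate results, such as the tree Z, are needed for other purposes.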
T = clusterdata(X,'param1',val1,'param2',val2,...)
provides more control over the clustering through a set of parameter/value pairs. Valid parameters are:
'distance' | Any of the distance metric names allowed by pdist (follow the 'minkowski' option by the value of the exponent p). |
'linkage' | Any of the linkage methods allowed by the linkage function |
'cutoff' | Cutoff for inconsistent or distance measure |
'maxclust' | Maximum number of clusters to form |
'criterion' | Either 'inconsistent' or 'distance' |
'depth' | Depth for computing inconsistent values |
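A short sketch of how these parameter/value pairs might be combined. The specific metric and linkage names ('cityblock', 'average', 'complete'), the cutoff value, and the sample data are illustrative choices, not recommendations from the original page.

```matlab
% Hypothetical sample data: 30 observations of 3 variables.
X = [rand(10,3); rand(10,3)+1.2; rand(10,3)+2.5];

% City block distance with average linkage, keeping at most 3 clusters.
Ta = clusterdata(X, 'distance', 'cityblock', ...
                    'linkage',  'average', ...
                    'maxclust', 3);

% Complete linkage, cutting the tree where the linkage distance
% exceeds 1.5 (the 'distance' criterion instead of 'inconsistent').
Tb = clusterdata(X, 'linkage',   'complete', ...
                    'criterion', 'distance', ...
                    'cutoff',    1.5);
```

In this sketch each call uses either 'maxclust' or 'cutoff', not both, since each one specifies a different way of cutting the tree.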
Example
The example first creates a sample dataset of random numbers. It then uses clusterdata to compute the distances between items in the dataset and create a hierarchical cluster tree from the dataset. Finally, the clusterdata function groups the items in the dataset into three clusters. The example uses the find function to list all the items in cluster 2.
    rand('seed',12);
    X = [rand(10,3); rand(10,3)+1.2; rand(10,3)+2.5];
    T = clusterdata(X,'maxclust',3);
    find(T==2)

    ans =
        21
        22
        23
        24
        25
        26
        27
        28
        29
        30
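To inspect every cluster rather than only cluster 2, a loop over the cluster numbers is one option. This is a minimal sketch; the variable name members and the printed format are assumptions, and no seed is set, so the cluster numbering can differ from the output shown above.

```matlab
% Same construction as the example, without fixing the random seed.
X = [rand(10,3); rand(10,3)+1.2; rand(10,3)+2.5];
T = clusterdata(X, 'maxclust', 3);

% List the observation indices that belong to each cluster.
for k = 1:max(T)
    members = find(T == k)';   % row vector of indices in cluster k
    fprintf('Cluster %d: %s\n', k, mat2str(members));
end
```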
See Also
cluster, inconsistent, kmeans, linkage, pdist