Statistics Toolbox
Silhouette plot for clustered data
Syntax
silhouette(X,clust)
s = silhouette(X,clust)
[s,h] = silhouette(X,clust)
[...] = silhouette(X,clust,distance)
[...] = silhouette(X,clust,distfun,p1,p2,...)
Description
silhouette(X,clust) plots cluster silhouettes for the n-by-p data matrix X, with clusters defined by clust. Rows of X correspond to points, and columns correspond to coordinates. clust can be a numeric vector containing a cluster index for each point, or a character matrix or cell array of strings containing a cluster name for each point. silhouette treats NaNs or empty strings in clust as missing values and ignores the corresponding rows of X. By default, silhouette uses the squared Euclidean distance between points in X.
s = silhouette(X,clust) returns the silhouette values in the n-by-1 vector s, but does not plot the cluster silhouettes.
[s,h] = silhouette(X,clust) plots the silhouettes, and returns the silhouette values in the n-by-1 vector s and the figure handle in h.
[...] = silhouette(X,clust,distance) plots the silhouettes using the inter-point distance measure specified in distance. Choices for distance are:
[...] = silhouette(X,clust,distfun,p1,p2,...) accepts a distance function of the form
d = distfun(X0,X,p1,p2,...)
where X0 is a 1-by-p point, X is an n-by-p matrix of points, and p1,p2,... are optional additional arguments. The function distfun returns an n-by-1 vector d of distances between X0 and each point (row) in X. The arguments p1,p2,... are passed directly to the function distfun.
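As a minimal sketch of a function with the (X0, X, p1, ...) argument layout described above, the following defines a weighted squared Euclidean distance. The name weightedSqDist, the weight vector w, and the sample points are hypothetical and chosen only for illustration; the sketch assumes the function is supplied as a function handle.

```matlab
% Hypothetical distance function (not part of the toolbox):
% weighted squared Euclidean distance.
%   X0 : 1-by-p point
%   X  : n-by-p matrix of points
%   w  : 1-by-p weight vector, playing the role of the extra argument p1
weightedSqDist = @(X0, X, w) ...
    sum(bsxfun(@times, bsxfun(@minus, X, X0).^2, w), 2);

% Illustrative inputs
X0 = [0 0];
X  = [1 0; 0 2; 3 4];
w  = [1 0.5];

% d is an n-by-1 vector: one distance from X0 to each row of X
d = weightedSqDist(X0, X, w);
disp(d')
```

Under the syntax above, such a handle would be passed along with its extra argument, as in silhouette(X,clust,weightedSqDist,w).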
Remarks
The silhouette value for each point is a measure of how similar that point is to points in its own cluster compared to points in other clusters, and ranges from -1 to +1. It is defined as
$$ s(i) = \frac{\min_k b(i,k) - a(i)}{\max\left(a(i),\ \min_k b(i,k)\right)} $$
where a(i) is the average distance from the ith point to the other points in its cluster, and b(i,k) is the average distance from the ith point to points in another cluster k.
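The quantities a(i) and b(i,k) can be computed by hand on a small data set. The sketch below does so without calling silhouette, using the squared Euclidean distance; the data values are made up for illustration, and the code assumes every cluster contains at least two points. It is not the toolbox's implementation.

```matlab
% Toy data (illustrative values): two tight groups on a line
X     = [0; 1; 10; 11];
clust = [1; 1; 2; 2];

n      = size(X,1);
labels = unique(clust);

% Pairwise squared Euclidean distances between all points
D = zeros(n);
for i = 1:n
    diffs  = bsxfun(@minus, X, X(i,:));
    D(i,:) = sum(diffs.^2, 2)';
end

s = zeros(n,1);
for i = 1:n
    % a(i): average distance to the other points in the same cluster
    own    = (clust == clust(i));
    own(i) = false;                       % exclude the point itself
    a      = mean(D(i,own));

    % smallest b(i,k): average distance to the nearest other cluster
    b = inf;
    for j = 1:numel(labels)
        if labels(j) ~= clust(i)
            b = min(b, mean(D(i, clust == labels(j))));
        end
    end

    % silhouette value: (min_k b(i,k) - a(i)) / max(a(i), min_k b(i,k))
    s(i) = (b - a) / max(a, b);
end
disp(s')
```

Because the two groups are far apart relative to their spread, all four values come out close to +1 (about 0.99), indicating well-separated clusters.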
Examples
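X = [randn(10,2)+ones(10,2); randn(10,2)-ones(10,2)];   % two groups of 10 points centered at (1,1) and (-1,-1)
cidx = kmeans(X,2,'distance','sqeuclid');                % cluster into 2 groups
s = silhouette(X,cidx,'sqeuclid');                       % silhouette values, same distance measure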
See Also
dendrogram, kmeans, linkage, pdist
References
[1] Kaufman L. and P.J. Rousseeuw, Finding Groups in Data: An Introduction to Cluster Analysis, Wiley, 1990