Statistics Toolbox    
cmdscale

Classical multidimensional scaling

Syntax

Description

Y = cmdscale(D) takes an n-by-n distance matrix D, and returns an n-by-p configuration matrix Y. Rows of Y are the coordinates of n points in p-dimensional space for some p < n. When D is a Euclidean distance matrix, the distances between those points are given by D. p is the dimension of the smallest space in which the n points whose interpoint distances are given by D can be embedded.

[Y,e] = cmdscale(D) also returns the eigenvalues of Y*Y'. When D is Euclidean, the first p elements of e are positive, the rest zero. If the first k elements of e are much larger than the remaining (n-k), then you can use the first k columns of Y as k-dimensional points whose interpoint distances approximate D. This can provide a useful dimension reduction for visualization, e.g., for k = 2.

D need not be a Euclidean distance matrix. If it is non-Euclidean or a more general dissimilarity matrix, then some elements of e are negative, and cmdscale choses p as the number of positive eigenvalues. In this case, the reduction to p or fewer dimensions provides a reasonable approximation to D only if the negative elements of e are small in magnitude.

You can specify D as either a full dissimilarity matrix, or in upper triangle vector form such as is output by pdist. A full dissimilarity matrix must be real and symmetric, and have zeros along the diagonal and positive elements everywhere else. A dissimilarity matrix in upper triangle form must have real, positive entries. You can also specify D as a full similarity matrix, with ones along the diagonal and all other elements less than one. cmdscale tranforms a similarity matrix to a dissimilarity matrix in such a way that distances between the points returned in Y equal or approximate sqrt(1-D). To use a different transformation, you must transform the similarities prior to calling cmdscale.

Examples

Generate some points in 4-dimensional space, but "close" to 3-dimensional space, then reduce them to distances only.

Find a configuration with those inter-point distances.

See Also

pdist, procrustes

References

[1]  Seber, G.A.F., Multivariate Observations, Wiley, 1984


  clusterdata combnk