classify (Statistics Toolbox)

Discriminant Analysis

Syntax

class = classify(sample,training,group)
[class,err] = classify(...)
[...] = classify(...,'type')
[...] = classify(...,'type',prior)

Description

class = classify(sample,training,group) classifies each row of the data in sample into one of the groups in training. sample and training must be matrices with the same number of columns. group is a grouping variable for training. Its unique values define groups, and each element defines which group the corresponding row of training belongs to. group can be a numeric vector, a string array, or a cell array of strings. training and group must have the same number of rows. classify treats NaNs or empty strings in group as missing values, and ignores the corresponding rows of training. class indicates which group each row of sample has been assigned to, and is of the same type as group.
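As a minimal sketch of the basic call, the following uses small made-up data (the values and the variable name names are illustrative assumptions, not part of the toolbox) to show a numeric group, a cell array of strings as group, and an empty string treated as a missing label.

```matlab
% Two well-separated groups in two dimensions (made-up data).
training = [1.0 1.1; 1.2 0.9; 0.8 1.0; 1.1 1.3; ...
            5.0 5.2; 5.1 4.8; 4.9 5.0; 5.3 5.1];
group    = [1; 1; 1; 1; 2; 2; 2; 2];    % one label per row of training
sample   = [1.0 1.0; 5.0 5.0];           % rows to classify

% Numeric group: class is numeric, one entry per row of sample.
class = classify(sample,training,group);

% Cell array of strings as group: '' marks a missing label, so the
% last row of training is ignored when the discriminant is fit.
names = {'low';'low';'low';'low';'high';'high';'high';''};
classNames = classify(sample,training,names);   % cell array of strings
```

With data this cleanly separated, the first row of sample should be assigned to group 1 ('low') and the second to group 2 ('high').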

[class,err] = classify(...) also returns an estimate of the misclassification error rate. classify returns the apparent error rate, i.e., the percentage of observations in training that are misclassified.
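The apparent error rate can be approximated by hand by classifying the training rows themselves and counting disagreements. The sketch below assumes made-up data with equal group sizes and default priors; the variable names resub and apparent are illustrative. Releases that weight err by the prior probabilities may differ from this simple count when group sizes are unequal.

```matlab
% Made-up training data: two groups of four observations each.
training = [1.0 1.1; 1.2 0.9; 0.8 1.0; 1.1 1.3; ...
            5.0 5.2; 5.1 4.8; 4.9 5.0; 5.3 5.1];
group    = [1; 1; 1; 1; 2; 2; 2; 2];
sample   = [1.0 1.0; 5.0 5.0];

% Second output: estimated misclassification error rate.
[class,err] = classify(sample,training,group);

% Hand-computed counterpart: classify the training rows themselves
% and measure how often the assigned group disagrees with the label.
resub    = classify(training,training,group);
apparent = mean(resub ~= group);
```

Because the two groups here do not overlap, both err and apparent should come out as zero.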

[...] = classify(...,'type') allows you to specify the type of discriminant function, as one of:

'linear'
Fits a multivariate normal density to each group, with a pooled estimate of covariance (default).

'quadratic'
Fits multivariate normal densities with covariance estimates stratified by group.

'mahalanobis'
Uses Mahalanobis distances with stratified covariance estimates.
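To compare the three discriminant types on the same data, a loop like the following may help. It is a sketch on made-up data; each group has more observations than columns so that the per-group covariance estimates used by 'quadratic' and 'mahalanobis' can be computed.

```matlab
% Made-up data: four observations per group, two columns.
training = [1.0 1.1; 1.2 0.9; 0.8 1.0; 1.1 1.3; ...
            5.0 5.2; 5.1 4.8; 4.9 5.0; 5.3 5.1];
group    = [1; 1; 1; 1; 2; 2; 2; 2];
sample   = [1.0 1.0; 5.0 5.0; 3.0 3.2];

types   = {'linear','quadratic','mahalanobis'};
results = zeros(size(sample,1),numel(types));   % one column per type

for k = 1:numel(types)
    % The fourth argument selects the type of discriminant function.
    results(:,k) = classify(sample,training,group,types{k});
    fprintf('%-12s -> %s\n',types{k},mat2str(results(:,k)'));
end
```

Rows of sample near a group center should receive the same label under all three types; rows between the groups, such as the third one here, are where the types can disagree.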

[...] = classify(...,'type',prior) enables you to specify prior probabilities for the groups in one of three ways. prior can be:

A numeric vector of the same length as the number of unique values in group. If group is numeric, the order of prior must correspond to the sorted values in group, or, if group contains strings, to the order of first occurrence of the values in group.
A 1-by-1 structure with fields:

prob
A numeric vector

group
Of the same type as group, and containing unique values indicating which groups the elements of prob correspond to.

As a structure, prior can contain groups that do not appear in group. This can be useful if training is a subset of a larger training set.

The string value 'empirical', indicating that classify should estimate the group prior probabilities from the group relative frequencies in training.

prior defaults to a numeric vector of equal probabilities, i.e., a uniform distribution. prior is not used for discrimination by Mahalanobis distance, except for error rate calculation.
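The three ways of passing prior can be sketched as follows. The data and the probability values are made up for illustration, and the structure variable p is a hypothetical name; only the prob and group field names come from the description above.

```matlab
% Made-up data: group 1 has five observations, group 2 has four.
training = [1.0 1.1; 1.2 0.9; 0.8 1.0; 1.1 1.3; 0.9 1.2; ...
            5.0 5.2; 5.1 4.8; 4.9 5.0; 5.3 5.1];
group    = [1; 1; 1; 1; 1; 2; 2; 2; 2];
sample   = [1.0 1.0; 5.0 5.0];

% 1. Numeric vector: for a numeric group, entries follow the sorted
%    group values, so 0.7 applies to group 1 and 0.3 to group 2.
c1 = classify(sample,training,group,'linear',[0.7 0.3]);

% 2. Structure: each probability is paired with an explicit label.
p.prob  = [0.7 0.3];
p.group = [1; 2];
c2 = classify(sample,training,group,'linear',p);

% 3. 'empirical': priors are estimated from the relative group
%    frequencies in training (5/9 and 4/9 for this data).
c3 = classify(sample,training,group,'linear','empirical');
```

The vector and structure forms above describe the same priors, so c1 and c2 should agree. The structure form is the one to reach for when the label order is not obvious or when training covers only some of the groups.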

Examples

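load discrim                           % load the example data, including ratings and group
sample = ratings(idx,:);               % rows to classify; idx is assumed to come from discrim.mat
training = ratings(1:200,:);           % use the first 200 rows as training data
g = group(1:200);                      % group labels for the training rows
class = classify(sample,training,g);   % default type is 'linear'
first5 = class(1:5)                    % display the first five assigned groups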
first5 =
     2
     2
     2
     2
     2

See Also
mahal

References

[1] Krzanowski, W.J., Principles of Multivariate Analysis, Oxford University Press, Oxford, 1988.

[2] Seber, G.A.F., Multivariate Observations, Wiley, New York, 1984.
