Statistics Toolbox
Syntax
class = classify(sample,training,group)
[class,err] = classify(...)
[...] = classify(...,'type')
[...] = classify(...,'type',prior)
Description
class = classify(sample,training,group) classifies each row of the data in sample into one of the groups in training. sample and training must be matrices with the same number of columns. group is a grouping variable for training. Its unique values define groups, and each element defines which group the corresponding row of training belongs to. group can be a numeric vector, a string array, or a cell array of strings. training and group must have the same number of rows. classify treats NaNs or empty strings in group as missing values, and ignores the corresponding rows of training. class indicates which group each row of sample has been assigned to, and is of the same type as group.
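As a minimal sketch of the basic call, the following uses a small made-up data set with two well-separated groups. The data values and the variable name predicted are illustrative assumptions, not part of the toolbox. One element of group is set to NaN to show a missing label.

```matlab
% Hypothetical training data: two well-separated clusters in two dimensions,
% plus one stray point at [5 5].
training = [1 1; 1 2; 2 1; 2 2; 5 5; 8 8; 8 9; 9 8; 9 9];

% One label per row of training. The NaN marks the fifth row as missing,
% so classify ignores the point [5 5].
group = [1; 1; 1; 1; NaN; 2; 2; 2; 2];

% Two new observations with the same number of columns as training.
sample = [1.5 1.5; 8.5 8.5];

% predicted has one element per row of sample and is numeric, like group.
predicted = classify(sample,training,group);
```

With groups this far apart, the first row of sample should be assigned to group 1 and the second to group 2.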
[class,err] = classify(...) also returns an estimate of the misclassification error rate. classify returns the apparent error rate, i.e., the percentage of observations in training that are misclassified.
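The apparent error rate can be cross-checked by classifying the training rows themselves and counting the mistakes. The sketch below does this on a made-up, cleanly separable data set. The data and the names predicted, resub, and nWrong are assumptions chosen for illustration.

```matlab
% Hypothetical training data: two well-separated clusters of four points each.
training = [1 1; 1 2; 2 1; 2 2; 8 8; 8 9; 9 8; 9 9];
group = [1; 1; 1; 1; 2; 2; 2; 2];
sample = [1.5 1.5; 8.5 8.5];

% Second output: the apparent error rate, estimated from training.
[predicted,err] = classify(sample,training,group);

% Resubstitution check: classify the training rows themselves and
% count how many are assigned to the wrong group.
resub = classify(training,training,group);
nWrong = sum(resub ~= group);
```

Because the two clusters do not overlap, no training row is misclassified here, so both nWrong and err should be zero. On overlapping data the apparent error rate tends to be optimistic, since the same rows are used to fit and to evaluate the discriminant function.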
[...] = classify(...,'type') allows you to specify the type of discriminant function.
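The sketch below passes a type string as the fourth argument. The strings 'linear', 'quadratic', and 'mahalanobis' are assumed here from the toolbox's own help for classify rather than from the text above, so check the list supported by your version. The data and variable names are made up for illustration.

```matlab
% Hypothetical training data: two well-separated clusters of four points each.
training = [1 1; 1 2; 2 1; 2 2; 8 8; 8 9; 9 8; 9 9];
group = [1; 1; 1; 1; 2; 2; 2; 2];
sample = [1.5 1.5; 8.5 8.5];

% Assumed type strings; each selects a different discriminant function.
cLinear = classify(sample,training,group,'linear');
cQuad   = classify(sample,training,group,'quadratic');
cMahal  = classify(sample,training,group,'mahalanobis');

% On data this cleanly separated, all three types should agree.
agree = isequal(cLinear,cQuad,cMahal);
```

The types are expected to differ only when the groups overlap or have clearly different covariance structure, which this toy data set does not exercise.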
[...] = classify(...,'type',prior) enables you to specify prior probabilities for the groups in one of three ways. prior can be:
A numeric vector with one element for each unique value in group. If group is numeric, the order of prior must correspond to the sorted values in group, or, if group contains strings, to the order of first occurrence of the values in group.
A structure with fields:
prob | A numeric vector |
group | Of the same type as group, and containing unique values indicating which groups the elements of prob correspond to. |
As a structure, prior can contain groups that do not appear in group. This can be useful if training is a subset of a larger training set.
The string 'empirical', indicating that classify should estimate the group prior probabilities from the group relative frequencies in training.
prior defaults to a numeric vector of equal probabilities, i.e., a uniform distribution. prior is not used for discrimination by Mahalanobis distance, except for error rate calculation.
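The three forms of prior can be sketched side by side as follows. The data, the probabilities 0.7 and 0.3, and the variable names are made up for illustration, and the 'linear' type string is an assumption taken from the toolbox's help rather than from the text above.

```matlab
% Hypothetical training data: two well-separated clusters of four points each.
training = [1 1; 1 2; 2 1; 2 2; 8 8; 8 9; 9 8; 9 9];
group = [1; 1; 1; 1; 2; 2; 2; 2];
sample = [1.5 1.5; 8.5 8.5];

% 1. Numeric vector: one probability per group, in the sorted order of the
%    numeric values in group (here group 1, then group 2).
cVec = classify(sample,training,group,'linear',[0.7 0.3]);

% 2. Structure: the prob and group fields pair each probability with a group.
p.prob = [0.7 0.3];
p.group = [1; 2];
cStruct = classify(sample,training,group,'linear',p);

% 3. 'empirical': priors estimated from the relative frequencies in training.
cEmp = classify(sample,training,group,'linear','empirical');

% The relative frequencies that 'empirical' refers to, computed by hand.
freq = [sum(group == 1) sum(group == 2)] / numel(group);
```

Here both groups have four rows, so freq is [0.5 0.5], which coincides with the default uniform prior. The groups are far enough apart that the choice of prior should not change the assigned classes. Priors matter mainly for observations near the boundary between groups.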
Examples
load discrim
sample = ratings(idx,:);
training = ratings(1:200,:);
g = group(1:200);
class = classify(sample,training,g);
first5 = class(1:5)

first5 =
     2
     2
     2
     2
     2
See Also
mahal