Consider a set of training data samples in which each sample is labelled as belonging to one of several pre-specified classes. Data classification then involves the assignment of newly presented data samples to the classes on the basis of mathematical 'models' built for each of the classes.
For example, here we have five pre-defined classes and one test point which must be assigned to one of them. Although this particular case is simple, perfect classification becomes impossible when the data are noisy or the classes are not well separated. The goal then is to choose the model which minimises the classification error. For any given classification problem there is a fundamental limit to the achievable classification accuracy, and the corresponding minimum error rate is called the "Bayes error" for that problem.
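As an illustration of how class overlap sets this limit, the sketch below computes the Bayes error for a deliberately simple case that is an assumption of this example rather than anything taken from the figure: two equally likely classes, each a one-dimensional Gaussian with the same standard deviation. The function names are hypothetical. In this special case the best possible decision boundary is the midpoint between the two class means, and the Bayes error is the probability mass that falls on the wrong side of it.

```python
import math


def normal_cdf(x, mean, std):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def bayes_error_two_gaussians(mean_a, mean_b, std):
    """Bayes error for two equally likely 1-D Gaussian classes with a shared std.

    Assumption: equal priors and equal spread, so the optimal boundary is the
    midpoint between the two means.
    """
    lower, upper = sorted((mean_a, mean_b))
    boundary = 0.5 * (lower + upper)
    # Probability that a sample of the lower class lands above the boundary.
    err_lower = 1.0 - normal_cdf(boundary, lower, std)
    # Probability that a sample of the upper class lands below the boundary.
    err_upper = normal_cdf(boundary, upper, std)
    # Each class is assumed to occur half of the time.
    return 0.5 * err_lower + 0.5 * err_upper


# Well-separated classes: the unavoidable error is small.
well_separated = bayes_error_two_gaussians(-3.0, 3.0, 1.0)
# Overlapping classes: no classifier can do better than roughly 16% error.
overlapping = bayes_error_two_gaussians(-1.0, 1.0, 1.0)

print(f"well separated: {well_separated:.5f}")
print(f"overlapping:    {overlapping:.5f}")
```

In real problems the true class densities are unknown, so the Bayes error cannot be computed this directly and can only be estimated or bounded.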
There are two basic types of classifier, namely (a) those which attempt to minimise the error rate without regard to density estimation, such as (i) neural networks, (ii) decision trees, and (iii) support vector machines, and (b) those which use density estimates to derive a classification, such as (i) nearest-neighbour methods, (ii) (Gaussian) mixture models, and (iii) kernel-based methods.
Methods of type (a) give only the class assignment, whereas methods of type (b) also give the likelihood of a sample belonging to each class. Consequently, methods of type (a), although they often achieve good classification accuracy, are not recommended when accountability is essential (e.g. medical image analysis) or when ranked probabilities are required (e.g. speech recognition). See the Algorithms page for classification research carried out at Seventh Sense Software.
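To make the distinction concrete, here is a minimal sketch of a density-based classifier of type (b). It is not the method used at Seventh Sense Software. It assumes one-dimensional data and models each class with a single Gaussian (the simplest case of a mixture model), and all function names and the toy data are hypothetical. The point is that such a classifier returns a probability for every class, from which both the class assignment and a ranking follow.

```python
import math
from collections import defaultdict


def fit_gaussian_classes(samples, labels):
    """Fit a mean, a variance and a prior to each class (1-D data assumed)."""
    grouped = defaultdict(list)
    for x, label in zip(samples, labels):
        grouped[label].append(x)
    models = {}
    for label, xs in grouped.items():
        mean = sum(xs) / len(xs)
        var = sum((x - mean) ** 2 for x in xs) / len(xs)
        var = max(var, 1e-9)  # guard against a zero-variance class
        prior = len(xs) / len(samples)
        models[label] = (mean, var, prior)
    return models


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def class_posteriors(models, x):
    """Posterior probability of each class at x, via Bayes' rule."""
    joint = {
        label: prior * gaussian_pdf(x, mean, var)
        for label, (mean, var, prior) in models.items()
    }
    total = sum(joint.values())
    if total == 0.0:
        # Every density underflowed (x is far from all classes): fall back
        # to equal probabilities rather than dividing by zero.
        return {label: 1.0 / len(models) for label in models}
    return {label: value / total for label, value in joint.items()}


# Hypothetical toy training set with two classes.
samples = [1.0, 1.2, 0.8, 3.0, 3.2, 2.8]
labels = ["A", "A", "A", "B", "B", "B"]
models = fit_gaussian_classes(samples, labels)

near_a = class_posteriors(models, 1.1)
midway = class_posteriors(models, 2.0)

# The class assignment is the most probable class; the full dictionary
# additionally provides the ranked probabilities.
assigned = max(near_a, key=near_a.get)
print(assigned, near_a)
print(midway)
```

A test point midway between the two classes receives roughly equal probabilities, which signals an uncertain decision that a bare class label would hide.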