Multi-Class Classification

Multi-class classification refers to classification tasks that have more than two class labels.

Examples include:

  • Face classification.
  • Plant species classification.
  • Optical character recognition.

Unlike binary classification, multi-class classification does not have the notion of normal and abnormal outcomes. Instead, each example is classified as belonging to one of a range of known classes.

The number of class labels may be very large for some problems. For example, a model in a face recognition system may predict a photo as belonging to one of thousands or tens of thousands of faces.

Problems that involve predicting a sequence of words, such as text translation models, may also be considered a special type of multi-class classification. Predicting each word in the sequence is itself a multi-class classification, where the size of the vocabulary defines the number of possible classes. That vocabulary could be tens or hundreds of thousands of words in size.

It is common to model a multi-class classification task with a model that predicts a Multinoulli probability distribution for each example.

The Multinoulli distribution is a discrete probability distribution that covers a case where an event will have a categorical outcome, e.g. k in {1, 2, 3, …, K}. For classification, this means that the model predicts the probability of an example belonging to each class label.
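
As a rough illustration of what "a probability for each class label" looks like, the sketch below turns raw class scores into a Multinoulli distribution. It assumes a softmax function, which is one common choice but is not prescribed by the text above; the scores themselves are hypothetical values chosen for the example.

```python
import math

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    # Subtract the maximum score for numerical stability
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for one example with K = 3 classes (assumed values)
scores = [2.0, 1.0, 0.1]
probs = softmax(scores)

# The predicted class label is the one with the highest probability
predicted_class = max(range(len(probs)), key=lambda k: probs[k])

print(probs)
print(predicted_class)
```

Each entry of probs is the predicted probability of one class label, and the entries sum to one across the K classes.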

Many algorithms used for binary classification can be used for multi-class classification.

Popular algorithms that can be used for multi-class classification include:

  • k-Nearest Neighbors.
  • Decision Trees.
  • Naive Bayes.
  • Random Forest.
  • Gradient Boosting.
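
As a minimal sketch, the algorithms listed above can be fit on a small multi-class dataset with scikit-learn. The dataset sizes, the random seed, the train/test split, and the use of default hyperparameters are all assumptions made for illustration, and Gaussian Naive Bayes is assumed as the Naive Bayes variant because the inputs are real-valued.

```python
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

# Synthetic dataset with three classes (sizes and seed are arbitrary choices)
X, y = make_blobs(n_samples=500, centers=3, n_features=2, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1
)

# Each of these estimators handles more than two classes without any wrapper
models = {
    "k-Nearest Neighbors": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(random_state=1),
    "Naive Bayes": GaussianNB(),
    "Random Forest": RandomForestClassifier(random_state=1),
    "Gradient Boosting": GradientBoostingClassifier(random_state=1),
}

accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # score() reports classification accuracy on the held-out examples
    accuracies[name] = model.score(X_test, y_test)
    print(f"{name}: {accuracies[name]:.3f}")
```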

Algorithms that are designed for binary classification can be adapted for multi-class problems.

This involves fitting multiple binary classification models: either one model for each class vs. all other classes (called one-vs-rest) or one model for each pair of classes (called one-vs-one).

  • One-vs-Rest: Fit one binary classification model for each class vs. all other classes.
  • One-vs-One: Fit one binary classification model for each pair of classes.

Binary classification algorithms that can use these strategies for multi-class classification include:

  • Logistic Regression.
  • Support Vector Machine.
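
The two strategies can be sketched with scikit-learn's OneVsRestClassifier and OneVsOneClassifier wrappers. Pairing logistic regression with one-vs-rest and a support vector machine with one-vs-one is an arbitrary choice for this example, as are the four-class dataset and the seed.

```python
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.multiclass import OneVsRestClassifier, OneVsOneClassifier

# Synthetic dataset with K = 4 classes (sizes and seed are arbitrary choices)
X, y = make_blobs(n_samples=400, centers=4, n_features=2, random_state=1)

# One-vs-rest: one binary model per class, so K models
ovr = OneVsRestClassifier(LogisticRegression(max_iter=1000))
ovr.fit(X, y)

# One-vs-one: one binary model per pair of classes, so K * (K - 1) / 2 models
ovo = OneVsOneClassifier(SVC())
ovo.fit(X, y)

print(len(ovr.estimators_))  # 4
print(len(ovo.estimators_))  # 6
```

The model counts show the trade-off: one-vs-rest grows linearly with the number of classes, while one-vs-one grows quadratically but trains each model on only the examples from its two classes.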

We can use the make_blobs() function to generate a synthetic multi-class classification dataset.
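
A minimal sketch of such a call is shown here. The number of examples, classes, and input features, along with the random seed, are assumed values picked for illustration.

```python
from collections import Counter
from sklearn.datasets import make_blobs

# 1,000 examples, 3 classes, 2 input features (all assumed values)
X, y = make_blobs(n_samples=1000, centers=3, n_features=2, random_state=1)

print(X.shape, y.shape)  # (1000, 2) (1000,)

# Count how many examples fall in each class label
counter = Counter(y)
print(counter)

# Look at the first few examples and their labels
for i in range(5):
    print(X[i], y[i])
```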