Weak Learners in Ensemble Learning

A weak classifier is a model for binary classification that performs slightly better than random guessing. This means that the model makes predictions with some skill, which is what makes its capabilities weak, although it is not so weak that it has no skill at all, i.e. that it performs no better than random.

  • Weak Classifier: Formally, a classifier that achieves slightly better than 50 percent accuracy.

A weak classifier is sometimes called a “weak learner” or “base learner,” and the concept can be generalized beyond binary classification.

Although the concept of a weak learner is well understood in the context of binary classification, it can be taken colloquially to mean any model that performs slightly better than a naive prediction method. In this sense, it is a useful tool for thinking about the capability of classifiers and the composition of ensembles.

  • Weak Learner: Colloquially, a model that performs slightly better than a naive model.
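
To make the colloquial definition concrete, the sketch below compares a naive baseline to a weak model on a synthetic binary classification task. The use of scikit-learn, the synthetic dataset, and the depth-1 tree standing in as the weak model are all illustrative assumptions, not a fixed recipe.

```python
# A minimal sketch: a weak model versus a naive baseline (assumed setup).
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# synthetic binary classification dataset, assumed for illustration
X, y = make_classification(n_samples=1000, n_features=20, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# naive model: always predicts the most frequent class in the training set
naive = DummyClassifier(strategy='most_frequent')
naive.fit(X_train, y_train)
print('Naive accuracy: %.3f' % naive.score(X_test, y_test))

# weak model: a depth-1 decision tree
weak = DecisionTreeClassifier(max_depth=1)
weak.fit(X_train, y_train)
print('Weak accuracy: %.3f' % weak.score(X_test, y_test))
```

The weak model should score above the naive baseline, which sits at roughly 50 percent on this balanced dataset; how far above depends on the data.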

More formally, the notion has been generalized to multi-class classification, where it takes on a meaning beyond simply achieving better than 50 percent accuracy. It is based on formal computational learning theory, which proposes a class of learning methods that possess weak learnability, meaning that they perform better than random guessing. Weak learnability is proposed as a simplification of the more desirable strong learnability, where a learner achieves arbitrarily good classification accuracy.

It is a useful concept as it is often used to describe the capabilities of contributing members of ensemble learning algorithms. For example, sometimes members of a bootstrap aggregation (bagging) ensemble are referred to as weak learners as opposed to strong, at least in the colloquial meaning of the term.

More specifically, weak learners are the basis for the boosting class of ensemble learning algorithms. The most commonly used type of weak learning model is the decision tree. This is because the weakness of the tree can be controlled by the depth of the tree during construction.
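
As a minimal sketch of this idea, the snippet below evaluates decision trees of increasing depth with cross-validation; the synthetic dataset and the particular depths are assumptions for illustration. Shallower trees should score lower (weaker) and deeper trees higher (stronger).

```python
# A minimal sketch: tree depth as the knob that controls weakness.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# synthetic binary classification dataset, assumed for illustration
X, y = make_classification(n_samples=1000, n_features=20, random_state=1)

# evaluate progressively deeper (stronger) trees
for depth in [1, 2, 4, 8]:
    tree = DecisionTreeClassifier(max_depth=depth)
    scores = cross_val_score(tree, X, y, cv=10)
    print('depth=%d accuracy=%.3f' % (depth, scores.mean()))
```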

The weakest decision tree consists of a single node that makes a decision on one input variable and outputs a binary prediction for a binary classification task. This is generally referred to as a “decision stump.” It is used as a weak learner so often that decision stump and weak learner are practically synonyms.

  • Decision Stump: A decision tree with a single node operating on one input variable, the output of which makes a prediction directly.
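
Since boosting is where decision stumps see the most use, the sketch below fits scikit-learn's AdaBoostClassifier, whose default weak learner is a depth-1 decision tree, i.e. a stump. The dataset and the number of estimators are illustrative assumptions.

```python
# A minimal sketch: boosting an ensemble of decision stumps with AdaBoost.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score

# synthetic binary classification dataset, assumed for illustration
X, y = make_classification(n_samples=1000, n_features=20, random_state=1)

# AdaBoostClassifier uses a depth-1 decision tree (a stump) as its
# default weak learner; boosting adds stumps sequentially, each one
# correcting errors made by the stumps before it
ensemble = AdaBoostClassifier(n_estimators=50)
scores = cross_val_score(ensemble, X, y, cv=10)
print('AdaBoost accuracy: %.3f' % scores.mean())
```

The combined ensemble is expected to score well above any single stump, which is the core promise of boosting weak learners.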

Nevertheless, other models can also be configured to be weak learners. Although not formally known as weak learners, we can consider the following as candidate weak learning models (a configuration sketch follows the list):

  • k-Nearest Neighbors, with k=1, operating on one or a subset of input variables.
  • Multi-Layer Perceptron, with a single node, operating on one or a subset of input variables.
  • Naive Bayes, operating on a single input variable.
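
As a rough illustration, the sketch below configures each candidate as described above and evaluates it with cross-validation. The single-feature restriction, the one-node network, and the dataset are all assumptions chosen for demonstration rather than recommended settings.

```python
# A minimal sketch: configuring other algorithms as candidate weak learners.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier

# synthetic dataset; with shuffle=False the informative features come
# first, so slicing X[:, :1] selects an informative input variable
X, y = make_classification(n_samples=1000, n_features=20, shuffle=False,
                           random_state=1)

candidates = [
    # 1-nearest neighbor restricted to a single input variable
    ('knn, k=1, one feature', KNeighborsClassifier(n_neighbors=1), X[:, :1]),
    # network with a single hidden node on all input variables
    ('mlp, one node', MLPClassifier(hidden_layer_sizes=(1,), max_iter=1000), X),
    # naive bayes restricted to a single input variable
    ('naive bayes, one feature', GaussianNB(), X[:, :1]),
]
for name, model, data in candidates:
    scores = cross_val_score(model, data, y, cv=10)
    print('%s: accuracy=%.3f' % (name, scores.mean()))
```

Each configuration is intended to land only somewhat above the 50 percent baseline; exactly how weak it turns out to be will depend on the data.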