Combining Predictions for Ensemble Learning

A key part of an ensemble learning method involves combining the predictions from multiple models.

It is through the combination of the predictions that the benefit of the ensemble learning method is achieved, namely better predictive performance. As such, there are many ways that predictions can be combined, so much so that it is an entire field of study.

After generating a set of base learners, rather than trying to find the best single learner, ensemble methods resort to combination to achieve a strong generalization ability, where the combination method plays a crucial role.

— Page 67, Ensemble Methods, 2012.

Standard ensemble machine learning algorithms do prescribe how to combine predictions; nevertheless, it is important to consider the topic in isolation for a number of reasons, such as:

  • Interpreting the predictions made by standard ensemble algorithms.
  • Manually specifying a custom prediction combination method for an algorithm.
  • Developing your own ensemble methods.

Ensemble learning methods are typically not very complex and developing your own ensemble method or specifying the manner in which predictions are combined is relatively easy and common practice.

The way that predictions are combined depends on the models that are making predictions and the type of prediction problem.

The strategy used in this step depends, in part, on the type of classifiers used as ensemble members. For example, some classifiers, such as support vector machines, provide only discrete-valued label outputs.

— Page 6, Ensemble Machine Learning, 2012.

For example, the form of the predictions made by the models will match the type of prediction problem, such as regression for predicting numbers and classification for predicting class labels. Additionally, some model types may be only able to predict a class label or class probability distribution, whereas others may be able to support both for a classification task.

We will use this division of prediction type based on problem type as the basis for exploring the common techniques used to combine predictions from contributing models in an ensemble.