The structure of the bagging procedure can be divided into three essential elements:
- Different Training Datasets : Create a different sample of the training dataset for each ensemble model.
- High-Variance Model : Train the same high-variance model on each sample of the training dataset.
- Average Predictions : Use statistics to combine predictions.
We can map the canonical bagging method onto these elements as follows:
- Different Training Datasets : Bootstrap sample.
- High-Variance Model : Decision tree.
- Average Predictions : Mean for regression, mode for classification.
This provides a framework where we could consider alternate methods for each essential element of the model.
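The canonical mapping above can be sketched directly in code. This is a minimal illustration, not a production implementation, assuming scikit-learn is available for the decision tree and a synthetic dataset from `make_classification`; each of the three elements is marked with a comment.

```python
# Minimal sketch of canonical bagging: bootstrap samples,
# a high-variance base model (decision tree), and a mode vote.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(42)
X, y = make_classification(n_samples=200, random_state=1)

n_trees = 25
trees = []
for _ in range(n_trees):
    # Element 1: a different bootstrap sample for each ensemble member
    idx = rng.integers(0, len(X), size=len(X))
    # Element 2: the same high-variance model trained on each sample
    trees.append(DecisionTreeClassifier().fit(X[idx], y[idx]))

# Element 3: combine predictions with the mode (classification)
preds = np.stack([t.predict(X) for t in trees])  # shape (n_trees, n_samples)
yhat = np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, preds)
print(f"training accuracy: {(yhat == y).mean():.2f}")
```

For a regression problem, the mode vote in the last step would be replaced with a simple mean of the member predictions.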
For example, we could change the algorithm to another high-variance technique with somewhat unstable learning behavior, such as k-nearest neighbors with a small value for the k hyperparameter.
We might also change the sampling method from the bootstrap to another sampling technique, or more generally, a different method entirely. In fact, this is the basis for many of the extensions of bagging described in the literature, which attempt to produce ensemble members that are more independent of one another, yet remain skillful.