There are many extensions of bagging, perhaps tens or hundreds, each making a minor modification to how the training dataset for each ensemble member is prepared or to how the model is constructed from that dataset.
These changes are built around the three main elements of the essential bagging method. They often seek better performance by balancing two needs: ensemble members that are skillful enough, and enough diversity among their predictions or prediction errors.
For example, we could change the sampling of the training dataset to be a random sample without replacement, instead of a bootstrap sample. This is referred to as “pasting.”
- Different Training Dataset: Random subsample of rows.
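The difference between a bootstrap sample and a pasting sample can be sketched with the standard library alone. This is a minimal illustration rather than a full ensemble: the helper names `bootstrap_rows` and `pasting_rows`, the dataset size of 100 rows, and the sample size of 60 are all assumptions made for the example.

```python
import random

def bootstrap_rows(n_rows, rng):
    # Standard bagging: draw n_rows indices WITH replacement,
    # so the same row may appear more than once.
    return rng.choices(range(n_rows), k=n_rows)

def pasting_rows(n_rows, sample_size, rng):
    # Pasting: draw sample_size distinct indices WITHOUT replacement.
    return rng.sample(range(n_rows), sample_size)

rng = random.Random(1)
n_rows, n_members = 100, 5  # hypothetical dataset size and ensemble size

# One list of row indices per ensemble member; each member would then
# be fit only on the rows listed for it.
pasting_samples = [pasting_rows(n_rows, 60, rng) for _ in range(n_members)]
bootstrap_samples = [bootstrap_rows(n_rows, rng) for _ in range(n_members)]

for rows in pasting_samples:
    print(len(rows), "rows,", len(set(rows)), "unique")
```

Because pasting never repeats a row, the sample must be smaller than the training dataset for the members to differ from one another. If you use scikit-learn, its `BaggingClassifier` offers the same choice through `bootstrap=False` combined with a `max_samples` value below 1.0.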
We could go further and select a random subsample of rows (like pasting) and a random subsample of columns (random subspace) for each decision tree. This is known as “random patches.”
- Different Training Dataset: Random subsample of rows and columns.
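As a rough sketch of random patches, the code below draws a random set of rows and a random set of columns for each ensemble member, both without replacement. The helpers `random_patch` and `take`, the toy 20-by-8 dataset, and the 50 percent fractions are assumptions chosen for illustration.

```python
import random

def random_patch(n_rows, n_cols, row_frac, col_frac, rng):
    # Choose distinct row and column indices for one ensemble member.
    n_r = max(1, int(n_rows * row_frac))
    n_c = max(1, int(n_cols * col_frac))
    rows = rng.sample(range(n_rows), n_r)
    cols = sorted(rng.sample(range(n_cols), n_c))
    return rows, cols

def take(X, rows, cols):
    # Extract the "patch" of X defined by the chosen rows and columns.
    return [[X[i][j] for j in cols] for i in rows]

rng = random.Random(7)
# Toy dataset: 20 rows, 8 columns, with values that encode their position.
X = [[10 * i + j for j in range(8)] for i in range(20)]

patches = [random_patch(20, 8, 0.5, 0.5, rng) for _ in range(3)]
views = [take(X, rows, cols) for rows, cols in patches]

for (rows, cols), view in zip(patches, views):
    print("columns", cols, "shape", (len(view), len(view[0])))
```

Each member must remember its column indices, because new data has to be reduced to the same columns before that member can make a prediction. In scikit-learn, `BaggingClassifier` covers this case through its `max_samples` and `max_features` arguments, with `bootstrap` and `bootstrap_features` controlling whether sampling is with replacement.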
We can also consider our own simple extensions of the idea.
For example, it is common to use feature selection techniques to choose a subset of input variables in order to reduce the complexity of a prediction problem (fewer columns) and achieve better performance (less noise). We could imagine a bagging ensemble where each model is fit on a different “view” of the training dataset selected by a different feature selection or feature importance method.
- Different Training Dataset: Columns chosen by different feature selection methods.
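One way to picture this is to score every column with two different methods and keep the top-scoring columns from each, giving one column "view" per method. The sketch below uses column variance and absolute correlation with the target as stand-ins for real feature selection methods. Both scoring functions, the tiny hand-made dataset, and the choice of two columns per view are assumptions for the example.

```python
from statistics import fmean

def column(X, j):
    return [row[j] for row in X]

def variance(xs):
    m = fmean(xs)
    return fmean([(x - m) ** 2 for x in xs])

def abs_correlation(xs, ys):
    # Absolute Pearson correlation; 0.0 if either input is constant.
    mx, my = fmean(xs), fmean(ys)
    cov = fmean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    denom = (variance(xs) * variance(ys)) ** 0.5
    return abs(cov / denom) if denom > 0 else 0.0

def top_k(scores, k):
    # Indices of the k highest-scoring columns, returned in column order.
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])

# Toy data: column 0 tracks y closely, column 2 is large but mostly noise.
X = [
    [0.1, 1.0, 10.0, 0.0],
    [0.2, 1.1, -10.0, 0.0],
    [0.1, 0.9, 10.0, 1.0],
    [0.9, 1.0, -10.0, 1.0],
    [1.0, 1.1, 10.0, 2.0],
    [0.9, 0.9, -10.0, 2.0],
]
y = [0, 0, 0, 1, 1, 1]

n_cols = len(X[0])
scores = {
    "variance": [variance(column(X, j)) for j in range(n_cols)],
    "correlation": [abs_correlation(column(X, j), y) for j in range(n_cols)],
}
# One set of columns per selection method; one ensemble member per view.
views = {name: top_k(s, 2) for name, s in scores.items()}
print(views)  # {'variance': [2, 3], 'correlation': [0, 3]}
```

The two methods disagree on which columns matter, which is exactly what gives the ensemble members different views of the same rows. Whether that diversity helps or hurts depends on the dataset, so an ensemble like this would need to be evaluated against plain bagging.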
It is also common to test a model with many different data transforms as part of a modeling pipeline. This is done because we cannot know beforehand which representation of the training dataset will best expose the unknown underlying structure of the dataset to the learning algorithms. We could imagine a bagging ensemble where each model is fit on a different transform of the training dataset.
- Different Training Dataset: Data transforms of the raw training dataset.
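A minimal sketch of the idea is to build several transformed copies of the same training data, one per ensemble member. The four transforms below (raw, min-max scaling, standardization, and a log transform), the helper `transform_columns`, and the toy dataset are assumptions picked for illustration, not a recommended set.

```python
import math
from statistics import fmean, pstdev

def transform_columns(X, fn):
    # Apply a column-wise transform fn(list) -> list to every column of X.
    cols = [fn([row[j] for row in X]) for j in range(len(X[0]))]
    return [list(row) for row in zip(*cols)]

def identity(xs):
    return list(xs)

def min_max(xs):
    lo, span = min(xs), max(xs) - min(xs)
    return [(x - lo) / span if span else 0.0 for x in xs]

def standardize(xs):
    m, s = fmean(xs), pstdev(xs)
    return [(x - m) / s if s else 0.0 for x in xs]

def log_transform(xs):
    # Assumes non-negative values.
    return [math.log1p(x) for x in xs]

transforms = {
    "raw": identity,
    "min_max": min_max,
    "standardize": standardize,
    "log": log_transform,
}

X = [[1.0, 200.0], [2.0, 400.0], [3.0, 800.0], [4.0, 1600.0]]

# One transformed copy of the training data per ensemble member.
views = {name: transform_columns(X, fn) for name, fn in transforms.items()}
for name, view in views.items():
    print(name, view[0])
```

This sketch computes each transform directly on the training data and does not store the fitted values such as the minimum, maximum, mean, or standard deviation. A real pipeline would need to keep those values so that the same transform can be applied to new data before each member makes its prediction.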
These are a few perhaps obvious examples of how the essence of the bagging method can be explored, hopefully inspiring further ideas. I would encourage you to brainstorm how you might adapt the methods to your own specific project.