Gradient boosting

In gradient boosting, an ensemble of weak learners is used to improve the performance of a machine learning model. The weak learners are usually decision trees. Combined, their outputs produce a stronger model than any single learner on its own.

In the case of regression, the final prediction is obtained by combining the weak learners' outputs, typically by averaging them (or, in boosting, by summing their scaled contributions). With classification, the final result can be computed as the class that receives the majority of votes from the weak learners.
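
The sketch below illustrates these two aggregation rules on toy predictions. The arrays and the `aggregate_*` helpers are hypothetical names used only for illustration, and NumPy is assumed to be available.

```python
import numpy as np

def aggregate_regression(predictions):
    """Average the weak learners' predictions (regression)."""
    return np.mean(predictions, axis=0)

def aggregate_classification(predictions):
    """Majority vote over the weak learners' predicted classes."""
    predictions = np.asarray(predictions)
    # For each sample (column), pick the most frequent predicted class.
    return np.array([np.bincount(col).argmax() for col in predictions.T])

# Example: three weak learners, four samples.
reg_preds = [np.array([2.0, 3.0, 5.0, 1.0]),
             np.array([2.5, 2.5, 4.5, 1.5]),
             np.array([1.5, 3.5, 5.5, 0.5])]
print(aggregate_regression(reg_preds))      # [2. 3. 5. 1.]

clf_preds = [np.array([0, 1, 1, 2]),
             np.array([0, 1, 0, 2]),
             np.array([1, 1, 1, 2])]
print(aggregate_classification(clf_preds))  # [0 1 1 2]
```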

In gradient boosting, the weak learners work sequentially: each model tries to correct the errors of the previous one. This differs from the bagging technique, where several models are fitted on subsets of the data in parallel. These subsets are usually drawn randomly with replacement (bootstrap samples). Random Forests are a great example of bagging (sketched below).
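
For contrast, here is a minimal bagging sketch, assuming scikit-learn and NumPy are available. The synthetic data, tree depth, and number of trees are arbitrary choices for illustration, not recommendations.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.2, size=200)

trees = []
for _ in range(25):
    # Bootstrap sample: same size as the data, drawn with replacement.
    idx = rng.integers(0, len(X), size=len(X))
    tree = DecisionTreeRegressor(max_depth=4).fit(X[idx], y[idx])
    trees.append(tree)  # each tree could be trained independently, in parallel

# Bagged prediction: average the trees' outputs.
X_test = np.linspace(-3, 3, 5).reshape(-1, 1)
print(np.mean([t.predict(X_test) for t in trees], axis=0))
```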

The boosting process looks like this (a from-scratch sketch follows the list):

  • Build an initial model with the data,
  • Run predictions on the whole data set,
  • Calculate the error between the predictions and the actual values,
  • Give more attention to the poorly predicted examples (in gradient boosting, this is done by fitting the next model to the residuals),
  • Create another model that attempts to fix the errors of the previous one,
  • Run predictions on the entire dataset with the new model,
  • Repeat, creating several models, each aiming to correct the errors of the one before it,
  • Obtain the final model as the weighted sum of the initial model and all the correction models.
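
The following is a minimal from-scratch sketch of that loop for regression with squared-error loss, assuming scikit-learn and NumPy are available. The learning rate, tree depth, and number of rounds are illustrative choices only.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.2, size=200)

learning_rate = 0.1
n_rounds = 100

# Initial model: a constant prediction (the mean of the targets).
f0 = y.mean()
prediction = np.full_like(y, f0)
trees = []

for _ in range(n_rounds):
    # Residuals are the errors of the current ensemble; for squared-error
    # loss they equal the negative gradient the next tree should fit.
    residuals = y - prediction
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residuals)
    trees.append(tree)
    # Correct the previous errors, scaled down by the learning rate.
    prediction += learning_rate * tree.predict(X)

def predict(X_new):
    """Final model: initial constant plus the scaled sum of all trees."""
    out = np.full(len(X_new), f0)
    for tree in trees:
        out += learning_rate * tree.predict(X_new)
    return out

print(predict(np.array([[0.0], [1.5]])))
```

Each tree here is deliberately shallow (a weak learner); the strength of the final model comes from adding many such small corrections together rather than from any single tree.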