What is the difference between bagging & random forest?


The fundamental difference is that in a random forest, only a random subset of the features is considered at each node, and the best split is chosen from that subset; in bagging, every feature is considered when splitting a node.

They are both approaches to dealing with the same problem: a single decision tree has high variance (can be very sensitive to the characteristics of the training set).

Both solve the problem by growing multiple trees and averaging their predictions.

Bagging solves it by subsampling the training data: each tree is grown on a bootstrap sample, i.e. rows drawn with replacement.

Random forests solve it by additionally subsampling the attributes considered at each split.

They’re elegant mirrors of one another: if you think of your training set as a big matrix, one reduces variance by subsampling the rows, and the other by subsampling the columns (this is a simplification, but you get the idea).
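To make the row/column picture concrete, here is a minimal pure-Python sketch of the two sampling schemes. No actual trees are grown, and the matrix sizes and the square-root subset size are illustrative assumptions (sqrt of the feature count is a common default, but implementations vary):

```python
import math
import random

random.seed(0)  # reproducible illustration

n_rows, n_cols = 10, 5            # 10 training examples, 5 features
all_rows = list(range(n_rows))
all_cols = list(range(n_cols))

# Bagging: each tree trains on a bootstrap sample of the ROWS
# (drawn with replacement), but every split considers ALL columns.
bagging_rows = [random.choice(all_rows) for _ in range(n_rows)]
bagging_cols_per_split = all_cols

# Random forest: same bootstrap of the rows, but each split may only
# choose among a fresh random subset of the COLUMNS.
rf_rows = [random.choice(all_rows) for _ in range(n_rows)]
k = max(1, int(math.sqrt(n_cols)))
rf_cols_per_split = random.sample(all_cols, k)  # redrawn at every split

print("bagging considers", len(bagging_cols_per_split), "features per split")
print("random forest considers", len(rf_cols_per_split), "features per split")
```

In scikit-learn the same distinction shows up as a BaggingClassifier wrapped around a decision tree (all features available at every split by default) versus a RandomForestClassifier, whose max_features parameter controls the size of the per-split feature subset.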