What is a Random Forest? How does it work?

Random forest is a versatile machine learning method capable of performing both regression and classification tasks.

Like bagging and boosting, random forest is an ensemble method: it works by combining a set of tree models. Each tree is built from a random sample of the training data (not the test data), and each split in a tree considers only a random sample of the columns.
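The text does not say how the trees are combined. A common approach, stated here as an assumption, is a majority vote for classification and an average for regression. A minimal sketch in Python, with hypothetical function names and made-up tree outputs:

```python
from collections import Counter
from statistics import mean


def combine_classification(tree_predictions):
    """Majority vote over the class labels predicted by the individual trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


def combine_regression(tree_predictions):
    """Average of the numeric predictions made by the individual trees."""
    return mean(tree_predictions)


# Hypothetical outputs of the trees for a single observation.
print(combine_classification(["spam", "ham", "spam", "spam", "ham"]))
print(combine_regression([2.0, 3.0, 4.0]))
```

Because each tree sees a different random sample, the individual trees make different errors, and combining them tends to smooth those errors out.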

Here are the steps a random forest follows to create each tree:

  • Take a random sample of the training data (typically a bootstrap sample, drawn with replacement).
  • Begin with a single node.
  • Run the following algorithm, from the start node:
    • If the number of observations in the node is less than the minimum node size, stop.
    • Select a random subset of the variables.
    • Find the variable that does the “best” job splitting the observations.
    • Split the observations into two nodes.
    • Repeat from the first sub-step (the node-size check) on each of the two new nodes.
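The steps above can be sketched in plain Python for a classification tree. This is an illustrative sketch under stated assumptions, not a reference implementation: the "best" split is taken to mean the lowest weighted Gini impurity, the sample is a bootstrap sample, and all names (`gini`, `best_split`, `build_tree`, `node_size`, `n_candidates`) are hypothetical.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the node is pure)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def bootstrap(rows, labels, rng):
    """Draw a sample of the same size as the data, with replacement."""
    idx = [rng.randrange(len(rows)) for _ in rows]
    return [rows[i] for i in idx], [labels[i] for i in idx]


def best_split(rows, labels, candidate_features):
    """Return the (feature, threshold) that most reduces Gini, or None."""
    best, best_score = None, gini(labels)
    for f in candidate_features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best


def build_tree(rows, labels, node_size=2, n_candidates=1, rng=random):
    """Grow one tree recursively; leaves are the majority class label."""
    leaf = Counter(labels).most_common(1)[0][0]
    # Stop if the node holds fewer observations than the node size.
    if len(rows) < node_size:
        return leaf
    # Select a random subset of the variables (column indices).
    features = rng.sample(range(len(rows[0])), n_candidates)
    # Find the variable and threshold that split the observations best.
    split = best_split(rows, labels, features)
    if split is None:  # no candidate split improves purity
        return leaf
    f, t = split
    # Split the observations into two nodes and recurse on each.
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([r for r, _ in left], [y for _, y in left],
                           node_size, n_candidates, rng),
        "right": build_tree([r for r, _ in right], [y for _, y in right],
                            node_size, n_candidates, rng),
    }


def predict(tree, row):
    """Walk from the root to a leaf and return the leaf's label."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree


# Made-up toy data: two numeric columns, two classes.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
rows = [[1, 7], [2, 6], [8, 1], [9, 3]]
labels = ["a", "a", "b", "b"]
sample_rows, sample_labels = bootstrap(rows, labels, rng)
tree = build_tree(sample_rows, sample_labels, node_size=2, n_candidates=1, rng=rng)
print(predict(tree, [1, 7]))
```

A forest would repeat the `bootstrap` and `build_tree` calls many times and keep every resulting tree. Drawing a fresh random subset of variables at every node is what distinguishes this from plain bagging of decision trees.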