How Do Random Forests Select Important Features?

Random forests can also be used to rank the variables in a regression or classification problem by their importance, i.e., by how much each variable contributes to the model's predictions.

Two measures are commonly used to rank the variables:

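  • Mean Decrease Accuracy: How much the model's accuracy decreases when the values of that variable are randomly permuted (typically in the out-of-bag samples), averaged over all trees. The variable is not actually dropped and the model is not retrained; shuffling simply breaks the link between the variable and the target.
  • Mean Decrease Gini: The total decrease in Gini impurity produced by all the splits made on that variable, averaged over all trees in the forest.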
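
As a rough illustration of the quantity behind Mean Decrease Gini, the sketch below computes the impurity decrease for a single split on made-up labels. The function names gini and gini_decrease and the toy data are assumptions for this example, not part of any library. A real forest would add up such decreases (weighted by how many samples reach each node) for every split on a variable and average the totals over the trees.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(parent, left, right):
    """Impurity decrease obtained by splitting `parent` into `left` and `right`."""
    n = len(parent)
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted_children


# Toy node: 4 samples of class "a" and 4 of class "b" (made-up data).
parent = ["a", "a", "a", "a", "b", "b", "b", "b"]

# A split on a hypothetical informative feature mostly separates the classes...
good_split = gini_decrease(parent, ["a", "a", "a", "b"], ["a", "b", "b", "b"])
# ...while a split on a hypothetical uninformative feature leaves them mixed.
poor_split = gini_decrease(parent, ["a", "a", "b", "b"], ["a", "a", "b", "b"])

print(good_split, poor_split)
```

The informative split yields a positive decrease, while the uninformative one yields none, which is why variables that are used in many useful splits accumulate a larger score.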

Conclusion: The higher a variable's mean decrease accuracy or mean decrease Gini score, the more important that variable is to the model.
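
To make the ranking concrete, here is a minimal sketch of the permutation idea behind mean decrease accuracy. It is an illustration under stated assumptions, not how any particular library implements it: the predict function is a hypothetical stand-in for a trained forest, the data is synthetic, and the shuffle is applied to one data set rather than to each tree's out-of-bag samples.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows for which the model's prediction matches the label."""
    correct = sum(1 for row, y in zip(rows, labels) if predict(row) == y)
    return correct / len(labels)


def permutation_importance(predict, rows, labels, n_features, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for j in range(n_features):
        drops = []
        for _ in range(n_repeats):
            # Shuffle only column j, leaving every other column untouched.
            column = [row[j] for row in rows]
            rng.shuffle(column)
            permuted = [row[:j] + (value,) + row[j + 1:]
                        for row, value in zip(rows, column)]
            drops.append(baseline - accuracy(predict, permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


# Synthetic data: feature 0 determines the label, feature 1 is pure noise.
data_rng = random.Random(42)
rows = [(data_rng.random(), data_rng.random()) for _ in range(200)]
labels = [int(row[0] > 0.5) for row in rows]


def predict(row):
    """Hypothetical stand-in for a trained model; it only looks at feature 0."""
    return int(row[0] > 0.5)


importances = permutation_importance(predict, rows, labels, n_features=2)

# Rank feature indices from most to least important.
ranked = sorted(range(len(importances)), key=lambda j: importances[j], reverse=True)
print(importances, ranked)
```

Shuffling feature 0 destroys most of the model's accuracy, so it receives a large score and ranks first; shuffling feature 1 changes nothing because the stand-in model never uses it, so its score is zero.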