Decision trees are notorious for overfitting. Pruning is a regularization method that penalizes the size of the tree: a complexity term (for example, the number of leaf nodes) is added to the cost function, so larger trees incur a higher cost.
Pruning is of two types:
- Post-pruning (backward pruning): The full tree is grown first, and non-significant branches are then removed. At each step, a validation set (or cross-validation) is used to check whether removing a branch hurts accuracy; if it does not, the branch is collapsed into a leaf node.
- Pre-pruning (forward pruning): Non-significant branches are prevented from being generated in the first place. Tree growth stops at a node once a given stopping condition is met, such as a maximum depth, a minimum number of samples per leaf, or a minimum impurity decrease.
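Both strategies can be sketched with scikit-learn (an illustrative choice, not prescribed by the text): pre-pruning via hyperparameters like `max_depth` and `min_samples_leaf`, and post-pruning via cost-complexity pruning (`ccp_alpha`), where the best alpha is chosen by held-out accuracy. The dataset and parameter values below are arbitrary examples.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Pre-pruning: stop branch generation early with stopping conditions.
pre = DecisionTreeClassifier(max_depth=4, min_samples_leaf=10, random_state=0)
pre.fit(X_tr, y_tr)

# Post-pruning: grow the full tree, then prune back with cost-complexity pruning.
full = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
path = full.cost_complexity_pruning_path(X_tr, y_tr)

# Pick the alpha whose pruned tree scores best on held-out data
# (cross-validation would be used in practice, as described above).
post = max(
    (DecisionTreeClassifier(random_state=0, ccp_alpha=a).fit(X_tr, y_tr)
     for a in path.ccp_alphas[:-1]),  # last alpha prunes to a single root node
    key=lambda t: t.score(X_te, y_te),
)

print("full tree nodes:", full.tree_.node_count)
print("post-pruned nodes:", post.tree_.node_count)
print("pre-pruned depth:", pre.get_depth())
```

The pre-pruned tree never exceeds the given depth, while the post-pruned tree typically retains far fewer nodes than the fully grown one.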