Why is it necessary to avoid the dummy variable trap?

rajanikant-ghate · 20 June 2021 04:57

One-hot encoding creates exactly as many features as there are categories, each one a yes/no indicator for a single category. For example, with 4 categories, if a record doesn't belong to any of the first 3, then the 4th category must apply to that record.

Hence, for n categories, the last feature is 100% linearly dependent on the other n-1 features: it always equals 1 minus their sum.
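To make the dependence concrete, here is a minimal sketch with made-up data (the category codes below are hypothetical, and I'm building the one-hot matrix by hand with NumPy rather than using any particular encoder). It checks that the last column can be rebuilt from the others, and that adding an intercept column gives a design matrix whose rank is lower than its column count.

```python
import numpy as np

# Hypothetical data: 6 records, each in one of 4 categories (coded 0..3).
codes = np.array([0, 1, 2, 3, 1, 0])
n_categories = 4

# Full one-hot encoding: one 0/1 column per category.
one_hot = np.eye(n_categories, dtype=int)[codes]

# Every row sums to 1, so the last column is fixed by the other three.
last_from_rest = 1 - one_hot[:, :-1].sum(axis=1)
is_redundant = np.array_equal(last_from_rest, one_hot[:, -1])

# With an intercept column, the design matrix has 5 columns but lower rank.
intercept = np.ones(len(codes), dtype=int)
with_intercept = np.column_stack([intercept, one_hot])
rank_full = np.linalg.matrix_rank(with_intercept)

# Dropping one dummy column leaves 4 columns that are linearly independent.
reduced = np.column_stack([intercept, one_hot[:, 1:]])
rank_reduced = np.linalg.matrix_rank(reduced)

print(is_redundant)
print(with_intercept.shape[1], rank_full)
print(reduced.shape[1], rank_reduced)
```

Dropping the first column here is an arbitrary choice; dropping any one of the n dummy columns removes the dependence in the same way.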

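Now, the reason we remove highly correlated or redundant features is not to improve the results but to gain better model interpretability. With a perfectly redundant column, a linear model with an intercept has no unique set of coefficients, so the individual coefficients can't be read meaningfully.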
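A rough sketch of that point, assuming plain least squares on synthetic data (the data, the seed and the variable names are all made up for illustration): the fit with all n dummies and the fit with one dummy dropped give the same predictions, but in the full version a different coefficient vector produces exactly the same fit.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 40 records, 4 categories, target depends on the category.
codes = rng.integers(0, 4, size=40)
codes[:4] = [0, 1, 2, 3]  # make sure every category appears at least once
one_hot = np.eye(4)[codes]
y = np.array([1.0, 3.0, 5.0, 7.0])[codes] + rng.normal(scale=0.1, size=40)

ones = np.ones((40, 1))
X_full = np.hstack([ones, one_hot])         # intercept + all 4 dummies
X_drop = np.hstack([ones, one_hot[:, 1:]])  # intercept + 3 dummies

# Least-squares fits (lstsq still returns a solution for the rank-deficient case).
beta_full, *_ = np.linalg.lstsq(X_full, y, rcond=None)
beta_drop, *_ = np.linalg.lstsq(X_drop, y, rcond=None)

pred_full = X_full @ beta_full
pred_drop = X_drop @ beta_drop
same_predictions = np.allclose(pred_full, pred_drop)

# In the full model, shift the intercept up and every dummy coefficient down
# by the same amount: the coefficients change but the predictions do not.
beta_alt = beta_full + 2.5 * np.array([1.0, -1.0, -1.0, -1.0, -1.0])
same_fit_other_coefs = np.allclose(X_full @ beta_alt, pred_full)

print(same_predictions, same_fit_other_coefs)
```

So the predictions don't suffer from the extra column, but there is no single "right" coefficient for each category until one dummy is dropped and the remaining ones are read relative to that baseline.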