One-hot encoding creates exactly as many features as there are categories, each one a yes/no indicator for a single category. For example, if there are 4 categories and a record doesn’t belong to any of the first 3, then the 4th category must apply to that record.
Hence, for n categories, the last feature is 100% linearly dependent on the other n-1 features.
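To make that dependence concrete, here is a minimal sketch using pandas and NumPy. The `color` column and its values are made up for illustration. It checks that the last one-hot column can be rebuilt from the others, and that a design matrix with an intercept loses rank unless one category is dropped.

```python
import numpy as np
import pandas as pd

# Hypothetical toy column with 4 categories (illustrative data only).
colors = pd.Series(["red", "green", "blue", "yellow", "red", "blue"], name="color")

# Full one-hot encoding: one 0/1 column per category.
one_hot = pd.get_dummies(colors, dtype=int)

# The last column equals 1 minus the sum of the other three columns.
last = one_hot.columns[-1]
reconstructed = 1 - one_hot.drop(columns=last).sum(axis=1)
last_is_redundant = bool((reconstructed == one_hot[last]).all())

# With an intercept, the full encoding gives a rank-deficient design matrix.
design = np.column_stack([np.ones(len(one_hot)), one_hot.to_numpy()])
rank_full = np.linalg.matrix_rank(design)

# Dropping one category (n-1 columns) removes the redundancy.
reduced = pd.get_dummies(colors, drop_first=True, dtype=int)
design_reduced = np.column_stack([np.ones(len(reduced)), reduced.to_numpy()])
rank_reduced = np.linalg.matrix_rank(design_reduced)

print(last_is_redundant)
print(design.shape[1], rank_full)
print(design_reduced.shape[1], rank_reduced)
```

In the full encoding the rank is lower than the number of columns, while after `drop_first=True` the two match. Which category gets dropped is arbitrary; it simply becomes the baseline that the remaining indicators are compared against.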
Now, the reason we remove highly correlated or redundant features is not to improve the results, but to gain better model interpretability.