What is overfitting?

In simpler terms, overfitting means the model has learned too much from the training data, while underfitting means it has learned too little. A model should generalize, that is, it should be able to understand and make accurate predictions on data beyond the examples it was trained on.
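To make the contrast concrete, here is a minimal sketch (assuming NumPy is available; the data and polynomial degrees are illustrative choices, not from the original text). It fits polynomials of increasing degree to noisy samples of a sine curve: a low degree underfits, a moderate degree generalizes well, and a very high degree overfits, driving training error down while test error climbs.

```python
import numpy as np

# Illustrative data: noisy samples from a sine curve (hypothetical example).
rng = np.random.default_rng(0)
x_train = np.sort(rng.uniform(0, 1, 20))
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.2, 20)
x_test = np.sort(rng.uniform(0, 1, 200))
y_test = np.sin(2 * np.pi * x_test) + rng.normal(0, 0.2, 200)

for degree in (1, 4, 15):  # too simple, about right, too flexible
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree:2d}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")
```

Typically, the degree-15 fit reports the lowest training error but the highest test error, which is exactly the gap between memorizing the training data and generalizing from it.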

Imagine taking a multiple-choice exam. Let's say you get 70% of the answers correct the first time. After the exam, you study the questions you got wrong and memorize the right answers. While the memory is still fresh, you take the same exam again. This time you score much higher, say 95% of the questions answered correctly. You repeat the procedure again and again, and on the fourth attempt you score a perfect 100%.

Now, take a new test on the same subject but with different questions. Chances are you will score closer to 70% again rather than acing it on the first attempt. This is a prime example of overfitting: you learned the specifics of one set of questions, but that contributes little to understanding the subject matter, especially for topics you have not encountered before.
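The same pattern shows up in a model that is free to memorize its training data. As a rough sketch (assuming scikit-learn is available; the dataset and model are illustrative choices), an unpruned decision tree plays the role of the student re-taking the same exam: its training accuracy reaches 100%, while its accuracy on unseen questions lags behind.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hold out a test set so we can measure performance on "new questions".
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# No depth limit: the tree is free to memorize the training examples.
tree = DecisionTreeClassifier(random_state=0)
tree.fit(X_train, y_train)

print("training accuracy:", tree.score(X_train, y_train))  # typically 1.0
print("test accuracy:    ", tree.score(X_test, y_test))    # noticeably lower
```

The gap between the two scores is the signature of overfitting: perfect recall of the questions the model has already seen, weaker performance on the ones it has not.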