How can you avoid overfitting?

Overfitting can often be avoided by using a lot of data; it tends to happen when you have a small dataset and try to learn from it. But if you only have a small dataset and are forced to build a model from it, you can use a technique known as cross-validation. In this method the dataset is split into two sections, a training set and a testing set: the testing set is used only to evaluate the model, while the data points in the training set are used to build it.
In this technique, the model is trained on a dataset of known data (the training set) and then evaluated against a dataset of unseen data (the test set). The idea of cross-validation is to set aside a dataset to “test” the model during the training phase.
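
As a concrete illustration (my own sketch, not part of the original answer), here is a minimal Python example of a held-out test split plus k-fold cross-validation with scikit-learn; the dataset and model are arbitrary illustrative choices:

```python
# Minimal sketch: held-out test split + 5-fold cross-validation.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split, cross_val_score

X, y = load_iris(return_X_y=True)

# Hold out a test set that the model never sees during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LogisticRegression(max_iter=1000)

# 5-fold cross-validation on the training portion: each fold takes a
# turn as the "test" fold while the model trains on the rest.
scores = cross_val_score(model, X_train, y_train, cv=5)
print("CV accuracy per fold:", scores)

# Final check against the held-out test set.
model.fit(X_train, y_train)
print("Held-out test accuracy:", model.score(X_test, y_test))
```

Each fold gives an independent estimate of generalization, so a large gap between training accuracy and these cross-validation scores is an early warning sign of overfitting.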

Overfitting is a common problem in machine learning algorithms. When a model learns the noise in the data to the extent that it negatively affects its performance on new data, overfitting has occurred. Overfit models are too complex and fit the training data too well.

[Figure: the same data points fit by an overfit green curve and a well-fit black curve]

In the above diagram, you can see that the green curve tries to pass through every data point, which leads to overfitting. The black curve, on the other hand, fits the data well: it is neither very complex nor very simple.
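
To make the diagram concrete, here is a small sketch (an illustration of the same idea; the polynomial degrees are arbitrary stand-ins for the two curves):

```python
# A high-degree polynomial chases the noise (the "green curve"),
# while a low-degree fit captures the trend (the "black curve").
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 20)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.size)

overfit = np.polyfit(x, y, deg=12)  # bends toward almost every noisy point
goodfit = np.polyfit(x, y, deg=3)   # simpler model, smoother curve

# Error against the true underlying function exposes the overfit model.
x_new = np.linspace(0, 1, 200)
y_true = np.sin(2 * np.pi * x_new)
print("overfit MSE:", np.mean((np.polyval(overfit, x_new) - y_true) ** 2))
print("good fit MSE:", np.mean((np.polyval(goodfit, x_new) - y_true) ** 2))
```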

How to fix the overfitting problem

  1. By increasing the number of training examples, we can make the machine learning algorithm generalize better.
  2. By reducing the number of input features, we may solve the problem of overfitting. We can remove features manually or use algorithms that select features automatically.
  3. By increasing regularization (see the sketch after this list).
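
As a sketch that ties points 2 and 3 together (illustrative synthetic data and an assumed regularization strength), L1 regularization via scikit-learn’s Lasso both regularizes the model and selects features automatically by driving the coefficients of uninformative features to zero:

```python
# L1 regularization doubles as automatic feature selection.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
# Only the first two features actually matter.
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=100)

lasso = Lasso(alpha=0.1).fit(X, y)  # alpha controls regularization strength
print("coefficients:", np.round(lasso.coef_, 2))
# Most coefficients come out (near) zero: those features are dropped.
```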

Some very simple steps I can suggest for a novice DL enthusiast:

  1. Add an early stopping rule that minimizes the validation loss. Make sure to add a patience level, so that a slight fluctuation or noise doesn’t stop the model’s learning too early.
  2. Add a dropout parameter, typically 0.2 to 0.5. This means that some neurons (20% or 50%, depending on the parameter) are randomly dropped and do not update their weights on that step; a different random set is chosen at each step or epoch. Both steps are shown in the sketch after this list.
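
A minimal Keras sketch of both steps might look as follows; the architecture, data, and parameter values are illustrative assumptions rather than recommendations:

```python
# Early stopping with patience + dropout layers in a small Keras model.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Random placeholder data just so the sketch runs end to end.
X = np.random.rand(1000, 20).astype("float32")
y = np.random.randint(0, 2, size=(1000,)).astype("float32")

model = keras.Sequential([
    keras.Input(shape=(20,)),
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.3),  # randomly zero 30% of activations each training step
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.3),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=5,                  # tolerate 5 epochs without improvement
    restore_best_weights=True,   # roll back to the best epoch's weights
)
model.fit(X, y, validation_split=0.2, epochs=100,
          callbacks=[early_stop], verbose=0)
```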

You don’t correct overfitting, you prevent it. Below are some of the things you can do to prevent it:

  1. Use regularization methods (L1, L2, dropout, etc.).
  2. Keep the complexity of your model at a reasonable level.
  3. Use more data. If applicable, use augmentation (see the sketch after this list).
  4. If the training algorithm is iterative (such as gradient descent), monitor performance on a validation set and stop early when that performance starts to drop (or plateaus).
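
For point 3, here is a small sketch of image augmentation with Keras preprocessing layers (shapes and parameter values are illustrative assumptions), so each epoch sees slightly different versions of the same images:

```python
# Simple image augmentation pipeline with Keras preprocessing layers.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

augment = keras.Sequential([
    layers.RandomFlip("horizontal"),  # mirror left-right at random
    layers.RandomRotation(0.1),       # rotate up to +/-10% of a full turn
    layers.RandomZoom(0.1),           # zoom in/out by up to 10%
])

# Random placeholder images just so the sketch runs.
images = np.random.rand(8, 32, 32, 3).astype("float32")
augmented = augment(images, training=True)  # only active when training=True
print(augmented.shape)  # (8, 32, 32, 3)
```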