What is regularisation? Why is it useful?

An overfitted model is one that is too flexible: it fits the training data too closely and fails to generalize. To solve the problem of overfitting we need to constrain the flexibility of our model, but constraining it too much can also spoil the model (underfitting), so the flexibility should be tuned to an optimal value. Regularization is the technique we use to control this flexibility.

ref: https://medium.com/greyatom/what-is-underfitting-and-overfitting-in-machine-learning-and-how-to-deal-with-it-6803a989c76

There are three types of regularization techniques used to overcome overfitting:

a) L1 regularization (also called Lasso regularization/penalization)

b) L2 regularization (also called Ridge regularization/penalization)

c) Elastic net

Let’s look at each one in detail.

a) L1 regularization:

It stands for “least absolute shrinkage and selection operator”.

Its mathematical representation can be given as:

$$\min_{\beta}\ \sum_{i=1}^{n}\Big(y_i - \sum_{j=1}^{p} x_{ij}\beta_j\Big)^2 + \alpha \sum_{j=1}^{p}|\beta_j|$$

where the second term is the penalty $P = \alpha \sum_{j=1}^{p}|\beta_j|$.

To reduce the overfitting of our model, the penalty term ‘P’ is added to our existing cost function, and alpha is a hyperparameter that controls the strength of the penalty. The Lasso method overcomes a disadvantage of Ridge regression: rather than merely shrinking high values of the coefficients beta, it actually sets them to 0 when they are not relevant. Therefore you might end up with fewer features than the model you started with, which is a huge advantage. A small sketch of this effect follows below.

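As an illustration, here is a minimal scikit-learn sketch (the synthetic data and the alpha value are my own assumptions, chosen only for demonstration) showing Lasso zeroing out irrelevant coefficients:

```python
# Minimal sketch: Lasso drives irrelevant coefficients exactly to zero.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))   # 100 samples, 10 features
# Only the first three features carry signal; the rest are noise.
y = 3 * X[:, 0] - 2 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.1, size=100)

lasso = Lasso(alpha=0.1)         # alpha controls the penalty strength
lasso.fit(X, y)
print(lasso.coef_)               # coefficients of the noise features come out as 0.0
```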

Limitation of Lasso:

i) If p > n, the lasso selects at most n variables, where ‘p’ is the number of predictors (features) and ‘n’ is the number of samples. The number of selected variables is therefore bounded by the number of samples (see the sketch after this list).

ii) Grouped variables: the lasso fails to do grouped selection. Among a group of correlated variables, it tends to select one and ignore the others.
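A quick numerical check of limitation (i), using arbitrary synthetic data of my own choosing:

```python
# With p > n, the lasso solution has at most n nonzero coefficients.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p = 10, 50                       # far more features than samples
X = rng.normal(size=(n, p))
y = rng.normal(size=n)

lasso = Lasso(alpha=0.01, max_iter=100_000).fit(X, y)
print(np.sum(lasso.coef_ != 0), "<=", n)   # selected variables never exceed n
```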


b) L2 regularization:

It is also known as ‘Tikhonov regularization’.

Mathematically, it is given as:

$$\min_{\beta}\ \sum_{i=1}^{n}\Big(y_i - \sum_{j=1}^{p} x_{ij}\beta_j\Big)^2 + \alpha \sum_{j=1}^{p}\beta_j^{2}$$

The ‘P’ here is the regularization term $\alpha \sum_{j}\beta_j^{2}$ added to the cost function. The effect of this regularization is that it forces the beta coefficients to be lower, but it does not force them to zero. That is, it will not get rid of features that are not suitable, but will rather minimize their impact on the trained model. A small sketch follows below.

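Here is the matching scikit-learn sketch (again with assumed synthetic data) showing that Ridge shrinks coefficients without zeroing them:

```python
# Minimal sketch: Ridge shrinks all coefficients but (typically) keeps them nonzero.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=100)

ridge = Ridge(alpha=1.0)   # alpha controls the strength of the squared penalty
ridge.fit(X, y)
print(ridge.coef_)         # noise features get small, but not exactly zero, weights
```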

c) Elastic net:

Elastic net regularization works by applying both penalties, i.e. Lasso (L1) and Ridge (L2), together. Hence, it often works better than either one individually.

The elastic net solution path is piecewise linear. The mathematical representation is given as:

$$\min_{\beta}\ \sum_{i=1}^{n}\Big(y_i - \sum_{j=1}^{p} x_{ij}\beta_j\Big)^2 + \lambda_1 \sum_{j=1}^{p}|\beta_j| + \lambda_2 \sum_{j=1}^{p}\beta_j^{2}$$

The advantage of elastic net regularization is that it overcomes the limitations of both Lasso and Ridge to some extent. A sketch follows below.
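A minimal scikit-learn sketch of the combined penalty (the data and parameter values are illustrative assumptions):

```python
# Minimal sketch: ElasticNet mixes the L1 and L2 penalties.
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=100)

# alpha scales the overall penalty; l1_ratio balances L1 vs. L2
# (l1_ratio=1.0 is pure Lasso, l1_ratio=0.0 is pure Ridge).
enet = ElasticNet(alpha=0.1, l1_ratio=0.5)
enet.fit(X, y)
print(enet.coef_)
```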

Regularizations are techniques used to reduce the error by fitting a function appropriately on the given training set and to avoid overfitting.
ref: https://towardsdatascience.com/regularization-an-important-concept-in-machine-learning-5891628907ea

Regularisation utilizes your knowledge of the problem to improve the solution. It reduces the solution space by adding new assumptions about your data/problem.

In general, the exact solution lies in a big space B1. If you are working with a particular problem/dataset, then your solution will lie in a smaller space B2. Utilizing that knowledge via regularisation will give a better solution, sometimes faster, etc.

Take image denoising, for example. You want your final output to not be too different from your noisy image → that gives you the data fidelity term.

You also know that two nearby pixels should be very similar (the smoothness property) → that gives you the regularisation term (Tikhonov, total variation, etc.).
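To make the denoising example concrete, here is a minimal sketch (a 1-D signal and a hand-picked λ, both my own assumptions) of Tikhonov-regularised denoising: minimize ‖x − y‖² + λ‖Dx‖², where D is the finite-difference operator. Setting the gradient to zero gives the linear system (I + λDᵀD)x = y.

```python
# Minimal sketch: Tikhonov-regularised denoising of a 1-D signal.
# Objective: ||x - y||^2 + lam * ||D x||^2, with D the finite-difference operator.
# Its minimizer solves the linear system (I + lam * D^T D) x = y.
import numpy as np

def tikhonov_denoise(y, lam):
    n = len(y)
    D = np.diff(np.eye(n), axis=0)        # (Dx)_i = x[i+1] - x[i]
    A = np.eye(n) + lam * D.T @ D
    return np.linalg.solve(A, y)

rng = np.random.default_rng(0)
clean = np.sin(np.linspace(0, 4 * np.pi, 200))    # smooth ground truth
noisy = clean + rng.normal(scale=0.3, size=200)   # the data fidelity target y
denoised = tikhonov_denoise(noisy, lam=5.0)
# The smoothness prior pulls the estimate back toward the clean signal:
print(np.abs(denoised - clean).mean(), "<", np.abs(noisy - clean).mean())
```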