Lasso and Ridge regression
In Ridge Regression, the OLS loss function is augmented so that we not only minimize the sum of squared residuals but also penalize the size of the parameter estimates, shrinking them towards zero:

L(β) = (Y − Xβ)′(Y − Xβ) + λβ′β

Solving this for β̂ gives the ridge regression estimates:

β̂_ridge = (X′X + λI)⁻¹ X′Y,

where I denotes the identity matrix.
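The closed-form estimate above can be sketched directly with NumPy. This is a minimal illustration on synthetic data; the names (beta_ridge, beta_ols, lam) and the data-generating setup are assumptions, not part of the original text:

```python
import numpy as np

# Synthetic data (illustrative): n observations, p predictors
rng = np.random.default_rng(0)
n, p = 100, 3
X = rng.normal(size=(n, p))
beta_true = np.array([2.0, -1.0, 0.5])
y = X @ beta_true + rng.normal(scale=0.1, size=n)

lam = 1.0  # regularization penalty λ
# Closed-form ridge estimate: solve (X'X + λI) β = X'y
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
# λ = 0 recovers the ordinary least squares solution
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
```

Using np.linalg.solve on the normal equations avoids forming the explicit matrix inverse, which is both cheaper and numerically more stable. Because every eigenvalue of X′X is inflated by λ, the ridge coefficient vector is always shrunk relative to the OLS one.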
The tuning parameter λ ≥ 0 controls the strength of the regularization penalty: λ = 0 recovers the OLS estimates, while larger values shrink the coefficients further towards zero.
Ridge regression assumes the predictors are standardized and the response is centered, so that the penalty treats all coefficients on a common scale and the (unpenalized) intercept drops out.
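A minimal sketch of this preprocessing step, assuming two hypothetical predictors on very different scales (all variable names and data here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
x1 = rng.normal(size=n)           # predictor on scale ~1
x2 = 1000.0 * rng.normal(size=n)  # predictor on scale ~1000
X_raw = np.column_stack([x1, x2])
y = x1 + 0.002 * x2 + rng.normal(size=n)

# Standardize the predictors and center the response before penalizing,
# so the penalty λ β'β treats both coefficients on a common scale
X = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0)
y_c = y - y.mean()

lam = 5.0
beta = np.linalg.solve(X.T @ X + lam * np.eye(2), X.T @ y_c)
# On the standardized scale both coefficients are of comparable size,
# so the penalty shrinks them comparably
```

Without this step, the coefficient on the large-scale predictor would be tiny in raw units and the penalty would barely touch it, while the coefficient on the small-scale predictor would be shrunk much more aggressively.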