Explain the Gradient Descent algorithm with respect to linear regression

Gradient descent is a first-order iterative optimization algorithm. In linear regression, it is used to minimize the cost function, that is, to find the values of the coefficients (the θs, often also written as βs) at which the cost function attains its minimum.

Gradient descent works much like a ball rolling down a slope (ignoring inertia). The ball moves in the direction of steepest descent, i.e., opposite to the gradient, and comes to rest on a flat region, which corresponds to a minimum of the cost function.

Now, let’s understand it mathematically:

The main objective of gradient descent for linear regression is to solve the following minimization problem:

ArgMin J(θ0, θ1), where J(θ0, θ1) represents the cost function of the linear regression. In its usual mean-squared-error form, it is given by:

$$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h(x_i) - y_i \right)^2$$

Here, h is the linear hypothesis model, defined as h(x) = θ0 + θ1x,

y is the target column or output, and m is the number of data points in the training set.
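To make the cost function concrete, here is a minimal sketch in plain Python. It assumes the common 1/(2m) scaling of the squared error (1/m is also widely used); the names `hypothesis` and `cost` and the toy data are made up for illustration only.

```python
def hypothesis(theta0, theta1, x):
    """Linear hypothesis h(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x


def cost(theta0, theta1, xs, ys):
    """Cost J(theta0, theta1) = 1/(2m) * sum((h(x_i) - y_i)^2).

    The 1/(2m) scaling is an assumed convention, not the only one.
    """
    m = len(xs)
    squared_errors = (
        (hypothesis(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys)
    )
    return sum(squared_errors) / (2 * m)


# Toy data lying exactly on the line y = 1 + 2x (illustrative only).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

print(cost(1.0, 2.0, xs, ys))  # perfect fit  -> 0.0
print(cost(0.0, 0.0, xs, ys))  # poor fit     -> 10.5
```

The better the line fits the data, the smaller J becomes, which is why minimizing J yields the best-fitting θ0 and θ1.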

Steps of Gradient Descent Algorithm:

Step-1: Gradient descent starts with a random initial solution, i.e., random starting values for θ0 and θ1.

Step-2: The solution is updated by moving against the direction of the gradient, to a new value where the cost function is lower.

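The updated value of each parameter is given by the formula:

$$\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1), \quad j = 0, 1$$

where α is the learning rate (step size), and θ0 and θ1 are updated simultaneously.

Repeat until convergence (i.e., until the loss function reaches its minimum).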
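The steps above can be sketched in plain Python as follows. This is an illustrative sketch, not a reference implementation: it assumes the 1/(2m) squared-error cost, a fixed learning rate `alpha`, and a stopping rule based on the change in cost. The function name `gradient_descent`, its parameters, and the toy data are hypothetical.

```python
import random


def gradient_descent(xs, ys, alpha=0.1, max_iters=10000, tol=1e-12, seed=0):
    """Fit h(x) = theta0 + theta1 * x by batch gradient descent."""
    m = len(xs)
    rng = random.Random(seed)

    # Step 1: start from a random solution.
    theta0, theta1 = rng.uniform(-1, 1), rng.uniform(-1, 1)

    def cost(t0, t1):
        # Assumed cost: J = 1/(2m) * sum((h(x_i) - y_i)^2).
        return sum((t0 + t1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

    previous_cost = cost(theta0, theta1)
    for _ in range(max_iters):
        # Prediction errors h(x_i) - y_i at the current parameters.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]

        # Partial derivatives of J with respect to theta0 and theta1.
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m

        # Step 2: move against the gradient, updating both parameters together.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1

        # Repeat until convergence: stop once the cost barely changes.
        current_cost = cost(theta0, theta1)
        if abs(previous_cost - current_cost) < tol:
            break
        previous_cost = current_cost

    return theta0, theta1


# Toy data lying exactly on y = 1 + 2x (illustrative only).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
theta0, theta1 = gradient_descent(xs, ys)
print(theta0, theta1)  # should be close to 1 and 2 for this data
```

The learning rate matters here: if `alpha` is too large the updates can overshoot and diverge, and if it is too small convergence becomes very slow, so in practice it is usually tuned for the data at hand.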