Ordinary Least Squares (OLS)

Regression means finding the best-fit line or curve for your numerical data: a functional approximation of the data. In other words, you want a function that maps your input data to the output data (the target). This mapping function is written as:

Ŷ = W*X + B

where B is the intercept, W is the slope of the line, and Ŷ is the predicted output. The optimal values of W and B need to be found to obtain the best-fit line.
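To make the mapping concrete, here is a minimal sketch of the prediction Ŷ = W*X + B in Python. The values of w and b below are hypothetical, chosen only for illustration:

```python
def predict(x, w=2.0, b=1.0):
    """Return the predicted output Y_hat = w*x + b for input x."""
    return w * x + b

print(predict(3.0))  # 2.0 * 3.0 + 1.0 = 7.0
```

Fitting the model is just the search for the w and b that make these predictions match the observed targets as closely as possible.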

Ordinary Least Squares is an analytical solution to this linear regression problem. Analytical means the exact solution is computed in closed form from a formula, rather than approximated iteratively.
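For a single predictor, the closed-form OLS formulas are W = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and B = ȳ − W·x̄. A pure-Python sketch of this analytical solution (the sample data is made up so the answer is exact):

```python
def ols_fit(xs, ys):
    """Closed-form OLS for one predictor:
    w = sum((x - x_mean)(y - y_mean)) / sum((x - x_mean)^2)
    b = y_mean - w * x_mean
    """
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    w = sxy / sxx
    b = y_mean - w * x_mean
    return w, b

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # generated exactly by y = 2x + 1
print(ols_fit(xs, ys))     # (2.0, 1.0)
```

No iteration is involved: the formulas produce the exact minimizer of the squared error in one pass over the data.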

However, OLS is often impractical: the closed-form solution does not scale to huge amounts of data and does not carry over to all algorithms. We therefore 'approximate' the OLS solution with an iterative method that moves toward the optimum step by step.
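One such iterative method is gradient descent on the mean squared error; a minimal sketch follows. The learning rate and step count are hypothetical tuning choices, and the data is made up so the target values (w ≈ 2, b ≈ 1) are known:

```python
def gd_fit(xs, ys, lr=0.01, steps=10000):
    """Iteratively approximate the OLS solution by gradient descent
    on MSE = (1/n) * sum((w*x + b - y)^2)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of the MSE with respect to w and b.
        grad_w = (2 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = (2 / n) * sum((w * x + b - y) for x, y in zip(xs, ys))
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # y = 2x + 1
w, b = gd_fit(xs, ys)
```

Unlike the closed form, each step only needs a pass over the data (or a mini-batch of it), which is why this family of methods scales to problems where the analytical solution is too expensive.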

Under the classical Gauss-Markov assumptions, OLS is the best linear unbiased estimator (BLUE) of the coefficients of a linear regression. There are other methods as well, which sometimes work as well as OLS, but they have different drawbacks.

The core of OLS lies in minimizing the sum of squared residuals, (y - Xb)^2, where y and b are vectors and X is a matrix of regressors. Note that both y and X are known (measured observations, given data, whatever you want to call them, they are known to you and that's my point). b is the unknown variable, and by minimizing this sum of squares you are minimizing the errors of the model: y = Xb + error.
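In matrix form the minimizer has the well-known closed form b = (XᵀX)⁻¹Xᵀy (the normal equations). A hand-rolled sketch for a two-column design matrix (an intercept column plus one regressor), with made-up data so the answer is exact; this is for illustration, not a production solver:

```python
def ols_normal_equations(X, y):
    """Solve b = (X^T X)^{-1} X^T y for a two-column design matrix."""
    # Accumulate X^T X (2x2) and X^T y (length-2 vector).
    xtx = [[0.0, 0.0], [0.0, 0.0]]
    xty = [0.0, 0.0]
    for row, yi in zip(X, y):
        for i in range(2):
            xty[i] += row[i] * yi
            for j in range(2):
                xtx[i][j] += row[i] * row[j]
    # Invert the 2x2 matrix X^T X directly.
    det = xtx[0][0] * xtx[1][1] - xtx[0][1] * xtx[1][0]
    inv = [[ xtx[1][1] / det, -xtx[0][1] / det],
           [-xtx[1][0] / det,  xtx[0][0] / det]]
    # Multiply (X^T X)^{-1} by X^T y to get b.
    return [inv[0][0] * xty[0] + inv[0][1] * xty[1],
            inv[1][0] * xty[0] + inv[1][1] * xty[1]]

X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]]  # first column = intercept
y = [3.0, 5.0, 7.0, 9.0]                              # y = 1 + 2x
print(ols_normal_equations(X, y))                     # ~[1.0, 2.0]
```

The first column of ones is what turns the intercept B into just another coefficient of b, which is why the matrix form y = Xb covers the earlier Ŷ = W*X + B as a special case.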