Gradient descent is an optimization algorithm used to find the values of the parameters (coefficients) of a function (f) that minimize a cost function (cost). It is a method for optimizing continuous functions, and it is best used when the parameters cannot be calculated analytically (e.g. using linear algebra) and must instead be found by iterative search.
It is an iterative method: begin at an arbitrary point in the domain, compute the gradient of the function at that point, then move the current point a step along that direction (opposite the gradient when minimizing, along it when maximizing). Repeat until the point is sufficiently close to a local optimum.
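The loop above can be sketched in a few lines. This is a minimal illustration, not a production implementation: the function f(x) = (x - 3)^2, the starting point, the step size, and the iteration count are all illustrative choices.

```python
def grad(x):
    # Gradient of f(x) = (x - 3)^2 is f'(x) = 2 * (x - 3)
    return 2.0 * (x - 3.0)

x = 0.0        # arbitrary starting point in the domain
step = 0.1     # fixed step size (often called the learning rate)

for _ in range(100):
    x -= step * grad(x)   # step opposite the gradient, since we are minimizing

print(x)  # approaches the true minimizer x = 3
```

Each iteration multiplies the distance to the minimizer by a constant factor less than one, so the point converges geometrically to x = 3 for this quadratic.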
Different methods exist for deciding how large a step to take. A straightforward approach is to do a line search for a local optimum along the step direction and move directly to that optimum. Another uses a predetermined schedule of step sizes that decays exponentially over the iterations.
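The decaying-schedule idea can be sketched as follows, again on the illustrative quadratic f(x) = (x - 3)^2; the initial step size and the decay factor 0.99 are assumptions chosen for demonstration, not canonical values.

```python
def grad(x):
    # Gradient of f(x) = (x - 3)^2
    return 2.0 * (x - 3.0)

x = 0.0
step = 0.2           # initial step size

for _ in range(200):
    x -= step * grad(x)
    step *= 0.99     # shrink the step geometrically each iteration

print(x)  # close to the minimizer x = 3
```

The schedule must decay slowly enough that the steps still cover the distance to the optimum; if the step sizes shrink too fast, the iterates can stall before reaching it.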
Gradient descent is famously used to train machine learning models such as neural networks, where the cost function measures the model's prediction error.