What is the difference between stochastic gradient descent (SGD) and gradient descent (GD)?

priyanka-gaikwad-9f6e5281 · 11 August 2020 12:05

SGD and GD

ruble-joseph · 14 August 2020 08:11

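Both algorithms are iterative methods for finding a set of parameters that minimizes a loss function. Each one computes the gradient of the loss with respect to the parameters on some training data, then adjusts the parameters a small step in the opposite direction.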

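In standard (batch) gradient descent, you’ll evaluate all training samples before each parameter update. Every step uses the exact gradient of the training loss but costs a full pass over the data, so this is akin to taking big, slow steps toward the solution.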

In stochastic gradient descent, you’ll evaluate only 1 training sample (usually picked at random) before updating the parameters. Every step is cheap but uses a noisy estimate of the gradient, so this is akin to taking small, quick steps toward the solution.
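
To make the contrast concrete, here is a minimal sketch in plain Python. Everything in it is an assumption chosen for illustration, not something from the question: the toy data (points on the line y = 2x + 1), the squared-error loss, the learning rates, and the helper names `gradient`, `gd`, and `sgd`. The only real difference between the two loops is how many samples feed each update.

```python
import random

def gradient(w, b, samples):
    """Mean gradient of the loss 0.5 * (w*x + b - y)**2 over `samples`."""
    n = len(samples)
    gw = sum((w * x + b - y) * x for x, y in samples) / n
    gb = sum((w * x + b - y) for x, y in samples) / n
    return gw, gb

def gd(data, lr=0.5, epochs=500):
    """Batch gradient descent: one update per full pass over the data."""
    w, b, updates = 0.0, 0.0, 0
    for _ in range(epochs):
        gw, gb = gradient(w, b, data)        # uses ALL samples
        w, b = w - lr * gw, b - lr * gb
        updates += 1
    return w, b, updates

def sgd(data, lr=0.1, epochs=500, seed=0):
    """Stochastic gradient descent: one update per single sample."""
    rng = random.Random(seed)
    data = list(data)                        # copy so the caller's list is not shuffled
    w, b, updates = 0.0, 0.0, 0
    for _ in range(epochs):
        rng.shuffle(data)                    # visit samples in random order
        for sample in data:
            gw, gb = gradient(w, b, [sample])  # uses ONE sample
            w, b = w - lr * gw, b - lr * gb
            updates += 1
    return w, b, updates

# Hypothetical toy data: 11 noise-free points on y = 2x + 1.
data = [(i / 10, 2 * (i / 10) + 1) for i in range(11)]

w_gd, b_gd, n_gd = gd(data)
w_sgd, b_sgd, n_sgd = sgd(data)
print(f"GD : w={w_gd:.4f}, b={b_gd:.4f}, updates={n_gd}")
print(f"SGD: w={w_sgd:.4f}, b={b_sgd:.4f}, updates={n_sgd}")
```

For the same number of passes over the data, GD makes one update per pass while SGD makes one per sample (11 times as many here). On this noise-free toy problem both should end up close to w = 2, b = 1; on real, noisy data the SGD path tends to jitter around the minimum unless the learning rate is decreased over time.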