What is the difference between back-propagation and forward-propagation?

Forward pass

Propagating the computations of all neurons in all layers, moving from input to output ("left to right"). It starts by feeding your feature vectors/tensors into the input layer and ends with the final prediction produced by the output layer. A forward pass is performed during training, to evaluate the objective/loss function under the current network parameters at each iteration, and during inference (prediction after training), when the trained network is applied to new/unseen data.
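As a minimal sketch, a forward pass through a two-layer network might look like this (the layer sizes, sigmoid activations, and random weights here are illustrative assumptions, not anything prescribed by the answer):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, b1, W2, b2):
    """Propagate an input vector left-to-right through two layers."""
    h = sigmoid(W1 @ x + b1)       # hidden-layer activations
    y_hat = sigmoid(W2 @ h + b2)   # output-layer prediction
    return y_hat

rng = np.random.default_rng(0)
x = rng.normal(size=3)                           # a feature vector with 3 inputs
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)    # input -> hidden parameters
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)    # hidden -> output parameters
print(forward(x, W1, b1, W2, b2))                # the network's final prediction
```

The same `forward` function serves both roles described above: during training it produces the prediction the loss is computed from, and after training it is all you need for inference.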

Backward pass

Known as back-propagation, or "backprop", this step is executed during training to compute the gradient of the objective/loss function with respect to the network's parameters, so that they can be updated in a single iteration of some form of gradient descent (Adam, RMSProp, etc.). It is named as such because, when viewing a neural network as a computation graph, it starts by computing the derivatives of the objective/loss function at the output layer and propagates them back towards the input layer (effectively, this is the chain rule from calculus in action) in order to compute derivatives for, and make updates to, the parameters of every layer.
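A hedged sketch of that chain-rule traversal, for a two-layer sigmoid network with a squared-error loss (the architecture and loss are illustrative choices, not from the answer): the forward pass caches each layer's activations, then derivatives flow from the output layer back to the input layer.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward_backward(x, y, W1, b1, W2, b2):
    """One forward pass, then backprop of the squared-error loss."""
    # Forward pass: cache intermediate activations for the backward pass.
    h = sigmoid(W1 @ x + b1)
    y_hat = sigmoid(W2 @ h + b2)
    loss = 0.5 * np.sum((y_hat - y) ** 2)

    # Backward pass: chain rule, starting at the output layer.
    d_out = (y_hat - y) * y_hat * (1 - y_hat)   # dLoss/d(output pre-activation)
    dW2 = np.outer(d_out, h)                    # gradient for output-layer weights
    db2 = d_out
    d_hid = (W2.T @ d_out) * h * (1 - h)        # propagated back to the hidden layer
    dW1 = np.outer(d_hid, x)                    # gradient for hidden-layer weights
    db1 = d_hid
    return loss, (dW1, db1, dW2, db2)
```

Note how `d_hid` reuses `d_out`: each layer's derivative is built from the layer after it, which is exactly why the computation runs backwards through the graph.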

In neural networks, you forward propagate to get the output and compare it with the real value to get the error.

Now, to minimize the error, you propagate backwards: find the derivative of the error with respect to each weight, and then subtract that value, scaled by a learning rate, from the weight.
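That update is just one line; here is a minimal sketch (the learning-rate factor is an assumption of this example, controlling how large each step is):

```python
def gradient_step(weight, grad, learning_rate=0.1):
    """Move the weight opposite the error's derivative w.r.t. that weight."""
    return weight - learning_rate * grad

# For example, if the error is E(w) = w**2, then dE/dw at w = 4.0 is 8.0:
w = gradient_step(4.0, 8.0)   # 4.0 - 0.1 * 8.0 = 3.2
```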

The basic learning that has to be done in a neural network is training its neurons when to activate: each neuron should activate only for particular types of inputs, not all of them. Therefore, by propagating forward you see how well your neural network is behaving and find the error. Once you know the network has error, you back-propagate and use a form of gradient descent to compute updated weight values. Then you forward-propagate again to see how well those weights perform, and back-propagate again to update them. This goes on until the error reaches some minimum.
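The whole loop described above can be sketched end-to-end with a single sigmoid neuron learning to activate only for one input. Everything here is an illustrative assumption (toy data, cross-entropy error whose sigmoid gradient conveniently simplifies to prediction minus target, learning rate of 1.0):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = np.array([0.0, 1.0])   # two one-feature examples
y = np.array([0.0, 1.0])   # the neuron should activate only for x = 1
w, b, lr = 0.0, 0.0, 1.0   # weight, bias, learning rate

for step in range(2000):
    y_hat = sigmoid(w * X + b)   # forward pass: current predictions
    d = y_hat - y                # compare with the real values (cross-entropy
                                 # gradient w.r.t. each pre-activation)
    w -= lr * np.sum(d * X)      # backward pass: gradient-descent updates
    b -= lr * np.sum(d)

preds = sigmoid(w * X + b)       # predictions end up close to the targets [0, 1]
print(np.round(preds, 3))
```

Each pass through the loop body is one forward propagation followed by one backward propagation, exactly the alternation the paragraph describes; training stops here after a fixed number of iterations, but in practice you would stop once the error is low enough.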