What is a Learning Rate Finder?

board-infinity · 15 October 2022 08:51

In machine learning and statistics, the learning rate is a tuning parameter in an optimization algorithm that determines the step size at each iteration while moving toward a minimum of a loss function.
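
To make the role of the step size concrete, here is a minimal sketch of plain gradient descent on a toy loss, f(w) = w**2. The function name `gradient_descent`, the starting point, and the step count are illustrative assumptions, not part of any library.

```python
def gradient_descent(lr, steps=50, w0=5.0):
    """Minimize the toy loss f(w) = w**2 and return the final w.

    lr is the learning rate: it scales how far w moves on each iteration.
    """
    w = w0
    for _ in range(steps):
        grad = 2.0 * w     # derivative of w**2
        w = w - lr * grad  # step against the gradient, scaled by lr
    return w


for lr in (1e-5, 0.1, 1.5):
    print(f"lr={lr}: final w = {gradient_descent(lr):.4g}")
```

With a very small rate the weight barely moves from its starting value, with a moderate rate it approaches the minimum at 0, and with a rate that is too large each step overshoots and the weight grows without bound.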

The automatic learning rate finder algorithm works like this:

Step 1: We start by defining an upper and lower bound on our learning rate. The lower bound should be very small (1e-10) and the upper bound should be very large (1e+1).
At 1e-10 the learning rate will be too small for our network to learn, while at 1e+1 the learning rate will be too large and training will diverge: each update overshoots the minimum and the loss grows instead of shrinking.
Both of these are okay, and in fact, that’s what we hope to see!
Step 2: We then start training our network, starting at the lower bound.
After each batch update, we exponentially increase our learning rate.
We log the loss after each batch update as well.
Step 3: Training continues, with the learning rate increasing after every batch, until we hit our maximum learning rate value.
Typically, this entire training process/learning rate increase only takes 1-5 epochs.
Step 4: After training is complete, we plot the smoothed loss against the learning rate used for each batch, enabling us to see where the learning rate is both:
Just large enough for the loss to start decreasing
And too large, to the point where the loss starts to increase.
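
The four steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not a definitive implementation: the toy model (y = w * x with a true weight of 3.0), the helper names `lr_schedule` and `lr_range_test`, the smoothing factor `beta`, and the rule of stopping early once the smoothed loss exceeds four times its best value are all choices made for this example.

```python
import random


def lr_schedule(lr_min, lr_max, num_batches):
    """Return num_batches learning rates spaced exponentially from lr_min to lr_max."""
    factor = (lr_max / lr_min) ** (1.0 / (num_batches - 1))
    return [lr_min * factor ** i for i in range(num_batches)]


def lr_range_test(lr_min=1e-10, lr_max=1e+1, num_batches=200,
                  batch_size=32, beta=0.9, seed=0):
    """Sweep the learning rate on a toy model y = w * x (true w is 3.0).

    Returns the learning rates tried and the smoothed loss logged at each one.
    """
    rng = random.Random(seed)
    w = 0.0
    avg, best = 0.0, float("inf")
    lrs, losses = [], []
    for i, lr in enumerate(lr_schedule(lr_min, lr_max, num_batches)):
        # Draw one synthetic mini-batch and compute its loss and gradient.
        xs = [rng.uniform(-10.0, 10.0) for _ in range(batch_size)]
        errs = [w * x - 3.0 * x for x in xs]
        loss = sum(e * e for e in errs) / batch_size
        grad = sum(2.0 * e * x for e, x in zip(errs, xs)) / batch_size

        # Smooth the loss with a bias-corrected exponential moving average.
        avg = beta * avg + (1.0 - beta) * loss
        smoothed = avg / (1.0 - beta ** (i + 1))
        lrs.append(lr)
        losses.append(smoothed)

        # Stop early once the smoothed loss has clearly blown up.
        if smoothed > 4.0 * best:
            break
        best = min(best, smoothed)

        w -= lr * grad  # one batch update at the current learning rate
    return lrs, losses


lrs, losses = lr_range_test()
lr_at_min = lrs[losses.index(min(losses))]
print(f"tried {len(lrs)} learning rates; smoothed loss was lowest near {lr_at_min:.3g}")
```

Plotting `losses` against `lrs` with a logarithmic x-axis gives the curve described in Step 4. Because the smoothed loss lags behind the raw loss, the rate at its lowest point is usually already close to the unstable region, so a value somewhat below it tends to be the safer pick.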