What Are Differential Learning Rates?

Differential learning rates mean that different parts of the network are trained with different learning rates: the later, task-specific layers are updated with larger learning rates, while the earlier layers, which already encode general features from pretraining, are updated with smaller ones and therefore change less during training. Building deep learning models on top of pre-trained architectures is a proven way to get much better results in computer vision tasks.
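To make the idea concrete, here is a minimal sketch of differential learning rates in plain PyTorch rather than fastai (the fastai code follows below). The model choice, the grouping of layers, and the SGD settings are my own assumptions for illustration; the 1/9, 1/3, 1 ratios simply mirror the example that comes next. Each optimizer parameter group gets its own learning rate.

import torch
from torchvision import models

model = models.resnet34(pretrained=True)
base_lr = 0.2

optimizer = torch.optim.SGD([
    # earliest layers: generic features, smallest learning rate
    {'params': list(model.conv1.parameters()) + list(model.bn1.parameters()) +
               list(model.layer1.parameters()) + list(model.layer2.parameters()),
     'lr': base_lr / 9},
    # middle layers: moderate learning rate
    {'params': list(model.layer3.parameters()) + list(model.layer4.parameters()),
     'lr': base_lr / 3},
    # newly added head: full learning rate
    {'params': model.fc.parameters(), 'lr': base_lr},
], lr=base_lr, momentum=0.9)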

Depending on the task at hand and the available data and compute, one should choose an appropriate transfer learning approach. In general, if we have a good amount of data and resources, fine-tuning the whole network with differential learning rates will yield better results than training only the newly added layers.

import numpy as np

# `learn` is assumed to be an existing fastai (v0.7) learner built on a pretrained model.
# Initially, train only the new fully connected layers with a single learning rate
lr = 0.2
learn.fit(lr, 3, cycle_len=1, cycle_mult=2)

# Then unfreeze all layers and use differential learning rates across layer groups:
# the earliest layers get lr/9, the middle layers lr/3, and the head the full lr
lrs = np.array([lr/9, lr/3, lr])
learn.unfreeze()
learn.fit(lrs, 3, cycle_len=1, cycle_mult=2)
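The snippet above uses the old fastai v0.7 API. For reference, a rough equivalent with the current fastai API might look like the sketch below; `dls` is assumed to be an already-built DataLoaders object, and passing a slice as the learning rate is what spreads different rates across the layer groups.

from fastai.vision.all import *

learn = vision_learner(dls, resnet34, metrics=error_rate)
lr = 0.2  # same base rate as above
learn.fit_one_cycle(3, lr)               # train only the new head first
learn.unfreeze()                         # make every layer trainable
learn.fit_one_cycle(3, slice(lr/9, lr))  # smaller rates for earlier layer groups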