What is Focal Loss in Object Detection?

Focal Loss (FL) is an improved version of Cross-Entropy Loss (CE) that tries to handle the class imbalance problem by assigning more weights to hard or easily misclassified examples (i.e. background with noisy texture or partial object or the object of our interest ) and to down-weight easy examples (i.e. Background objects).

Why Focal Loss Needed?

Both classic one stage detection methods, like boosted detectors, DPM & more recent methods like SSD evaluate almost 104 to 105 candidate locations per image but only a few locations contain objects (i.e. Foreground) and rest are just background objects. This leads to the class imbalance problem.

This imbalance causes two problems –

  1. Training is inefficient as most locations are easy negatives (meaning that they can be easily classified by the detector as background) that contribute no useful learning.
  2. Since easy negatives (detections with high probabilities) account for a large portion of inputs. Although they result in small loss values individually but collectively, they can overwhelm the loss & computed gradients and can lead to degenerated models.