How do you deal with unbalanced binary classification?

There are a number of ways to handle unbalanced binary classification (assuming that you want to identify the minority class):

  • First, reconsider the metrics you use to evaluate your model. Accuracy can be misleading on unbalanced data. For example, suppose 99 bank withdrawals were not fraudulent and 1 withdrawal was. A model that simply classified every instance as “not fraudulent” would have an accuracy of 99% while catching none of the fraud. Metrics such as precision and recall, computed for the minority class, are therefore more informative.
  • Second, you can increase the cost of misclassifying the minority class. With a heavier penalty on those errors, the model is pushed to classify the minority class more accurately.
  • Lastly, you can improve the balance of classes by oversampling the minority class or by undersampling the majority class.
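
The metric and resampling points above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the helper names (`accuracy`, `precision_recall`, `oversample_minority`) are hypothetical, label 1 is assumed to mark the minority (fraudulent) class, and precision or recall is reported as 0.0 when its denominator is zero, which is a convention rather than a rule.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for the `positive` (minority) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Convention (an assumption): report 0.0 when the denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def oversample_minority(rows, labels, minority=1, seed=0):
    """Random oversampling: duplicate minority rows until classes are equal."""
    rng = random.Random(seed)
    minority_rows = [r for r, l in zip(rows, labels) if l == minority]
    if not minority_rows:
        return list(rows), list(labels)
    n_majority = len(labels) - len(minority_rows)
    n_extra = max(0, n_majority - len(minority_rows))
    extra = [rng.choice(minority_rows) for _ in range(n_extra)]
    return list(rows) + extra, list(labels) + [minority] * n_extra


# The example from the text: 99 legitimate withdrawals, 1 fraudulent one.
y_true = [0] * 99 + [1]
y_pred = [0] * 100  # a model that always predicts "not fraudulent"

acc = accuracy(y_true, y_pred)
prec, rec = precision_recall(y_true, y_pred)
print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f}")

# Placeholder feature rows (just indices here) to show the resampling step.
rows = list(range(100))
bal_rows, bal_labels = oversample_minority(rows, y_true)
print(bal_labels.count(0), bal_labels.count(1))
```

The always-negative model scores high on accuracy but has zero recall on the fraud class, which is the failure the first point describes. Resampling should normally be applied to the training split only, so that evaluation still reflects the real class balance. How misclassification costs are set depends on the library in use; in scikit-learn, for instance, many classifiers accept a `class_weight` argument.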