F1 scores paint a better picture than accuracy alone because they take both precision and recall into account.
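To see why, here is a minimal sketch on made-up numbers. The confusion-matrix counts below are purely hypothetical (1,000 transactions, 10 of them fraud), and the helper names `accuracy` and `f1_score` are written just for this illustration:

```python
def accuracy(tp, fp, fn, tn):
    """Share of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, written in terms of counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical model: catches only 1 of the 10 fraud cases, raises no false alarms.
tp, fp, fn, tn = 1, 0, 9, 990

acc = accuracy(tp, fp, fn, tn)
f1 = f1_score(tp, fp, fn)

print(f"accuracy = {acc:.3f}")  # looks great even though 9 of 10 frauds are missed
print(f"F1       = {f1:.3f}")   # low, because recall is only 1 out of 10
```

Accuracy is dominated by the 990 easy non-fraud cases, while F1 drops because recall is poor.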
However, F1 scores come with a big DISCLAIMER which you should know about. Let’s take a look:
Consider a binary classification problem with classes A and B. Predicting class A as B may be far more critical and costly than predicting class B as A.
For example, predicting a fraudulent transaction as non-fraudulent will cost the bank far more money than predicting a non-fraudulent transaction as fraudulent.
In such a case, you care “more” about how well the model catches fraudulent transactions than about how well it classifies non-fraudulent ones.
This difference in weight, or importance, is what the F1 score doesn’t take into account: it treats false positives and false negatives the same. So, simply optimizing for a high F1 score can be misleading and paint an incomplete picture.
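Here is a small sketch of the problem. Everything in it is assumed for illustration: the two models, their confusion-matrix counts, and the cost figures (a missed fraud costing 10 times as much as a false alarm) are made up, not taken from any real bank:

```python
def f1_score(tp, fp, fn):
    """F1 in terms of counts: 2*TP / (2*TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def total_cost(fp, fn, fp_cost, fn_cost):
    """Total cost of the errors, given a per-error cost for each type."""
    return fp * fp_cost + fn * fn_cost


# Assumed costs (hypothetical): missing a fraud hurts 10x more than a false alarm.
FN_COST = 10
FP_COST = 1

# Two hypothetical models; "positive" means fraud.
# They make the same number of errors, just of opposite kinds.
model_a = {"tp": 80, "fp": 20, "fn": 10}  # more false alarms, fewer missed frauds
model_b = {"tp": 80, "fp": 10, "fn": 20}  # fewer false alarms, more missed frauds

f1_a = f1_score(**model_a)
f1_b = f1_score(**model_b)
cost_a = total_cost(model_a["fp"], model_a["fn"], FP_COST, FN_COST)
cost_b = total_cost(model_b["fp"], model_b["fn"], FP_COST, FN_COST)

print(f"Model A: F1 = {f1_a:.4f}, cost = {cost_a}")
print(f"Model B: F1 = {f1_b:.4f}, cost = {cost_b}")
```

Both models get exactly the same F1 score, because FP and FN enter the formula symmetrically. Under the assumed costs, though, Model B is noticeably more expensive, and F1 alone gives no hint of that.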