Advanced topic: Precision and Recall vs. F-score

Usually, precision and recall scores are given together and are not quoted individually. This is because it is easy to vary the sensitivity of a model to improve precision at the expense of recall, or vice versa.
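The trade-off can be seen in a minimal sketch. It assumes a model that outputs a score per item and predicts "positive" when the score reaches a threshold. The labels, scores, and the helper name `precision_recall` are invented for illustration and do not come from the search engine example.

```python
def precision_recall(labels, scores, threshold):
    """Return (precision, recall) when items scoring >= threshold are predicted positive."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for y, p in zip(labels, predicted) if y == 1 and p)      # true positives
    fp = sum(1 for y, p in zip(labels, predicted) if y == 0 and p)      # false positives
    fn = sum(1 for y, p in zip(labels, predicted) if y == 1 and not p)  # false negatives
    # Convention assumed here: report 0.0 when a denominator would be zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical ground-truth labels (1 = relevant) and model scores.
labels = [1, 1, 0, 1, 0, 1, 0, 0]
scores = [0.95, 0.85, 0.70, 0.65, 0.40, 0.35, 0.20, 0.10]

for threshold in (0.8, 0.5, 0.3):
    p, r = precision_recall(labels, scores, threshold)
    print(f"threshold={threshold:.1f}  precision={p:.2f}  recall={r:.2f}")
```

On this toy data, a strict threshold of 0.8 gives perfect precision but only half the recall, while a lenient threshold of 0.3 gives perfect recall at a precision of about 0.67. Quoting either number alone would hide the cost paid on the other.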

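If a single number is required to describe the performance of a model, the most convenient figure is the F-score, which is the harmonic mean of precision and recall:

$$F = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$$

This allows us to combine the precision and recall into a single number.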

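If we consider either precision or recall to be more important than the other, then we can use the Fβ score, which is a weighted harmonic mean of precision and recall. This is useful, for example, in the case of a medical test, where a false negative may be extremely costly compared to a false positive. Choosing β > 1 gives more weight to recall, choosing β < 1 gives more weight to precision, and β = 1 recovers the ordinary F-score. The Fβ score formula is more complex:

$$F_\beta = (1 + \beta^2) \cdot \frac{\text{precision} \cdot \text{recall}}{\beta^2 \cdot \text{precision} + \text{recall}}$$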
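As a rough sketch, the weighted score can be computed in a few lines of Python. The code assumes the standard definition Fβ = (1 + β²) · precision · recall / (β² · precision + recall). The function name `f_beta` and the choice to return 0.0 when the denominator is zero are illustrative assumptions, not part of the original text.

```python
def f_beta(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (standard F-beta definition)."""
    if beta < 0:
        raise ValueError("beta must be non-negative")
    b2 = beta ** 2
    denominator = b2 * precision + recall
    if denominator == 0:
        # Assumed convention: the score is 0.0 when it would otherwise be undefined.
        return 0.0
    return (1 + b2) * precision * recall / denominator


# Example values: precision 0.75, recall 0.43.
for beta in (0.5, 1.0, 2.0):
    print(f"beta={beta}: F = {f_beta(0.75, 0.43, beta):.3f}")
```

With a precision of 0.75 and a recall of 0.43, β = 2 pulls the score down toward the weaker recall (about 0.47), while β = 0.5 pulls it up toward the stronger precision (about 0.65). A medical screening test, where missed cases are costly, would typically be judged with β > 1.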

Calculating Precision and Recall vs. F-score

For the above example of the search engine, we obtained a precision of 0.75 and a recall of 0.43.

Imagine that we consider precision and recall to be of equal importance for our purposes. In this case, we will use the F-score to summarize precision and recall together.

Putting the figures for the precision and recall into the formula for the F-score, we obtain:

$$F = 2 \cdot \frac{0.75 \cdot 0.43}{0.75 + 0.43} = \frac{0.645}{1.18} \approx 0.55$$

Note that the F-score of 0.55 lies between the recall and precision values (0.43 and 0.75). This illustrates how the F-score can be a convenient way of averaging the precision and recall in order to condense them into a single number.
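The arithmetic can be checked with a short sketch using the harmonic mean from Python's standard library. The variable names are illustrative. The arithmetic mean is shown only for comparison.

```python
from statistics import harmonic_mean, mean

precision = 0.75
recall = 0.43

# The F-score is the harmonic mean of precision and recall.
f_score = harmonic_mean([precision, recall])
arithmetic = mean([precision, recall])

print(f"F-score (harmonic mean): {f_score:.2f}")
print(f"Arithmetic mean:         {arithmetic:.2f}")
```

The harmonic mean (about 0.55) sits closer to the lower of the two values than the arithmetic mean (0.59) does, so a model cannot earn a high F-score by excelling at only one of precision or recall.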