Indicators for classification models
Notations : TP (True Positives), TN (True Negatives), FP (False Positives) and FN (False Negatives). The sketches given after the list below illustrate how the indicators can be computed from these four counts.
- Accuracy : The accuracy is the ratio (TP+TN)/(TP+TN+FP+FN). The closer it is to 1, the better the test.
- Precision : Precision is the ratio TP/(TP+FP). It corresponds to the proportion of positive predictions that are actually correct. In other words, for a model with a precision of 0.8, 80% of the cases predicted as positive truly belong to the positive class.
- Balanced accuracy (binary case only) : Balanced accuracy is an indicator used to evaluate the quality of a binary classifier. It is especially useful when the classes are unbalanced, i.e. one of the two classes appears much more often than the other. It is calculated as follows: (Sensitivity + Specificity) / 2.
- False Positive Rate (binary case only) : Proportion of negative cases that the test detects as positive (FPR = 1-Specificity).
- False Negative Rate (binary case only) : Proportion of positive cases that the test detects as negative (FNR = 1-Sensitivity).
- Correct classification : number of well-classified observations.
- Misclassification : number of misclassified observations.
- Prevalence : Relative frequency of the event of interest in the total sample, (TP+FN)/N.
- F-measure : The F-measure, also called F-score or F1-score, is the harmonic mean of precision and recall (sensitivity). Its value is between 0 and 1. It is defined by : F-measure = 2 * (Precision * Sensitivity) / (Precision + Sensitivity).
- NER (null error rate) : It corresponds to the error rate that would be observed if the model always predicted the majority class.
- Cohen's Kappa : It is useful when we want to study the relationship between the response variable and the predictions. Kappa ranges from -1 to 1: a value of 1 means that there is a total link between the two variables (perfect classification), while a value close to 0 means an agreement no better than chance.
- Cramer’s V : Cramer’s V measures the strength of the association between the two variables studied. The closer V is to zero, the less dependent the variables studied are.
- MCC (Matthews correlation coefficient) : The Matthews correlation coefficient (MCC), or phi coefficient, is used in machine learning as a measure of the quality of binary (two-class) classifications. It was introduced by the biochemist Brian W. Matthews in 1975 and is defined identically to Pearson’s phi coefficient: MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). A computation sketch is given after this list.
- ROC curve : The ROC curve (Receiver Operating Characteristic) displays the performance of a model and enables a comparison to be made with other models. The terms used come from signal detection theory. The ROC curve is the curve of the points (1-specificity, sensitivity) obtained for all possible decision thresholds.
- AUC : The area under the curve (AUC) is a synthetic index computed from the ROC curve. It can be interpreted as the probability that the model gives a higher score to a randomly chosen positive case than to a randomly chosen negative case (see the ROC/AUC sketch after this list).
- Lift curve : The Lift curve represents the lift value as a function of the percentage of the population targeted. The lift is the ratio between the proportion of positives detected (the sensitivity) and the proportion of the population predicted positive; equivalently, it is the precision divided by the prevalence.
- Cumulative gain curve : The gain curve represents the sensitivity, or recall, as a function of the percentage of the total population targeted. It shows which portion of the data concentrates the maximum number of positive events (see the lift/gain sketch after this list).
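As a minimal illustration of the count-based indicators above, the Python sketch below assumes that the four counts TP, TN, FP and FN are available as integers. The function name basic_indicators and the example values are purely illustrative, and no guard against empty classes (division by zero) is included.

```python
def basic_indicators(tp, tn, fp, fn):
    """Count-based indicators derived from TP, TN, FP and FN (illustrative sketch)."""
    n = tp + tn + fp + fn
    sensitivity = tp / (tp + fn)            # recall / true positive rate
    specificity = tn / (tn + fp)            # true negative rate
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / n,
        "precision": precision,
        "balanced accuracy": (sensitivity + specificity) / 2,
        "false positive rate": 1 - specificity,
        "false negative rate": 1 - sensitivity,
        "correct classification": tp + tn,
        "misclassification": fp + fn,
        "prevalence": (tp + fn) / n,
        "F-measure": 2 * precision * sensitivity / (precision + sensitivity),
        # error rate obtained if the model always predicted the majority class
        "null error rate": min(tp + fn, tn + fp) / n,
    }

# Hypothetical confusion matrix counts, for illustration only.
print(basic_indicators(tp=40, tn=45, fp=5, fn=10))
```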
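For the association indicators (Cohen's Kappa, MCC and Cramer's V), a sketch under the same assumptions (binary case, counts available, no degenerate classes) could look as follows; association_indicators is an illustrative name, not a function from any particular library.

```python
import math

def association_indicators(tp, tn, fp, fn):
    """Cohen's Kappa, MCC and Cramer's V for a binary confusion matrix (illustrative sketch)."""
    n = tp + tn + fp + fn
    # Cohen's Kappa: observed agreement corrected for chance agreement.
    p_observed = (tp + tn) / n
    p_chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (p_observed - p_chance) / (1 - p_chance)
    # Matthews correlation coefficient (Pearson's phi on the 2x2 table).
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    # For a 2x2 table, Cramer's V reduces to the absolute value of phi.
    cramers_v = abs(mcc)
    return {"kappa": kappa, "MCC": mcc, "Cramer's V": cramers_v}
```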
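The ROC curve and the AUC need the individual predicted scores rather than just the four counts. The sketch below builds the (1-specificity, sensitivity) points by sweeping the decision threshold and integrates them with the trapezoidal rule; scores and labels (1 = positive, 0 = negative) are assumed inputs, and this is a didactic sketch rather than an optimised implementation.

```python
def roc_points(scores, labels):
    """ROC points (1 - specificity, sensitivity) for every possible score threshold."""
    pos = sum(labels)                  # number of positive cases
    neg = len(labels) - pos            # number of negative cases
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):   # predict "positive" when score >= t
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / neg, tp / pos))
    return points + [(1.0, 1.0)]

def auc(points):
    """Area under the ROC curve, computed with the trapezoidal rule."""
    pts = sorted(points)
    return sum((x2 - x1) * (y1 + y2) / 2 for (x1, y1), (x2, y2) in zip(pts, pts[1:]))

# Hypothetical scores and true labels, for illustration only.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]
print(auc(roc_points(scores, labels)))
```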
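Finally, a sketch of the cumulative gain and lift values computed at every depth of the score-ranked population; again, scores, labels and the function name are assumptions made for the example, not part of any particular library.

```python
def lift_and_gain(scores, labels):
    """Cumulative gain and lift as a function of the fraction of the population targeted."""
    n = len(labels)
    total_pos = sum(labels)
    prevalence = total_pos / n
    # Target the population in decreasing order of predicted score.
    ranked = [y for _, y in sorted(zip(scores, labels), key=lambda pair: -pair[0])]
    gains, lifts = [], []
    cum_pos = 0
    for i, y in enumerate(ranked, start=1):
        cum_pos += y
        fraction = i / n                                       # % of the population targeted so far
        gains.append((fraction, cum_pos / total_pos))          # sensitivity at this depth
        lifts.append((fraction, (cum_pos / i) / prevalence))   # precision / prevalence
    return gains, lifts
```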