perfInd.Rd
Compute several indicators describing the performance of a binary classifier, e.g. sensitivity, specificity, an estimate of the area under the receiver operating characteristic (ROC) curve, the Gini coefficient, etc. (see the return value).
perfInd(x, y = NULL, negativeFirst = TRUE, debug = FALSE)
Argument | Description
---|---
x | a 2x2 classification table having predictions in rows and ground truth in columns (negative cases come first, by default; see the negativeFirst argument), or a vector of predicted classifications
y | if x is a vector of predicted classifications, y must be a vector of the corresponding true classifications
negativeFirst | if TRUE, negative cases come first in both rows and columns of x
debug | if TRUE, debugging messages are printed; a numeric value greater than 1 produces verbose debugging output
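The two calling conventions can be illustrated independently of R. When x and y are vectors, perfInd first tabulates them into a 2x2 table with predictions in rows and ground truth in columns, negative cases first (when negativeFirst is TRUE). A minimal Python sketch of that tabulation step — build_table is a hypothetical helper for illustration, not part of the package:

```python
# Build a 2x2 classification table from predicted and true labels,
# with the negative class (0) first in both rows and columns,
# mirroring the negativeFirst = TRUE convention described above.
def build_table(pred, truth, negative_first=True):
    classes = sorted(set(pred) | set(truth), reverse=not negative_first)
    return [[sum(1 for p, t in zip(pred, truth) if p == r and t == c)
             for c in classes]
            for r in classes]

# Same data as the vector example in the Examples section:
# rows are predictions, columns are ground truth.
table = build_table([0, 0, 1, 1, 1], [0, 0, 0, 1, 1])
print(table)  # [[2, 0], [1, 2]]
```

The resulting cells read, row by row: true negatives, false negatives, false positives, true positives.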
a list containing the following named entries:
table | the classification table
tp | true positives
tn | true negatives
fn | false negatives
fp | false positives
sensitivity | sensitivity (true positive rate), TP/(TP+FN)
specificity | specificity (true negative rate), TN/(TN+FP)
npv | negative predicted value, i.e. true negatives over true negatives plus false negatives
ppv | positive predicted value, i.e. true positives over true positives plus false positives
wnpv | weighted negative predicted value (an obscure measure)
wppv | weighted positive predicted value
fpr | false positive rate
fnr | false negative rate
fdr | false discovery rate
accuracy | accuracy
f1 | F1 score (2*TP/(2*TP+FN+FP))
f2 | F2 score (5*TP/(5*TP+4*FN+FP))
f05 | F0.5 score (1.25*TP/(1.25*TP+0.25*FN+FP))
correspondence | correspondence score (TP/(TP+FN+FP))
mcc | Matthews correlation coefficient
informedness | informedness (sensitivity + specificity - 1)
markedness | markedness (PPV + NPV - 1)
auc | AUC (area under the ROC curve estimated by interpolating the (0,0), (1-specificity, sensitivity) and (1,1) points in the ROC space)
gini | the Gini index (2*AUC - 1)
n | the number of observations classified
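The indicators above follow directly from the four cells of the table. A minimal Python sketch of the documented formulas, using the cell counts from the first example in the Examples section — this is an illustration under the stated definitions, not the package's implementation. The AUC follows the three-point interpolation described above: the area under the piecewise-linear curve through (0,0), (1-specificity, sensitivity) and (1,1) is the sum of two trapezoids.

```python
# Compute a few of the listed indicators from the four table cells.
# Cell counts taken from the Wikipedia example in the Examples section.
tp, tn, fn, fp = 1820, 20, 180, 10

sensitivity = tp / (tp + fn)                # true positive rate
specificity = tn / (tn + fp)                # true negative rate
npv = tn / (tn + fn)                        # negative predicted value
ppv = tp / (tp + fp)                        # positive predicted value
accuracy = (tp + tn) / (tp + tn + fp + fn)
f1 = 2 * tp / (2 * tp + fn + fp)
correspondence = tp / (tp + fn + fp)

# AUC by interpolating (0,0), (1 - specificity, sensitivity), (1,1):
# two trapezoids under the piecewise-linear ROC curve.
fpr = 1 - specificity
auc = fpr * sensitivity / 2 + (1 - fpr) * (1 + sensitivity) / 2
gini = 2 * auc - 1                          # equals informedness here

print(round(sensitivity, 2))   # 0.91
print(round(npv, 1))           # 0.1
print(round(auc, 7))           # 0.7883333
print(round(gini, 7))          # 0.5766667
```

These values match the $sensitivity, $npv, $auc and $gini entries printed in the first example.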
Fawcett, Tom (2006). _An Introduction to ROC Analysis._ Pattern Recognition Letters 27 (8): 861-874. doi:10.1016/j.patrec.2005.10.010.

Powers, David M. W. (2011). _Evaluation: From Precision, Recall and F-Measure to ROC, Informedness, Markedness & Correlation._ Journal of Machine Learning Technologies 2 (1): 37-63. ISSN 2229-3981 & ISSN 2229-399X. Available online at http://www.bioinfo.in/contents.php?id=51
performance, which gives many more measures
# example from https://en.wikipedia.org/w/index.php?title=Sensitivity_and_specificity&oldid=680316530
x <- matrix(c(20, 10, 180, 1820), 2)
print(x)
#>      [,1] [,2]
#> [1,]   20  180
#> [2,]   10 1820
perfInd(x)
#> $table
#>      A    B
#> A   20  180
#> B   10 1820
#>
#> $tp
#> [1] 1820
#>
#> $tn
#> [1] 20
#>
#> $fn
#> [1] 180
#>
#> $fp
#> [1] 10
#>
#> $sensitivity
#> [1] 0.91
#>
#> $specificity
#> [1] 0.6666667
#>
#> $npv
#> [1] 0.1
#>
#> $ppv
#> [1] 0.9945355
#>
#> $wnpv
#> [1] 0.001477833
#>
#> $wppv
#> [1] 0.9798379
#>
#> $fpr
#> [1] 0.3333333
#>
#> $fnr
#> [1] 0.09
#>
#> $fdr
#> [1] 0.005464481
#>
#> $accuracy
#> [1] 0.9064039
#>
#> $f1
#> [1] 0.9503916
#>
#> $f2
#> [1] 0.9257375
#>
#> $f05
#> [1] 0.9763948
#>
#> $correspondence
#> [1] 0.9054726
#>
#> $mcc
#> [1] 0.2334855
#>
#> $informedness
#> [1] 0.5766667
#>
#> $markedness
#> [1] 0.09453552
#>
#> $auc
#> [1] 0.7883333
#>
#> $gini
#> [1] 0.5766667
#>
#> $n
#> [1] 2030

# compute performance over a vector of predicted and true classifications:
print(perfInd(c(0, 0, 1, 1, 1), c(0, 0, 0, 1, 1)))
#> $table
#>    y
#> x   0 1
#>   0 2 0
#>   1 1 2
#>
#> $tp
#> [1] 2
#>
#> $tn
#> [1] 2
#>
#> $fn
#> [1] 0
#>
#> $fp
#> [1] 1
#>
#> $sensitivity
#> [1] 1
#>
#> $specificity
#> [1] 0.6666667
#>
#> $npv
#> [1] 1
#>
#> $ppv
#> [1] 0.6666667
#>
#> $wnpv
#> [1] 0.6
#>
#> $wppv
#> [1] 0.2666667
#>
#> $fpr
#> [1] 0.3333333
#>
#> $fnr
#> [1] 0
#>
#> $fdr
#> [1] 0.3333333
#>
#> $accuracy
#> [1] 0.8
#>
#> $f1
#> [1] 0.8
#>
#> $f2
#> [1] 0.9090909
#>
#> $f05
#> [1] 0.7142857
#>
#> $correspondence
#> [1] 0.6666667
#>
#> $mcc
#> [1] 0.6666667
#>
#> $informedness
#> [1] 0.6666667
#>
#> $markedness
#> [1] 0.6666667
#>
#> $auc
#> [1] 0.8333333
#>
#> $gini
#> [1] 0.6666667
#>
#> $n
#> [1] 5

# compare several measures over several classification results:
tbls <- list(matrix(c(98, 2, 2, 8), 2),
             matrix(c(8, 2, 2, 8), 2),
             matrix(c(80, 20, 2, 8), 2))
for (i in 1:length(tbls)) {
  m <- tbls[[i]]
  .pn(m)
  pi <- perfInd(m)
  with(pi, catnl('acc: ', accuracy,
                 ', sp+se: ', sensitivity + specificity,
                 ', correspondence: ', correspondence,
                 ', F1: ', f1,
                 ', F2: ', f2, sep = ''))
}
#> m
#>      [,1] [,2]
#> [1,]   98    2
#> [2,]    2    8
#> acc: 0.9636364, sp+se: 1.78, correspondence: 0.6666667, F1: 0.8, F2: 0.8
#> m
#>      [,1] [,2]
#> [1,]    8    2
#> [2,]    2    8
#> acc: 0.8, sp+se: 1.6, correspondence: 0.6666667, F1: 0.8, F2: 0.8
#> m
#>      [,1] [,2]
#> [1,]   80    2
#> [2,]   20    8
#> acc: 0.8, sp+se: 1.6, correspondence: 0.2666667, F1: 0.4210526, F2: 0.5882353

# make a degenerated table a proper 2x2 table
i1 <- c(1, 1, 1, 1)
i2 <- c(1, 1, 2, 2)
m <- table(i1 == 1, i2 == 1)
print(m)  # this is a 1x2 table
#>
#>        FALSE TRUE
#>   TRUE     2    2
perfInd(m)$table  # this is a 2x2 table
#>
#>         FALSE TRUE
#>   FALSE     0    0
#>   TRUE      2    2
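The last example shows perfInd completing a degenerated 1x2 table (all observations were predicted TRUE, so the FALSE row is absent) into a full 2x2 table by filling the missing row with zeros. A hypothetical Python sketch of that completion step — make_2x2 and the nested-dict representation are assumptions for illustration, not the package's code:

```python
# Pad a degenerated classification table to a full 2x2 table:
# any level missing from the rows or columns gets a zero-filled
# entry, so downstream cell extraction never fails.
def make_2x2(table, levels=("FALSE", "TRUE")):
    return {r: {c: table.get(r, {}).get(c, 0) for c in levels}
            for r in levels}

# Mirrors the R example: all four observations were predicted TRUE,
# so the FALSE row is absent from the raw table.
raw = {"TRUE": {"FALSE": 2, "TRUE": 2}}
full = make_2x2(raw)
print(full)
# {'FALSE': {'FALSE': 0, 'TRUE': 0}, 'TRUE': {'FALSE': 2, 'TRUE': 2}}
```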