Text document
How Random is a Classifier given its Area under Curve?
Date
2017
Publisher
Gesellschaft für Informatik, Bonn
Abstract
When the performance of a classifier is empirically evaluated, the Area Under Curve (AUC) is commonly used as a one-dimensional performance measure. In general, the focus is on good performance (AUC towards 1). In this paper, we study the other side of the performance spectrum (AUC towards 0.50), as we are interested in the extent to which a classifier is random given its AUC. We present the exact probability distribution of the AUC of a truly random classifier, given a finite number of distinct genuine and imposter scores. This distribution quantifies the “randomness” of the measured AUC. It involves the restricted partition function, a well-studied function in number theory. Although other work exists that considers confidence bounds on the AUC, the novelty is that we do not assume any underlying parametric or non-parametric model, nor do we specify an error rate. Moreover, when only a limited number of scores is available, for example in forensic casework, the exact distribution can deviate from these models. For completeness, we also present an approximation using a normal distribution, together with confidence bounds on the AUC.
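The link between a random classifier's AUC and restricted partitions can be illustrated with a short sketch. It is not taken from the paper; it relies on the standard counting argument behind the Mann-Whitney U statistic, under the assumption that all orderings of m distinct genuine and n distinct imposter scores are equally likely. The function names (`count_orderings`, `auc_distribution`, `normal_approx_variance`) are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_orderings(u, m, n):
    """Count orderings of m genuine and n imposter scores in which exactly
    u (genuine, imposter) pairs have the genuine score higher.

    This equals the number of partitions of u into at most m parts,
    each of size at most n (a restricted partition count).
    """
    if u < 0 or u > m * n:
        return 0
    if m == 0 or n == 0:
        return 1 if u == 0 else 0
    # Case 1: the largest score is genuine, so it beats all n imposters.
    # Case 2: the largest score is an imposter and adds no winning pairs.
    return count_orderings(u - n, m - 1, n) + count_orderings(u, m, n - 1)


def auc_distribution(m, n):
    """Exact distribution of AUC = u / (m * n) for a random classifier.

    Assumption: every one of the C(m + n, m) orderings is equally likely.
    """
    total = comb(m + n, m)
    return {
        Fraction(u, m * n): Fraction(count_orderings(u, m, n), total)
        for u in range(m * n + 1)
    }


def normal_approx_variance(m, n):
    """Variance of the AUC under the same assumption, as used in the
    usual normal approximation (the mean is 1/2)."""
    return Fraction(m + n + 1, 12 * m * n)


if __name__ == "__main__":
    dist = auc_distribution(3, 3)
    # Probability that a random classifier reaches a perfect AUC of 1
    # with only 3 genuine and 3 imposter scores.
    print(dist[Fraction(1)])
```

With few scores, extreme AUC values are not rare for a random classifier, which is why the exact distribution matters in small-sample settings such as the forensic casework mentioned in the abstract.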