Recommendations to Handle Health-related Small Imbalanced Data in Machine Learning
Mensch und Computer 2020 - Workshopband
MCI-WS02: UCAI 2020: Workshop on User-Centered Artificial Intelligence
Gesellschaft für Informatik e.V.
When discussing interpretable machine learning results, researchers need to compare results and reflect on how reliable they are, especially for health-related data. The reason is that a wrong result can harm a person, for example through a missed early screening of dyslexia or a wrong cancer prediction. We present nine criteria that help to avoid over-fitting and biased interpretation of results when working with small, imbalanced health-related data. Using early screening of dyslexia with an imbalanced data set as a use case, we apply machine learning classification to explain design decisions and discuss issues for further research.
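The abstract does not list the nine criteria, so the following is only a minimal sketch of one well-known way that imbalanced data can lead to a biased interpretation, not the authors' method. It assumes a hypothetical screening data set of 100 people of whom 10 belong to the positive class, and a trivial classifier that always predicts the majority class. All names and numbers are illustrative assumptions.

```python
def accuracy(y_true, y_pred):
    """Fraction of all samples that were predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, label):
    """Fraction of samples whose true class is `label` that were predicted as `label`."""
    predictions_for_label = [p for t, p in zip(y_true, y_pred) if t == label]
    return sum(p == label for p in predictions_for_label) / len(predictions_for_label)


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls, so every class counts equally."""
    labels = sorted(set(y_true))
    return sum(recall(y_true, y_pred, label) for label in labels) / len(labels)


# Hypothetical imbalanced data: 90 negatives (0) and 10 positives (1).
y_true = [0] * 90 + [1] * 10
# A trivial classifier that always predicts the majority class.
y_pred = [0] * 100

acc = accuracy(y_true, y_pred)
pos_recall = recall(y_true, y_pred, 1)
bal_acc = balanced_accuracy(y_true, y_pred)

print(f"accuracy:          {acc:.2f}")
print(f"positive recall:   {pos_recall:.2f}")
print(f"balanced accuracy: {bal_acc:.2f}")
```

In this sketch the accuracy looks high although not a single positive case is detected, which is why per-class metrics are usually reported alongside accuracy when classes are imbalanced.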