Recommendations to Handle Health-related Small Imbalanced Data in Machine Learning

When discussing interpretable machine learning results, researchers need to compare results and reflect on how reliable they are, especially for health-related data. The reason is the negative impact that wrong results can have on a person, such as a missed early screening of dyslexia or a wrong prediction of cancer. We present nine criteria that help avoid over-fitting and biased interpretation of results when working with small, imbalanced health-related data. We present a use case of early screening of dyslexia with an imbalanced data set, using machine learning classification, to explain design decisions and discuss issues for further research.

Rauschenberger, Maria; Baeza-Yates, Ricardo (2020): Recommendations to Handle Health-related Small Imbalanced Data in Machine Learning. Mensch und Computer 2020 - Workshopband. DOI: 10.18420/muc2020-ws111-333. Bonn: Gesellschaft für Informatik e.V. MCI-WS02: UCAI 2020: Workshop on User-Centered Artificial Intelligence. Magdeburg. 6-9 September 2020

Keywords

Machine Learning, Human-Centered Design, HCD, interactive systems, health, small data, imbalanced data, over-fitting, variances, interpretable results, guidelines

DOI

10.18420/muc2020-ws111-333

Collections

Workshopband MuC 2020
