How can Small Data Sets be Clustered?

dc.contributor.author: Weigand, Anna Christina
dc.contributor.author: Lange, Daniel
dc.contributor.author: Rauschenberger, Maria
dc.contributor.editor: Wienrich, Carolin
dc.contributor.editor: Wintersberger, Philipp
dc.contributor.editor: Weyers, Benjamin
dc.date.accessioned: 2021-09-05T18:56:36Z
dc.date.available: 2021-09-05T18:56:36Z
dc.date.issued: 2021
dc.description.abstract: In many areas, such as Human-Centered Design research, only small data sets are available and big data does not play a significant role. In machine learning analysis, results from small data sets can be biased by single variables or missing values. Nevertheless, reliable and interpretable results are essential for determining further actions, such as treatments in a health-related use case. In this paper, we explore machine learning clustering algorithms on the basis of a small, health-related (variance) data set about early dyslexia screening. To this end, we selected three clustering algorithms from different clustering methods: K-Means, HAC and DBSCAN. In our case, K-Means and HAC showed promising results, while DBSCAN did not deliver distinct results. Based on our experiences, we provide initial proposals on how to handle small data set clustering and describe situations in which using Human-Centered Design methods can increase the interpretability of machine learning clustering results. Our work represents a starting point for discussing the topic of clustering small data sets. (en)
dc.identifier.doi: 10.18420/muc2021-mci-ws02-284
dc.identifier.uri: https://dl.gi.de/handle/20.500.12116/37373
dc.language.iso: en
dc.publisher: Gesellschaft für Informatik e.V.
dc.relation.ispartof: Mensch und Computer 2021 - Workshopband
dc.relation.ispartofseries: Mensch und Computer
dc.subject: machine learning
dc.subject: human-centered design
dc.subject: interactive systems
dc.subject: health
dc.subject: small data
dc.subject: imbalanced data
dc.subject: variances
dc.subject: interpretable results
dc.subject: guidelines
dc.subject: data set
dc.subject: clustering
dc.title: How can Small Data Sets be Clustered? (en)
dc.type: Text/Workshop Paper
gi.citation.publisherPlace: Bonn
gi.conference.date: 5-8 September 2021
gi.conference.location: Ingolstadt
gi.conference.sessiontitle: MCI-WS02: UCAI 2021: Workshop on User-Centered Artificial Intelligence
gi.document.quality: digidoc

Files

Original bundle
Name: Contribution_284__a.pdf
Size: 553.37 KB
Format: Adobe Portable Document Format