How can Small Data Sets be Clustered?
dc.contributor.author | Weigand, Anna Christina | |
dc.contributor.author | Lange, Daniel | |
dc.contributor.author | Rauschenberger, Maria | |
dc.contributor.editor | Wienrich, Carolin | |
dc.contributor.editor | Wintersberger, Philipp | |
dc.contributor.editor | Weyers, Benjamin | |
dc.date.accessioned | 2021-09-05T18:56:36Z | |
dc.date.available | 2021-09-05T18:56:36Z | |
dc.date.issued | 2021 | |
dc.description.abstract | In many areas, only small data sets are available and big data does not play a significant role, e.g., in Human-Centered Design research. In machine learning analysis, results from small data sets can be biased by single variables or missing values. Nevertheless, reliable and interpretable results are essential for determining further actions, such as treatments in a health-related use case. In this paper, we explore machine learning clustering algorithms on the basis of a small, health-related (variance) data set about early dyslexia screening. To this end, we selected three clustering algorithms from different clustering methods: K-Means, HAC and DBSCAN. In our case, K-Means and HAC showed promising results, while DBSCAN did not deliver distinct results. Based on our experiences, we provide first proposals on how to handle small data set clustering and describe situations in which using Human-Centered Design methods can increase interpretability of machine learning clustering results. Our work represents a starting point for discussing the topic of clustering small data sets. | en |
dc.identifier.doi | 10.18420/muc2021-mci-ws02-284 | |
dc.identifier.uri | https://dl.gi.de/handle/20.500.12116/37373 | |
dc.language.iso | en | |
dc.publisher | Gesellschaft für Informatik e.V. | |
dc.relation.ispartof | Mensch und Computer 2021 - Workshopband | |
dc.relation.ispartofseries | Mensch und Computer | |
dc.subject | machine learning | |
dc.subject | human-centered design | |
dc.subject | interactive systems | |
dc.subject | health | |
dc.subject | small data | |
dc.subject | imbalanced data | |
dc.subject | variances | |
dc.subject | interpretable results | |
dc.subject | guidelines | |
dc.subject | data set | |
dc.subject | clustering | |
dc.title | How can Small Data Sets be Clustered? | en |
dc.type | Text/Workshop Paper | |
gi.citation.publisherPlace | Bonn | |
gi.conference.date | 5-8 September 2021 | |
gi.conference.location | Ingolstadt | |
gi.conference.sessiontitle | MCI-WS02: UCAI 2021: Workshop on User-Centered Artificial Intelligence | |
gi.document.quality | digidoc |
Files
Original bundle
- Name: Contribution_284__a.pdf
- Size: 553.37 KB
- Format: Adobe Portable Document Format