How can Small Data Sets be Clustered?

Weigand, Anna Christina; Lange, Daniel; Rauschenberger, Maria

dc.contributor.author	Weigand, Anna Christina
dc.contributor.author	Lange, Daniel
dc.contributor.author	Rauschenberger, Maria
dc.contributor.editor	Wienrich, Carolin
dc.contributor.editor	Wintersberger, Philipp
dc.contributor.editor	Weyers, Benjamin
dc.date.accessioned	2021-09-05T18:56:36Z
dc.date.available	2021-09-05T18:56:36Z
dc.date.issued	2021
dc.description.abstract	In many areas, only small data sets are available and big data does not play a significant role, e.g., in Human-Centered Design research. In machine learning analyses, results from small data sets can be biased by single variables or missing values. Nevertheless, reliable and interpretable results are essential for determining further actions, such as treatments in a health-related use case. In this paper, we explore machine learning clustering algorithms on the basis of a small, health-related (variance) data set about early dyslexia screening. To this end, we selected three clustering algorithms from different clustering methods: K-Means, HAC, and DBSCAN. In our case, K-Means and HAC showed promising results, while DBSCAN did not deliver distinct results. Based on our experiences, we provide first proposals on how to handle small data set clustering and describe situations in which using Human-Centered Design methods can increase the interpretability of machine learning clustering results. Our work represents a starting point for discussing the topic of clustering small data sets.	en
dc.identifier.doi	10.18420/muc2021-mci-ws02-284
dc.identifier.uri	https://dl.gi.de/handle/20.500.12116/37373
dc.language.iso	en
dc.publisher	Gesellschaft für Informatik e.V.
dc.relation.ispartof	Mensch und Computer 2021 - Workshopband
dc.relation.ispartofseries	Mensch und Computer
dc.subject	machine learning
dc.subject	human-centered design
dc.subject	interactive systems
dc.subject	health
dc.subject	small data
dc.subject	imbalanced data
dc.subject	variances
dc.subject	interpretable results
dc.subject	guidelines
dc.subject	data set
dc.subject	clustering
dc.title	How can Small Data Sets be Clustered?	en
dc.type	Text/Workshop Paper
gi.citation.publisherPlace	Bonn
gi.conference.date	5.-8. September 2021
gi.conference.location	Ingolstadt
gi.conference.sessiontitle	MCI-WS02: UCAI 2021: Workshop on User-Centered Artificial Intelligence
gi.document.quality	digidoc

Files

Original bundle

Name: Contribution_284__a.pdf
Size: 553.37 KB
Format: Adobe Portable Document Format

Collections

Workshopband MuC 2021