Conference Paper
On the Subjectivity of Emotions in Software Projects: How Reliable are Pre-Labeled Data Sets for Sentiment Analysis? (Summary)
Document type
Text/Conference Paper
Date
2023
Publisher
Gesellschaft für Informatik e.V.
Abstract
Social aspects (e.g., the sentiment of developers) are important for software development. To analyze sentiments automatically, sentiment analysis tools use machine learning methods that require data sets labeled according to emotion or polarity. As these labeled data sets strongly influence the tools’ accuracy, we investigate whether the labels match developers’ perceptions. For this purpose, we conducted an international survey with 94 participants who labeled 100 statements. We compare the median perception, as well as each individual participant’s perception, with the labels. The results show that the median perception of all participants coincides with the predefined labels for 62.5% of the statements, and that the agreement between individual participants’ ratings and the labels is even lower. This summary refers to the paper titled “On the subjectivity of emotions in software projects: How reliable are pre-labeled data sets for sentiment analysis?” [He22b]. It was published in the peer-reviewed Journal of Systems and Software (JSS) in 2022.
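The comparison described in the abstract (median perception versus predefined label, and individual perception versus predefined label) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the polarity encoding (-1 negative, 0 neutral, 1 positive), the function names `median_agreement` and `participant_agreement`, the use of the low median to keep the result on the rating scale, and the toy data.

```python
from statistics import median_low

# Hypothetical encoding: -1 = negative, 0 = neutral, 1 = positive.


def median_agreement(ratings, labels):
    """Share of statements whose median rating equals the predefined label.

    ratings: dict mapping statement id -> list of ratings (one per participant)
    labels:  dict mapping statement id -> predefined label
    """
    # median_low always returns an actual rating, never an average of two.
    matches = sum(
        1 for stmt, label in labels.items()
        if median_low(ratings[stmt]) == label
    )
    return matches / len(labels)


def participant_agreement(ratings, labels, participant):
    """Share of statements where one participant's rating equals the label."""
    matches = sum(
        1 for stmt, label in labels.items()
        if ratings[stmt][participant] == label
    )
    return matches / len(labels)


# Toy data (invented): four statements rated by three participants.
labels = {"s1": 1, "s2": -1, "s3": 0, "s4": 1}
ratings = {
    "s1": [1, 1, 0],
    "s2": [0, 0, -1],
    "s3": [0, 1, 0],
    "s4": [1, 0, 1],
}

print(median_agreement(ratings, labels))          # 0.75
print(participant_agreement(ratings, labels, 1))  # 0.25
```

In this toy data the median matches the label for three of four statements, while the second participant matches for only one. This mirrors the kind of gap between median-level and individual-level agreement that the abstract reports, without reproducing the study's data or procedure.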