Beyer, StefanieMacho, ChristianDi Penta, MassimilianoPinzger, MartinKoziolek, AnneSchaefer, InaSeidl, Christoph2020-12-172020-12-172021978-3-88579-704-3https://dl.gi.de/handle/20.500.12116/34525This paper has been published in the Journal Empirical Software Engineering, 2020. Stack Overflow (SO) is among the most popular question and answers sites used by developers. Labeling posts with tags is one of the features to facilitate searching and browsing SO posts. However, existing tags mainly refer to technological aspects but not to the purpose of a question. In this paper, we argue that tagging posts with their purpose can facilitate developers to find the posts that provide an answer to their question. We first present a harmonization of existing taxonomies of question categories, that represent the purpose of a question, into seven categories. Next, we present two approaches to automate the classification of posts into the seven question categories, one using regular expressions and one using machine learning. Evaluating both approaches on an independent test set, we found that our regular expressions outperform machine learning. Applying the regular expressions on posts related to Android app development, showed that the categories API USAGE, CONCEPTUAL, and DISCREPANCY are most frequently assigned. By integrating our approach into SO, posts could be manually tagged with our categories which would allow developers to search posts by question category.enStack OverflowClassificationQuestion CategoriesProgram UnderstandingWhat Kind of Questions Do Developers Ask on Stack Overflow? A Comparison of Automated Approaches to Classify Posts Into Question CategoriesText/ConferencePaper10.18420/SE2021_031617-5468