Browsing by keyword "natural language processing"
1 - 9 of 9
- Text document: Citcom – Citation Recommendation (INFORMATIK 2020, 2021). Meyer, Melina; Frey, Jenny; Laub, Tamino; Wrzalik, Marco; Krechel, Dirk. Citation recommendation aims to predict references based on a given text. In this paper, we focus on predicting references from small passages instead of a whole document. Besides using a search engine as a baseline, we introduce two more advanced approaches based on neural networks. The first learns an alignment between a passage encoder and reference embeddings, using a feature engineering approach that includes a simple feed-forward network. The second model takes advantage of BERT, a state-of-the-art language representation model, to generate context-sensitive passage embeddings. Its predictions are based on inter-passage similarities between the given text and indexed sentences, each associated with a set of references. For training and evaluation of our models, we prepare a large dataset consisting of English papers from various scientific disciplines.
- Text document: A Conversational Agent for the Improvement of Human Reasoning Skills (Bildungsräume 2017, 2017). Wartschinski, Laura; Le, Nguyen-Thinh; Pinkwart, Niels. Human reasoning is the ability to make sound and goal-oriented decisions and is therefore highly relevant in daily life. However, its importance has not yet been addressed explicitly in education. In this paper, we propose to develop a computer-based conversational agent named Liza for improving human reasoning skills. Liza is able to hold conversations with humans to help them solve a small selection of well-studied reasoning problems. Such a conversational agent allows a much more scalable and inexpensive approach for teaching at least basic reasoning principles. The evaluation study shows that Liza improved the reasoning skills of the participants who conversed with the agent to solve reasoning problems, and that the group using Liza achieved much better learning effects than a control group studying with a non-interactive online course.
- Conference paper: A Corpus of Memes from Reddit: Acquisition, Preparation and First Case Studies (INFORMATIK 2023 - Designing Futures: Zukünfte gestalten, 2023). Schmidt, Thomas; Schiller, Fabian; Götz, Matthias; Wolff, Christian. We present a corpus of memes and their textual components acquired from the popular meme platform r/memes, a subreddit of Reddit and one of the major outlets of online meme culture. The corpus consists of the most popular memes on the platform from 2013-2021 and comprises 11,701 memes and 280,351 text tokens. We conduct several case studies focused on diachronic analysis to highlight the possibilities of the corpus for research in internet studies and online culture. We examine the general activity on the platform throughout the years and identify a significant increase in meme production beginning in 2017. Results of sentiment analysis show a tendency towards memes with positively classified texts. The analysis of the most frequent words per half-year spotlights the importance of certain cultural events for meme culture (e.g. the 2016 US election). Using the LIWC to analyze swear and sexual words shows an overall decrease in the usage of these words, pointing to increased moderation of the platform. The corpus is publicly available to the research community for further studies.
- Conference paper: Mitlernender Assistent für Senior*innen zur Erstellung von Beiträgen in einer Community Anwendung (Mensch und Computer 2020 - Tagungsband, 2020). Goram, Mandy. Participation in social life, as well as social exchange and togetherness, nowadays often take place on the Internet. The community app meinDorf55+ is intended to help senior citizens strengthen their social participation. It has been shown that senior citizens have difficulty creating posts. Short texts with little content and missing information in the posts make it harder for interested users to search for suitable posts. The question is how senior citizens can be given targeted support when creating posts for the app. The article describes an assistance system that learns alongside its users and is intended to support senior citizens in entering posts. Machine learning is used to analyze the content of posts and to point senior citizens specifically to important information still missing from the text, encouraging them to review their texts.
- Conference paper: Sharing and Exploiting Requirement Decisions (Softwaretechnik-Trends Band 40, Heft 1, 2020). Kleebaum, Anja; Johanssen, Jan Ole; Paech, Barbara; Bruegge, Bernd. Continuous software engineering is an agile development process that puts particular emphasis on the incremental implementation of requirements and their rapid validation through user feedback. This involves frequent and incremental decision making, which needs to be shared within the team. Requirements engineers and developers have to share their decision knowledge since the decisions made are particularly important for future requirements. It has long been a vision that important decision knowledge gets documented and shared. However, several reasons hinder requirements engineers and developers from doing this, for example, the intrusiveness and overhead of the documentation. With ConDec, we develop tool support for the continuous management of decision knowledge that uses natural language processing techniques and integrates into tools that developers often use, for example, the issue tracking system Jira. In this work, we focus on how ConDec enables requirements engineers and developers to share and exploit decision knowledge regarding requirements. We evaluate ConDec in student projects and develop techniques to teach decision knowledge management.
- Conference paper: A Simple NLP-based Approach to Support Onboarding and Retention in Open Source Communities (Software Engineering and Software Management 2019, 2019). Stanik, Christoph; Montgomery, Lloyd; Martens, Daniel; Fucci, Davide; Maalej, Walid. Successful open source communities are constantly looking for new members and helping them become active developers. A common approach for developer onboarding in open source projects is to let newcomers focus on relevant yet easy-to-solve issues to familiarize themselves with the code and the community. The goal of this research is twofold. First, we aim at automatically identifying issues that newcomers can resolve by analyzing the history of resolved issues, using only the title and description of issues. Second, we aim at automatically identifying issues that can be resolved by newcomers who later become active developers. We mined the issue trackers of three large open source projects and extracted natural language features from the title and description of resolved issues. In a series of experiments, we optimized and compared the accuracy of four supervised classifiers to address our research goals. Random Forest achieved up to 91% precision (F1-score 72%) towards the first goal, while for the second goal, Decision Tree achieved a precision of 92% (F1-score 91%). A qualitative evaluation gave insights on what information in the issue description is helpful for newcomers. Our approach can be used to automatically identify, label, and recommend issues for newcomers in open source software projects based only on the text of the issues.
- Conference paper: Specmate: Automated Creation of Test Cases from Acceptance Criteria (Software Engineering 2021, 2021). Fischbach, Jannik; Vogelsang, Andreas; Spies, Dominik; Wehrle, Andreas; Junker, Maximilian; Freudenstein, Dietmar. We summarize the paper "Specmate: Automated Creation of Test Cases from Acceptance Criteria", which was presented at the 2020 edition of the IEEE International Conference on Software Testing, Verification and Validation (ICST).
- Conference paper: What Does My Classifier Learn? A Visual Approach to Understanding Natural Language Text Classifiers (Software Engineering und Software Management 2018, 2018). Winkler, Jonas Paul; Vogelsang, Andreas. Neural networks have been utilized to solve various tasks such as image recognition, text classification, and machine translation and have achieved exceptional results in many of these tasks. However, understanding the inner workings of neural networks and explaining why a certain output is produced are not trivial tasks. Especially when dealing with text classification problems, an approach to explain network decisions may greatly increase the acceptance of neural network supported tools. In this paper, we present an approach to visualize reasons why a classification outcome is produced by convolutional neural networks by tracing back decisions made by the network. The approach is applied to various text classification problems, including our own requirements engineering related classification problem. We argue that by providing these explanations in neural network supported tools, users will use such tools with more confidence and may also allow the tool to do certain tasks automatically.
- Conference paper: What If We Encoded Words as Matrices and Used Matrix Multiplication as Composition Function? (INFORMATIK 2019: 50 Jahre Gesellschaft für Informatik – Informatik für Gesellschaft, 2019). Galke, Lukas; Mai, Florian; Scherp, Ansgar. We summarize our contribution to the International Conference on Learning Representations 2019, "CBOW Is Not All You Need: Combining CBOW with the Compositional Matrix Space Model". We construct a text encoder that learns matrix representations of words from unlabeled text, while using matrix multiplication as composition function. We show that our text encoder outperforms continuous bag-of-words (CBOW) representations on 9 out of 10 linguistic probing tasks and argue that the learned representations are complementary to those of vector-based approaches. Hence, we construct a hybrid model that jointly learns a matrix and a vector for each word. This hybrid model yields higher scores than purely vector-based approaches on 10 out of 16 downstream tasks in a controlled experiment with the same capacity and training data. Across all 16 tasks, the hybrid model achieves an average improvement of 1.2%. These results are promising in that they open up new opportunities to efficiently incorporate order awareness into word embedding models.
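
The order awareness mentioned in the last entry can be illustrated with a minimal sketch. It is not the authors' model: the 2x2 word matrices and 2-dimensional word vectors below are made-up toy values, and the function names are hypothetical. It only shows that summing word vectors gives the same encoding for any word order, whereas multiplying word matrices in sequence generally does not, because matrix multiplication is not commutative.

```python
def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def compose_matrices(words, matrices):
    """Encode a word sequence as the ordered product of its word matrices."""
    n = len(next(iter(matrices.values())))
    # Start from the identity matrix, then multiply word by word, left to right.
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for word in words:
        result = matmul(result, matrices[word])
    return result


def compose_vectors(words, vectors):
    """Encode a word sequence as the sum of its word vectors (bag-of-words style)."""
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    for word in words:
        total = [t + v for t, v in zip(total, vectors[word])]
    return total


# Toy embeddings, invented for illustration only (not learned from data).
MATRICES = {
    "dog": [[1.0, 2.0], [0.0, 1.0]],
    "bites": [[0.0, 1.0], [1.0, 0.0]],
    "man": [[2.0, 0.0], [1.0, 1.0]],
}
VECTORS = {
    "dog": [1.0, 0.0],
    "bites": [0.0, 1.0],
    "man": [1.0, 1.0],
}

sentence_a = ["dog", "bites", "man"]
sentence_b = ["man", "bites", "dog"]

# The vector sum cannot tell the two sentences apart.
print(compose_vectors(sentence_a, VECTORS) == compose_vectors(sentence_b, VECTORS))
# The matrix product yields different encodings for the two word orders.
print(compose_matrices(sentence_a, MATRICES) == compose_matrices(sentence_b, MATRICES))
```

In a learned model the matrices and vectors would be trained from unlabeled text; here they are fixed by hand so that the effect of word order is visible directly.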