Browsing by keyword "corpus"
- Conference paper: A Corpus of Memes from Reddit: Acquisition, Preparation and First Case Studies (INFORMATIK 2023 - Designing Futures: Zukünfte gestalten, 2023). Schmidt, Thomas; Schiller, Fabian; Götz, Matthias; Wolff, Christian. We present a corpus of memes and their textual components acquired from the popular meme platform r/memes, a subreddit of Reddit and one of the major outlets of online meme culture. The corpus consists of the most popular memes posted on the platform from 2013 to 2021, comprising 11,701 memes and 280,351 text tokens. We conduct several case studies focused on diachronic analysis to highlight the possibilities of the corpus for research in internet studies and online culture. We examine the general activity on the platform throughout the years and identify a significant increase in meme production beginning in 2017. Results of sentiment analysis show a tendency towards memes with positively classified texts. The analysis of the most frequent words per half-year spotlights the importance of certain cultural events for meme culture (e.g. the 2016 US election). Using the LIWC to analyze swear and sexual words shows an overall decrease in the usage of these words, pointing to an increased moderation of the platform. The corpus is publicly available to the research community for further studies.
- Conference paper: The InsightsNet Climate Change Corpus (ICCC) (BTW 2023, 2023). Bartsch, Sabine; Duan, Changxu; Tan, Sherry; Volkanovska, Elena; Stille, Wolfgang. The discourse on climate change has become a centerpiece of public debate, creating a pressing need to analyze the multitude of messages created by the participants in this communication process. In addition to text, messages on this topic are communicated through images, videos, tables and other data objects that are embedded within a document and accompany the text. This paper presents the process of building the InsightsNet Climate Change Corpus (ICCC), a multimodal corpus on the topic of climate change, using NLP tools to enrich corpus metadata. The resulting dataset lends itself to the exploration of the interplay between the various modalities that constitute the discourse on climate change.