On the State of German (Abstractive) Text Summarization

Aumiller, Dennis; Fan, Jing; Gertz, Michael

dc.contributor.author	Aumiller, Dennis
dc.contributor.author	Fan, Jing
dc.contributor.author	Gertz, Michael
dc.contributor.editor	König-Ries, Birgitta
dc.contributor.editor	Scherzinger, Stefanie
dc.contributor.editor	Lehner, Wolfgang
dc.contributor.editor	Vossen, Gottfried
dc.date.accessioned	2023-02-23T13:59:45Z
dc.date.available	2023-02-23T13:59:45Z
dc.date.issued	2023
dc.description.abstract	With recent advancements in the area of Natural Language Processing, the focus is slowly shifting from a purely English-centric view towards more language-specific solutions, including German. Text summarization systems, which transform long input documents into compressed and more digestible summary texts, are especially practical for businesses analyzing their growing amounts of textual data. In this work, we assess the particular landscape of German abstractive text summarization and investigate the reasons why practically useful solutions for abstractive text summarization are still absent in industry. Our focus is two-fold, analyzing a) training resources, and b) publicly available summarization systems. We show that popular existing datasets exhibit crucial flaws in their assumptions about the original sources, which frequently lead to detrimental effects on system generalization and evaluation biases. We confirm that for the most popular training dataset, MLSUM, over 50% of the training set is unsuitable for abstractive summarization purposes. Furthermore, available systems frequently fail to compare to simple baselines, and ignore more effective and efficient extractive summarization approaches. We attribute poor evaluation quality to a variety of different factors, which are investigated in more detail in this work: a lack of high-quality (and diverse) gold data considered for training, understudied (and untreated) positional biases in some of the existing datasets, and the lack of easily accessible and streamlined pre-processing strategies or analysis tools. We therefore provide a comprehensive assessment of available models on the cleaned versions of datasets, and find that this can lead to a reduction of more than 20 ROUGE-1 points during evaluation. As a cautious reminder for future work, we finally highlight the problems of solely relying on n-gram based scoring methods by presenting particularly problematic failure cases. Code for dataset filtering and reproducing results can be found online: https://github.com/anonymized-user/anonymized-repository	en
dc.identifier.doi	10.18420/BTW2023-10
dc.identifier.isbn	978-3-88579-725-8
dc.identifier.uri	https://dl.gi.de/handle/20.500.12116/40314
dc.language.iso	en
dc.publisher	Gesellschaft für Informatik e.V.
dc.relation.ispartof	BTW 2023
dc.relation.ispartofseries	Lecture Notes in Informatics (LNI) - Proceedings, Volume P-331
dc.subject	Abstractive Text Summarization
dc.subject	Natural Language Generation
dc.subject	German
dc.subject	Evaluation
dc.title	On the State of German (Abstractive) Text Summarization	en
dc.type	Text/Conference Paper
gi.citation.endPage	220
gi.citation.publisherPlace	Bonn
gi.citation.startPage	195
gi.conference.date	March 6-10, 2023
gi.conference.location	Dresden, Germany

Files

Original bundle

Name: B2-3.pdf
Size: 692.48 KB
Format: Adobe Portable Document Format

Collections

P331 - BTW2023- Datenbanksysteme für Business, Technologie und Web