Stance Detection Benchmark: How Robust is Your Stance Detection?

Schiller, Benjamin; Daxenberger, Johannes; Gurevych, Iryna

dc.contributor.author	Schiller, Benjamin
dc.contributor.author	Daxenberger, Johannes
dc.contributor.author	Gurevych, Iryna
dc.date.accessioned	2021-12-16T13:23:01Z
dc.date.available	2021-12-16T13:23:01Z
dc.date.issued	2021
dc.description.abstract	Stance detection (StD) aims to detect an author’s stance towards a certain topic and has become a key component in applications like fake news detection, claim validation, or argument search. However, while stance is easily detected by humans, machine learning (ML) models clearly fall short on this task. Given the major differences in dataset sizes and in the framing of StD (e.g. number of classes and inputs), ML models trained on a single dataset usually generalize poorly to other domains. Hence, we introduce a StD benchmark that allows ML models to be compared against a wide variety of heterogeneous StD datasets in order to evaluate them for generalizability and robustness. Moreover, the framework is designed for easy integration of new datasets and of probing methods for robustness. Amongst several baseline models, we define a model that learns from all ten StD datasets of various domains in a multi-dataset learning (MDL) setting and present new state-of-the-art results on five of the datasets. Yet, the models still perform well below human capabilities, and even simple perturbations of the original test samples (adversarial attacks) severely hurt the performance of MDL models. Deeper investigation suggests overfitting on dataset biases as the main reason for the decreased robustness. Our analysis emphasizes the need to focus on robustness and de-biasing strategies in multi-task learning approaches. To foster research on this important topic, we release the dataset splits, code, and fine-tuned weights.	de
dc.identifier.doi	10.1007/s13218-021-00714-w
dc.identifier.pissn	1610-1987
dc.identifier.uri	http://dx.doi.org/10.1007/s13218-021-00714-w
dc.identifier.uri	https://dl.gi.de/handle/20.500.12116/37819
dc.publisher	Springer
dc.relation.ispartof	KI - Künstliche Intelligenz: Vol. 35, No. 3-4
dc.relation.ispartofseries	KI - Künstliche Intelligenz
dc.subject	Multi-dataset learning
dc.subject	Robustness
dc.subject	Stance detection
dc.title	Stance Detection Benchmark: How Robust is Your Stance Detection?	de
dc.type	Text/Journal Article
gi.citation.endPage	341
gi.citation.startPage	329

Collections

Künstliche Intelligenz 35(3-4) - October 2021
