Logo des Repositoriums
 
Zeitschriftenartikel

On divergence-based author obfuscation: An attack on the state of the art in statistical authorship verification

Vorschaubild nicht verfügbar

Volltext URI

Dokumententyp

Text/Journal Article

Zusatzinformation

Datum

2020

Zeitschriftentitel

ISSN der Zeitschrift

Bandtitel

Verlag

De Gruyter

Zusammenfassung

Authorship verification is the task of determining whether two texts were written by the same author based on a writing style analysis. Author obfuscation is the adversarial task of preventing a successful verification by altering a text’s style so that it does not resemble that of its original author anymore. This paper introduces new algorithms for both tasks and reports on a comprehensive evaluation to ascertain the merits of the state of the art in authorship verification to withstand obfuscation. After introducing a new generalization of the well-known unmasking algorithm for short texts, thus completing our collection of state-of-the-art algorithms for verification, we introduce an approach that (1) models writing style difference as the Jensen-Shannon distance between the character n-gram distributions of texts, and (2) manipulates an author’s writing style in a sophisticated manner using heuristic search. For obfuscation, we explore the huge space of textual variants in order to find a paraphrased version of the to-be-obfuscated text that has a sufficiently high Jensen-Shannon distance at minimal costs in terms of text quality loss. We analyze, quantify, and illustrate the rationale of this approach, define paraphrasing operators, derive text length-invariant thresholds for termination, and develop an effective obfuscation framework. Our authorship obfuscation approach defeats the presented state-of-the-art verification approaches, while keeping text changes at a minimum. As a final contribution, we discuss and experimentally evaluate a reverse obfuscation attack against our obfuscation approach as well as possible remedies.

Beschreibung

Bevendorff, Janek; Wenzel, Tobias; Potthast, Martin; Hagen, Matthias; Stein, Benno (2020): On divergence-based author obfuscation: An attack on the state of the art in statistical authorship verification. it - Information Technology: Vol. 62, No. 2. DOI: 10.1515/itit-2019-0046. Berlin: De Gruyter. PISSN: 2196-7032. pp. 99-115

Zitierform

Tags