Logo des Repositoriums
 
Konferenzbeitrag

Detecting plagiarism in text documents through grammar-analysis of authors

Lade...
Vorschaubild

Volltext URI

Dokumententyp

Text/Conference Paper

Zusatzinformation

Datum

2013

Zeitschriftentitel

ISSN der Zeitschrift

Bandtitel

Verlag

Gesellschaft für Informatik e.V.

Zusammenfassung

The task of intrinsic plagiarism detection is to find plagiarized sections within text documents without using a reference corpus. In this paper, the intrinsic detection approach Plag-Inn is presented which is based on the assumption that authors use a recognizable and distinguishable grammar to construct sentences. The main idea is to analyze the grammar of text documents and to find irregularities within the syntax of sentences, regardless of the usage of concrete words. If suspicious sentences are found by computing the pq-gram distance of grammar trees and by utilizing a Gaussian normal distribution, the algorithm tries to select and combine those sentences into potentially plagiarized sections. The parameters and thresholds needed by the algorithm are optimized by using genetic algorithms. Finally, the approach is evaluated against a large test corpus consisting of English documents, showing promising results.

Beschreibung

Tschuggnall, Michael; Specht, Günther (2013): Detecting plagiarism in text documents through grammar-analysis of authors. Datenbanksysteme für Business, Technologie und Web (BTW) 2028. Bonn: Gesellschaft für Informatik e.V.. PISSN: 1617-5468. ISBN: 978-3-88579-608-4. pp. 241-259. Regular Research Papers. Magdeburg. 13.-15. März 2013

Schlagwörter

Zitierform

DOI

Tags