Textdokument
Umbra as a Time Machine
Lade...
Volltext URI
Dokumententyp
Dateien
Zusatzinformation
Datum
2021
Zeitschriftentitel
ISSN der Zeitschrift
Bandtitel
Quelle
Verlag
Gesellschaft für Informatik, Bonn
Zusammenfassung
Online lexicons such as Wikipedia rely on incremental edits that change text strings marginally. To support text versioning inside of the Umbra database system, this study presents the implementation of a dedicated data type. This versioning data type is designed for maximal throughput as it stores the latest string as a whole and computes previous ones using backward diffs. Using this data type for Wikipedia articles, we achieve a compression rate of up to 5% and outperform the traditional text data type, when storing each version as one tuple individually, by an order of magnitude.