Digital Library Gesellschaft für Informatik e.V.
  •   DSpace Home
  • Fachbereiche
  • Informatik und Gesellschaft (IuG)
  • Fachgruppe Informatik und Digital Humanities
  • Workshop INF-DH - 2018
  • View Item

Handwritten Text Recognition Error Rate Reduction in Historical Documents using Naive Transcribers

Authors:
Christlein, Vincent;
Nicolaou, Anguelos;
Schlauwitz, Thorsten;
Späth, Sabrina;
Herbers, Klaus;
Maier, Andreas
Abstract
Handwritten text recognition (HTR) is a difficult research problem. For historical documents in particular, the task is hard because handwriting style, orthography, and text quality pose significant challenges. A single multi-purpose HTR system seems to be out of reach for the current state of the art. We are therefore interested in the fast creation of specialized HTR systems for a particular set of historical documents. Still, manual annotation by historical experts is expensive and often cannot be applied at a large scale. Instead, we use the transcripts of naive transcribers, which can be obtained cost-effectively but may still contain a significant number of errors. In this paper, we propose to fuse the recognized word chain with these naive transcripts. For the actual fusion, we rely on a word-level approach, the so-called Recognizer Output Voting Error Reduction (ROVER). Results indicate that we are able to reduce the Word Error Rate (WER) of an HTR system trained with only a few pages from 32.6% to 19.2% with two additional transcribers with 25.1% and 27.1% WER each. This performance is already close to current state-of-the-art systems trained with significantly more data.
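To make the word-level fusion idea concrete, here is a minimal sketch of voting-based fusion and of the WER metric. It is not the authors' implementation and it simplifies ROVER: instead of building a full word transition network, it aligns each transcript to the HTR output by word-level edit distance, drops words that have no counterpart in the HTR output, and takes a per-position majority vote, with ties going to the HTR output. The function names (`align_to_base`, `word_error_rate`, `fuse`) and the sample transcripts are invented for illustration.

```python
from collections import Counter


def align_to_base(base, hyp):
    """Align hyp to base by word-level edit distance.

    Returns (aligned, distance): aligned[i] is the hyp word matched to
    base[i], or None if base[i] has no counterpart in hyp.
    """
    n, m = len(base), len(hyp)
    # dp[i][j] = edit distance between base[:i] and hyp[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = i
    for j in range(m + 1):
        dp[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if base[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j - 1] + cost,  # match / substitution
                           dp[i - 1][j] + 1,         # base word unmatched
                           dp[i][j - 1] + 1)         # extra hyp word
    # Backtrace to recover which hyp word sits on which base position.
    aligned = [None] * n
    i, j = n, m
    while i > 0 and j > 0:
        cost = 0 if base[i - 1] == hyp[j - 1] else 1
        if dp[i][j] == dp[i - 1][j - 1] + cost:
            aligned[i - 1] = hyp[j - 1]
            i, j = i - 1, j - 1
        elif dp[i][j] == dp[i - 1][j] + 1:
            i -= 1
        else:
            j -= 1  # simplification: extra hyp words are dropped
    return aligned, dp[n][m]


def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance divided by the reference length."""
    _, distance = align_to_base(reference, hypothesis)
    return distance / len(reference)


def fuse(base, others):
    """Majority vote per base position; ties go to the earliest source."""
    columns = [[word] for word in base]
    for hyp in others:
        aligned, _ = align_to_base(base, hyp)
        for column, word in zip(columns, aligned):
            column.append(word)
    fused = []
    for column in columns:
        counts = Counter(column)
        best = max(counts.values())
        winner = next(w for w in column if counts[w] == best)
        if winner is not None:  # None winning means "delete this word"
            fused.append(winner)
    return fused


# Invented sample line, for illustration only (not data from the paper).
reference = "in nomine sancte et individue trinitatis".split()
htr = "in nomine sancle et indiuidue trinitatis".split()
naive_1 = "in nomine sancte et individue trinitas".split()
naive_2 = "in nomime sancte et individue trinitatis".split()

fused = fuse(htr, [naive_1, naive_2])
print("HTR WER:  ", word_error_rate(reference, htr))
print("fused WER:", word_error_rate(reference, fused))
```

In this toy case each source makes errors on different words, so the vote recovers every word. With real transcripts the sources share some errors and differ in length, which is where the full ROVER alignment matters.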
  • Citation
  • BibTeX
Christlein, V., Nicolaou, A., Schlauwitz, T., Späth, S., Herbers, K. & Maier, A. (2018). Handwritten Text Recognition Error Rate Reduction in Historical Documents using Naive Transcribers. In: Burghardt, M. & Müller-Birn, C. (Eds.), INF-DH-2018. Bonn: Gesellschaft für Informatik e.V. DOI: 10.18420/infdh2018-13
@inproceedings{mci/Christlein2018,
author = {Christlein, Vincent AND Nicolaou, Anguelos AND Schlauwitz, Thorsten AND Späth, Sabrina AND Herbers, Klaus AND Maier, Andreas},
title = {Handwritten Text Recognition Error Rate Reduction in Historical Documents using Naive Transcribers},
booktitle = {INF-DH-2018},
year = {2018},
editor = {Burghardt, Manuel AND Müller-Birn, Claudia},
doi = {10.18420/infdh2018-13},
publisher = {Gesellschaft für Informatik e.V.},
address = {Bonn}
}
Files:
INF-DH-2018_paper_13.pdf (size: 2.379 MB, format: PDF)

If no full text (PDF) is linked here, it may be available only in another digital library for various reasons (e.g. licenses or copyright). In that case, try accessing it via the linked DOI: 10.18420/infdh2018-13


More Info

DOI: 10.18420/infdh2018-13
Date: 2018
Language: en (en)
Content Type: Text/Workshop Paper

Keywords

  • handwritten text recognition
  • naive transcribers
  • line fusion
Collections
  • Workshop INF-DH - 2018 [17]


Gesellschaft für Informatik e.V. (GI), contact: GI head office (Geschäftsstelle der GI)
This Digital Library is based on DSpace.

 

 

