Browsing by author "Christlein, Vincent"
- Workshop paper
  Handwritten Text Recognition Error Rate Reduction in Historical Documents using Naive Transcribers (INF-DH-2018, 2018)
  Christlein, Vincent; Nicolaou, Anguelos; Schlauwitz, Thorsten; Späth, Sabrina; Herbers, Klaus; Maier, Andreas
  Handwritten text recognition (HTR) is a difficult research problem. For historical documents in particular, the task is hard because handwriting style, orthography, and text quality pose significant challenges. Creating a single multi-purpose HTR system seems to be out of reach for current state-of-the-art systems. Therefore, we are interested in the fast creation of specialized HTR systems for a particular set of historical documents. Still, manual annotation by historical experts is expensive and often cannot be applied at a large scale. Instead, we use the transcripts of naive transcribers, which may still contain a significant number of errors. In this paper, we propose to fuse the recognized word chain with the transcripts of naive transcribers, which can be obtained cost-effectively. For the actual fusion, we rely on a word-level approach, the so-called Recognizer Output Voting Error Reduction (ROVER). Results indicate that we are able to reduce the Word Error Rate (WER) of an HTR system trained on only a few pages from 2.6% to 19.2% with two additional transcribers with 25.1% and 27.1% WER each. This performance is already close to that of current state-of-the-art systems trained with significantly more data.
- Text document
  Proof of Concept: Automatic Type Recognition (INFORMATIK 2020, 2021)
  Christlein, Vincent; Weichselbaumer, Nikolaus; Limbach, Saskia; Seuret, Mathias
  The type used to print an early modern book can give scholars valuable information about the time and place of its production as well as its producer. Recognizing such type is currently done manually, using both the character shapes of 'M' or 'Qu' and the overall size of the type to look it up in a large reference work. This is a reliable method, but it is also slow and requires specific skills. We investigate the performance of type classification and type retrieval using a newly created dataset consisting of easy and difficult types used in early printed books. For type classification, we rely on a deep Convolutional Neural Network (CNN) originally used for font-group classification, while for the retrieval case we use a common writer identification method. We show that in both scenarios, easy types can be classified or retrieved with high accuracy, while difficult cases are indeed difficult.
- Conference paper
  A study on features for the detection of copy-move forgeries (Sicherheit 2010. Sicherheit, Schutz und Zuverlässigkeit, 2010)
  Christlein, Vincent; Riess, Christian; Angelopoulou, Elli
  Blind image forensics aims to assess image authenticity. One of the most popular families of methods in this category is the detection of copy-move forgeries. In recent years, more than 15 different algorithms have been proposed for copy-move forgery detection. So far, the efficacy of these approaches has barely been examined. In this paper, we: a) present a common pipeline for copy-move forgery detection, b) perform a comparative study on 10 proposed copy-move features, and c) introduce a new benchmark database for copy-move forgery detection. Experiments show that the recently proposed Fourier-Mellin features perform outstandingly if no geometric transformations are applied to the copied region. Furthermore, our experiments strongly support the use of kd-trees instead of lexicographic sorting for the matching of similar blocks.
- Text document
  Towards End-to-End Deep Learning-based Writer Identification (INFORMATIK 2020, 2021)
  Wang, Zhenghua; Maier, Andreas; Christlein, Vincent
  Writer identification is an important task for gaining knowledge about life in the past and is commonly carried out by paleographic experts. In this work, we investigate an automatic writer identification procedure based on deep learning. So far, most approaches are based on two or more separate pipeline steps, and only a few of them can be trained in an end-to-end manner. In this paper, we propose a fully end-to-end deep learning-based model, which consists of a U-Net for binarization, a ResNet-50 for feature extraction, and an optimized learnable residual encoding layer to obtain global descriptors. We evaluate the proposed end-to-end model on the dataset of the ICDAR17 competition on historical document writer identification (Historical-WI). Moreover, we investigate the performance of our optimized encoding layer on three texture datasets. While the optimized encoding layer does not work well in the task of writer identification, it provides better performance on the texture datasets. Furthermore, we show that a pre-trained U-Net can improve the performance for writer identification.
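
For readers unfamiliar with the metric named in the first abstract above, the following is a minimal sketch of how a Word Error Rate is conventionally computed (word-level edit distance divided by the number of reference words), together with a heavily simplified word-level vote. It is not the authors' implementation: real ROVER first aligns the transcripts with each other, whereas the hypothetical `vote_aligned` helper assumes the transcripts are already aligned and of equal length. The function names and the sample sentence are illustrative assumptions.

```python
from collections import Counter


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,           # reference word deleted
                            curr[j - 1] + 1,       # extra word inserted
                            prev[j - 1] + cost))   # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def vote_aligned(hypotheses: list[list[str]]) -> list[str]:
    """Per-position majority vote over ALREADY ALIGNED transcripts.

    Simplifying assumption: all transcripts have the same length, so no
    alignment step is needed. Ties go to the earliest transcript.
    """
    if len({len(h) for h in hypotheses}) != 1:
        raise ValueError("hypotheses must be aligned to equal length")
    fused = []
    for words in zip(*hypotheses):
        counts = Counter(words)
        best = max(counts.values())
        fused.append(next(w for w in words if counts[w] == best))
    return fused
```

In this toy setting, three transcripts that each contain one error in a different position can be fused into an error-free word sequence, which illustrates why voting across an HTR output and independent human transcripts can lower the WER below that of any single input.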