Digitizing Drilling Logs - Challenges of typewritten forms

dc.contributor.author: Bürgl, Kim
dc.contributor.author: Reinhardt, Lea
dc.contributor.author: Binder, Frank
dc.contributor.author: Müller, Lydia
dc.contributor.author: Niekler, Andreas
dc.date.accessioned: 2021-12-14T10:57:35Z
dc.date.available: 2021-12-14T10:57:35Z
dc.date.issued: 2021
dc.description.abstract: In this work, we show how mining and geological documentation in the form of drilling reports can be digitized and further processed. Processing these typed and handwritten forms poses challenges for document management in renaturation projects. We highlight the structural problems of drilling reports and present three approaches for recognizing and processing the information documented in them. We approach the problem with optical character recognition (OCR) and document layout analysis techniques. Layout analysis was performed using a heuristic approach and a neural network for layout recognition. Specifically, we present the approaches Form Processing (A), Table Detection by Line Counting (B), and processing with Mask R-CNN (C). A case study is used to show initial results and challenges. B and C are more robust than A to small changes in the form. Given more training data, C recognizes columns better than B in cases where table boundaries are not respected. B and C also allow other language models to be used for OCR and can thus also recognize handwriting, given appropriate training data. [en]
dc.identifier.doi: 10.18420/informatik2021-059
dc.identifier.isbn: 978-3-88579-708-1
dc.identifier.pissn: 1617-5468
dc.identifier.uri: https://dl.gi.de/handle/20.500.12116/37725
dc.language.iso: en
dc.publisher: Gesellschaft für Informatik, Bonn
dc.relation.ispartof: INFORMATIK 2021
dc.relation.ispartofseries: Lecture Notes in Informatics (LNI) - Proceedings, Volume P-314
dc.subject: drilling logs
dc.subject: table recognition
dc.subject: information extraction
dc.subject: renaturation projects
dc.subject: OCR
dc.subject: forms processing
dc.title: Digitizing Drilling Logs - Challenges of typewritten forms [en]
gi.citation.endPage: 718
gi.citation.startPage: 709
gi.conference.date: September 27 - October 1, 2021
gi.conference.location: Berlin
gi.conference.sessiontitle: Workshop: Workshop on Systems to Support Renaturation Projects (SyRePro 2021)

Files

Original bundle
Name: G1-3.pdf
Size: 1.04 MB
Format: Adobe Portable Document Format