
Enhancing named entity extraction by effectively incorporating the crowd

dc.contributor.author: Braunschweig, Katrin
dc.contributor.author: Thiele, Maik
dc.contributor.author: Eberius, Julian
dc.contributor.author: Lehner, Wolfgang
dc.contributor.editor: Saake, Gunter
dc.contributor.editor: Henrich, Andreas
dc.contributor.editor: Lehner, Wolfgang
dc.contributor.editor: Neumann, Thomas
dc.contributor.editor: Köppen, Veit
dc.date.accessioned: 2018-10-24T10:44:45Z
dc.date.available: 2018-10-24T10:44:45Z
dc.date.issued: 2013
dc.description.abstract: Named entity extraction is an established research area in the field of information extraction. When tailored to a specific domain and with sufficient pre-labeled training data, state-of-the-art extraction algorithms have achieved near-human performance. However, when presented with semi-structured data, informal text or unknown domains where training data is not available, extraction results can deteriorate significantly. Recent research has focused on crowdsourcing as an alternative to automatic named entity extraction or as a tool to generate the required training data. While humans easily adapt to semi-structured data and informal style, a crowd-based approach also introduces new issues due to monetary costs or spamming. We address these issues by combining automatic named entity extraction algorithms with crowdsourcing into a hybrid approach. We have conducted a wide range of experiments on real-world data to identify a set of subtasks or operators that can be performed either by the crowd or automatically. Results show that a meaningful combination of these operators into complex processing pipelines can significantly enhance the quality of named entity extraction in challenging scenarios, while at the same time reducing the monetary costs of crowdsourcing and the risk of misuse. [en]
dc.identifier.isbn: 978-3-88579-610-7
dc.identifier.pissn: 1617-5468
dc.identifier.uri: https://dl.gi.de/handle/20.500.12116/17432
dc.language.iso: en
dc.publisher: Gesellschaft für Informatik e.V.
dc.relation.ispartof: Datenbanksysteme für Business, Technologie und Web (BTW) 2013 - Workshopband
dc.relation.ispartofseries: Lecture Notes in Informatics (LNI) - Proceedings, Volume P-216
dc.title: Enhancing named entity extraction by effectively incorporating the crowd [en]
dc.type: Text/Conference Paper
gi.citation.endPage: 196
gi.citation.publisherPlace: Bonn
gi.citation.startPage: 181
gi.conference.date: 11-12 March 2013
gi.conference.location: Magdeburg
gi.conference.sessiontitle: Regular Research Papers

Files

Original bundle
Name: 181.pdf
Size: 1.98 MB
Format: Adobe Portable Document Format