
dc.contributor.author: Schenkel, Ralf
dc.contributor.author: Suchanek, Fabian
dc.contributor.author: Kasneci, Gjergji
dc.contributor.editor: Kemper, Alfons
dc.contributor.editor: Schöning, Harald
dc.contributor.editor: Rose, Thomas
dc.contributor.editor: Jarke, Matthias
dc.contributor.editor: Seidl, Thomas
dc.contributor.editor: Quix, Christoph
dc.contributor.editor: Brochhaus, Christoph
dc.date.accessioned: 2020-02-11T13:22:04Z
dc.date.available: 2020-02-11T13:22:04Z
dc.date.issued: 2007
dc.identifier.isbn: 978-3-88579-197-3
dc.identifier.issn: 1617-5468
dc.identifier.uri: http://dl.gi.de/handle/20.500.12116/31804
dc.description.abstract: The paper presents YAWN, a system to convert the well-known and widely used Wikipedia collection into an XML corpus with semantically rich, self-explaining tags. We introduce algorithms to annotate pages and links with concepts from the WordNet thesaurus. This annotation process exploits categorical information in Wikipedia, which is a high-quality, manually assigned source of information, extracts additional information from lists, and utilizes the invocations of templates with named parameters. We give examples of how such annotations can be exploited for high-precision queries. [en]
dc.language.iso: en
dc.publisher: Gesellschaft für Informatik e. V.
dc.relation.ispartof: Datenbanksysteme in Business, Technologie und Web (BTW 2007) – 12. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme" (DBIS)
dc.relation.ispartofseries: Lecture Notes in Informatics (LNI) - Proceedings, Volume P-103
dc.title: YAWN: A Semantically Annotated Wikipedia XML Corpus [en]
dc.type: Text/Conference Paper
dc.pubPlace: Bonn
mci.reference.pages: 277-291
mci.conference.sessiontitle: Regular Research Papers
mci.conference.location: Aachen
mci.conference.date: 07.-09.03.2007

