WannaDB: Ad-hoc SQL Queries over Text Collections

dc.contributor.author: Hättasch, Benjamin
dc.contributor.author: Bodensohn, Jan-Micha
dc.contributor.author: Vogel, Liane
dc.contributor.author: Urban, Matthias
dc.contributor.author: Binnig, Carsten
dc.contributor.editor: König-Ries, Birgitta
dc.contributor.editor: Scherzinger, Stefanie
dc.contributor.editor: Lehner, Wolfgang
dc.contributor.editor: Vossen, Gottfried
dc.date.accessioned: 2023-02-23T14:00:24Z
dc.date.available: 2023-02-23T14:00:24Z
dc.date.issued: 2023
dc.description.abstract: In this paper, we propose a new system called WannaDB that allows users to interactively perform structured explorations of text collections in an ad-hoc manner. Extracting structured data from text is a classical problem for which a plenitude of approaches and even industry-scale systems already exist. However, these approaches lack the ability to support the ad-hoc exploration of texts using structured queries. The main idea of WannaDB is to include user interaction to support ad-hoc SQL queries over text collections using a new two-phased approach. First, a superset of information nuggets is extracted from the texts using existing extractors such as named entity recognizers. Then, based on embeddings, the extractions are interactively matched to a structured table definition as requested by the user. In our evaluation, we show that WannaDB is thus able to extract high-quality structured data from a broad range of (real-world) text collections without the need to design extraction pipelines upfront.
dc.identifier.doi: 10.18420/BTW2023-08
dc.identifier.isbn: 978-3-88579-725-8
dc.identifier.uri: https://dl.gi.de/handle/20.500.12116/40390
dc.language.iso: en
dc.publisher: Gesellschaft für Informatik e.V.
dc.relation.ispartof: BTW 2023
dc.relation.ispartofseries: Lecture Notes in Informatics (LNI) - Proceedings, Volume P-331
dc.subject: interactive text exploration
dc.subject: text to table
dc.subject: matching embeddings
dc.title: WannaDB: Ad-hoc SQL Queries over Text Collections
dc.type: Text/Conference Paper
gi.citation.endPage: 181
gi.citation.publisherPlace: Bonn
gi.citation.startPage: 157
gi.conference.date: March 6-10, 2023
gi.conference.location: Dresden, Germany

Files

Original bundle
Name: B2-1.pdf
Size: 1.16 MB
Format: Adobe Portable Document Format