Discovering Multi-Dimensional Subsequence Queries from Traces -- From Theory to Practice

Text/Conference Paper
BTW 2023
Gesellschaft für Informatik e.V.
Subsequence-queries with wildcards and gap-size constraints (swg-queries, for short) are an expressive model for sequence data, in which queries are described by patterns over an alphabet of variables and types, along with a global window size and a number of gap-size constraints. They are evaluated over a trace, i.e., a sequence of types, by replacing variables by single types, while satisfying the window and the gap-size constraints. Kleest-Meißner et al. (Proc. ICDT 2022) formalised the task of discovering an swg-query that best describes a given sample consisting of a finite number of traces, and developed a discovery algorithm solving this task. However, in practical application scenarios, traces are often multi-dimensional, i.e., a trace corresponds to a sequence of tuples of types, which renders the existing technique inapplicable. In this paper, we lift the notion of swg-queries to such a multi-dimensional setting, thereby enlarging the applicability of the query model and the techniques for query discovery. We introduce a mapping between one-dimensional and multi-dimensional sequence data, such that a multi-dimensional trace matches a multi-dimensional query if and only if the corresponding one-dimensional trace matches the corresponding one-dimensional query. We complement our formal results with a description of our prototypical implementation of query discovery for multi-dimensional sequence data. Results from evaluation experiments with real-world data indicate the feasibility of our approach.
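To make the one-dimensional query model described in the abstract more concrete, the following is a minimal sketch of how a trace could be checked against an swg-query. It is an illustration under stated assumptions, not the authors' implementation or their discovery algorithm. The function name `matches`, its parameters, and the precise reading of the constraints are hypothetical: the window size is taken to bound the number of consecutive trace positions that contain the whole match, and the i-th gap-size constraint is taken to bound the number of trace positions skipped between the i-th and (i+1)-th pattern symbol. The paper's formal definitions may differ in detail.

```python
from typing import Dict, Optional, Sequence, Set


def matches(
    pattern: Sequence[str],
    variables: Set[str],
    window: int,
    gaps: Sequence[Optional[int]],
    trace: Sequence[str],
) -> bool:
    """Check whether `trace` matches the query (assumed semantics, see lead-in).

    pattern   -- symbols of the query; symbols listed in `variables` are
                 variables, all other symbols are types
    window    -- maximum number of consecutive trace positions spanned
    gaps      -- gaps[i] bounds the positions skipped between pattern[i]
                 and pattern[i + 1]; None means "no constraint"
    trace     -- one-dimensional trace, i.e. a sequence of types
    """
    n = len(trace)

    def extend(i: int, prev: int, start: int, binding: Dict[str, str]) -> bool:
        if i == len(pattern):
            return True  # every pattern symbol has been placed
        if i == 0:
            lo, hi = 0, n  # the first symbol may be placed anywhere
        else:
            lo = prev + 1
            hi = min(n, start + window)  # stay inside the window
            gap = gaps[i - 1]
            if gap is not None:
                hi = min(hi, prev + gap + 2)  # skip at most `gap` positions
        symbol = pattern[i]
        for pos in range(lo, hi):
            observed = trace[pos]
            new_binding = binding
            if symbol in variables:
                bound = binding.get(symbol)
                if bound is None:
                    # First occurrence: bind the variable to a single type.
                    new_binding = {**binding, symbol: observed}
                elif bound != observed:
                    continue  # later occurrences must agree with the binding
            elif symbol != observed:
                continue  # a type must occur literally
            if extend(i + 1, pos, pos if i == 0 else start, new_binding):
                return True
        return False

    return extend(0, -1, 0, {})


if __name__ == "__main__":
    # Query "x a x" with variable x, window 4, and both gaps bounded by 1.
    print(matches(["x", "a", "x"], {"x"}, 4, [1, 1], ["b", "c", "a", "b"]))
```

The sketch uses plain backtracking, which keeps it short but is exponential in the worst case; it only illustrates the matching condition and says nothing about how queries are discovered from a sample.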
Kleest-Meißner, Sarah; Sattler, Rebecca; Schmid, Markus L.; Schweikardt, Nicole; Weidlich, Matthias (2023): Discovering Multi-Dimensional Subsequence Queries from Traces -- From Theory to Practice. BTW 2023. DOI: 10.18420/BTW2023-24. Bonn: Gesellschaft für Informatik e.V. ISBN: 978-3-88579-725-8. pp. 511-533. Dresden, Germany. 6-10 March 2023