Browsing by author "Petersohn, Uwe"
Results 1 - 2 of 2
- Conference paper: Optimizing Similarity Search in the M-Tree (Datenbanksysteme für Business, Technologie und Web (BTW 2017), 2017). Guhlemann, Steffen; Petersohn, Uwe; Meyer-Wegener, Klaus. Abstract: A topic of growing interest in a wide range of domains is the similarity of data entries. Data sets of genome sequences, text corpora, complex production information, and multimedia content are typically large and unstructured, and it is expensive to compute similarities in them. The only common denominator that a data structure for efficient similarity search can rely on is the set of metric axioms. One such data structure for efficient similarity search in metric spaces is the M-Tree, along with a number of compatible extensions (e.g. Slim-Tree, Bulk Loaded M-Tree, multiway insertion M-Tree, M²-Tree, etc.). The M-Tree family uses common algorithms for the k-nearest-neighbor and range search. In this paper we present new algorithms for these tasks to considerably improve retrieval performance of all M-Tree-compatible data structures.
- Journal article: Reducing the Distance Calculations when Searching an M-Tree (Datenbank-Spektrum: Vol. 17, No. 2, 2017). Guhlemann, Steffen; Petersohn, Uwe; Meyer-Wegener, Klaus. Abstract: Recent years have brought rising interest in efficiently searching for similar entities in a broad range of domains. Such search can be used to facilitate working with unstructured data such as genome sequences, text corpora, complex production information, or multimedia content, where queries always contain some amount of noise. In such domains the only common structure is a distance function obeying the axioms of a metric. As usually no other structural information is available, many distances have to be computed during the course of a search. In contrast to classical database indexes, where the optimization focus is on reducing the number of disk accesses (or, in the case of in-memory databases, the number of tree traversal operations), a major cost driver in such multimedia domains is the number of distance calculations, each of which can be computationally intensive. There exists a range of index structures for supporting similarity search in metric spaces. A very promising one is the M-Tree, along with a number of compatible extensions (e.g. Slim-Tree, Bulk Loaded M-Tree, multiway insertion M-Tree, M²-Tree, etc.). The M-Tree family uses common algorithms for the k-nearest-neighbor and range search. These algorithms leave room for optimization in terms of necessary distance calculations. In this paper we present new algorithms for these tasks to considerably improve retrieval performance of all M-Tree-compatible data structures.
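Both abstracts rest on the observation that the metric axioms alone can save distance calculations. The sketch below illustrates that general principle only; it is not the algorithm from either paper and it is not an M-Tree. It assumes a flat list of objects, a single hypothetical pivot object, and precomputed object-to-pivot distances. By the triangle inequality, |d(q, p) - d(o, p)| is a lower bound on d(q, o), so an object whose bound already exceeds the search radius can be discarded without evaluating the expensive distance. All names (`CountingMetric`, `build_pivot_table`, `range_search`, `abs_diff`) are made up for this illustration.

```python
class CountingMetric:
    """Wraps a distance function and counts how often it is evaluated."""

    def __init__(self, metric):
        self.metric = metric
        self.calls = 0

    def __call__(self, a, b):
        self.calls += 1
        return self.metric(a, b)


def build_pivot_table(objects, pivot, dist):
    """Precompute d(o, pivot) for every object (done once, at index time)."""
    return [dist(o, pivot) for o in objects]


def range_search(query, radius, objects, pivot, pivot_dists, dist):
    """Return all objects o with d(query, o) <= radius.

    Uses the triangle inequality to skip distance evaluations:
    d(query, o) >= |d(query, pivot) - d(o, pivot)|.
    """
    d_qp = dist(query, pivot)  # one distance calculation for the pivot
    hits = []
    for obj, d_op in zip(objects, pivot_dists):
        if abs(d_qp - d_op) > radius:
            continue  # lower bound exceeds radius: cannot be a hit
        if dist(query, obj) <= radius:  # only now pay for the real distance
            hits.append(obj)
    return hits


def abs_diff(a, b):
    """Toy metric on real numbers; stands in for an expensive distance."""
    return abs(a - b)


if __name__ == "__main__":
    objects = [0.0, 1.0, 2.0, 10.0, 11.0, 50.0]
    pivot = 0.0
    table = build_pivot_table(objects, pivot, abs_diff)

    dist = CountingMetric(abs_diff)
    hits = range_search(10.5, 1.0, objects, pivot, table, dist)
    print(hits, dist.calls)
```

In this toy run the query needs 3 distance evaluations (one to the pivot, two to the surviving candidates) instead of the 6 a linear scan would use. Tree-based metric indexes such as the M-Tree apply the same kind of bound to whole subtrees rather than to single objects.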