P214 - BTW2013 - Datenbanksysteme für Business, Technologie und Web
Recent publications
- Conference paper: Logical recovery from single-page failures (Datenbanksysteme für Business, Technologie und Web (BTW) 2013, 2013)
  Graefe, Goetz; Seeger, Bernhard; Markl, Volker; Saake, Gunter; Sattler, Kai-Uwe; Hackenbroich, Gregor; Mitschang, Bernhard; Härder, Theo; Köppen, Veit
  Modern hardware technologies and ever-increasing data sizes increase the probability and frequency of local storage failures, e.g., unrecoverable read errors on individual disk sectors or pages on flash storage. Our prior work has formalized single-page failures and outlined efficient methods for their detection and recovery. These prior techniques rely on old backup copies of individual pages, e.g., as part of a database backup or as old versions retained after a page migration. Those might not be available, however, e.g., after recent index creation in “non-logged” or “allocation-only logging” mode, which industrial database products commonly use. The present paper introduces techniques for single-page recovery without backup copies, e.g., for pages of new indexes created in allocation-only logging mode. By rederiving the lost contents of individual pages, these techniques enable efficient recovery of data lost due to damaged storage structures or storage devices. Recovery performance depends on the size of the failure and of the required data sources; it is independent of the sizes of the device, index structure, etc.
- Conference paper: Taking the edge off cardinality estimation errors using incremental execution (Datenbanksysteme für Business, Technologie und Web (BTW) 2013, 2013)
  Neumann, Thomas; Galindo-Legaria, Cesar; Markl, Volker; Saake, Gunter; Sattler, Kai-Uwe; Hackenbroich, Gregor; Mitschang, Bernhard; Härder, Theo; Köppen, Veit
  Query optimization is an essential ingredient for efficient query processing, as semantically equivalent execution alternatives can have vastly different runtime behavior. The query optimizer is largely driven by cardinality estimates when selecting execution alternatives. Unfortunately, these estimates are largely inaccurate, in particular for complex predicates or skewed data. We present an incremental execution framework to make the query optimizer more resilient to cardinality estimation errors. The framework computes the sensitivity of execution plans relative to cardinality estimation errors and, if necessary, executes parts of the query to remove uncertainty. This technique avoids optimization decisions based upon gross misestimation and makes query optimization (and thus processing) much more robust. We demonstrate the effectiveness of these techniques on large real-world and synthetic data sets.
- Conference paper: Resource description and selection for range query processing in general metric spaces (Datenbanksysteme für Business, Technologie und Web (BTW) 2013, 2013)
  Blank, Daniel; Henrich, Andreas; Markl, Volker; Saake, Gunter; Sattler, Kai-Uwe; Hackenbroich, Gregor; Mitschang, Bernhard; Härder, Theo; Köppen, Veit
  Similarity search in general metric spaces is a key aspect in many application fields. Metric space indexing provides a flexible indexing paradigm and is solely based on the use of a distance metric. No assumption is made about the representation of the database objects. Nowadays, ever-increasing data volumes require large-scale distributed retrieval architectures. Here, local and global indexing schemes are distinguished. In the local indexing approach, every resource administers a set of documents and indexes them locally. Resource descriptions providing the basis for resource selection can be disseminated to avoid all resources being contacted when answering a query. Global indexing schemes, on the other hand, are based on a single index which is distributed so that every resource is responsible for a certain part of the index. For local indexing, only a few exact approaches have been proposed which support general metric space indexing. In this paper, we introduce RS4MI, an exact resource selection approach for general metric space indexing. We compare RS4MI with approaches presented in the literature, based on a peer-to-peer scenario of searching for similar images by image content. RS4MI can outperform two exact general metric space resource selection schemes for range queries: it contacts fewer resources while, at the same time, using more space-efficient resource descriptions.
- Conference paper: Extending the MPSM join (Datenbanksysteme für Business, Technologie und Web (BTW) 2013, 2013)
  Albutiu, Martina-Cezara; Kemper, Alfons; Neumann, Thomas; Markl, Volker; Saake, Gunter; Sattler, Kai-Uwe; Hackenbroich, Gregor; Mitschang, Bernhard; Härder, Theo; Köppen, Veit
  Hardware vendors are improving their (database) servers in two main aspects: (1) increasing main memory capacities of several TB per server, mostly with non-uniform memory access (NUMA) among sockets, and (2) massively parallel multi-core processing. While there has been research on the parallelization of database operations, many algorithmic and control techniques in current database technology were devised for disk-based systems where I/O dominated the performance. Furthermore, NUMA has only recently caught the community's attention. In [AKN12], we analyzed the challenges that modern hardware poses to database algorithms on a 32-core machine with 1 TB of main memory (four NUMA partitions) and derived three rather simple rules for NUMA-affine scalable multi-core parallelization. Based on our findings, we developed MPSM, a suite of massively parallel sort-merge join algorithms, and showed its competitive performance on large main memory databases with billions of objects. In this paper, we go one step further and investigate the effectiveness of MPSM for non-inner join variants and complex query plans. We show that for non-inner join variants, MPSM incurs no extra overhead. Further, we point out ways of exploiting the roughly sorted output of MPSM in subsequent joins. In our evaluation, we compare these ideas to the basic execution of sequential MPSM joins and find that the original MPSM performs very well in complex query plans.
- Conference paper: ProQua: Ein Probabilistisches Datenbanksystem für die Auswertung von Ähnlichkeitsanfragen auf unsicheren Datengrundlagen [ProQua: a probabilistic database system for evaluating similarity queries on uncertain data] (Datenbanksysteme für Business, Technologie und Web (BTW) 2013, 2013)
  Lehrack, Sebastian; Saretz, Sascha; Winkel, Christian; Markl, Volker; Saake, Gunter; Sattler, Kai-Uwe; Hackenbroich, Gregor; Mitschang, Bernhard; Härder, Theo; Köppen, Veit
  ProQua is a novel probabilistic database system that aims to evaluate weighted, logic-based similarity conditions on an uncertain data basis. The main features of ProQua are presented using an example scenario from the field of archaeology.
- Conference paper: PeRA: individual privacy control in intelligent transportation systems (Datenbanksysteme für Business, Technologie und Web (BTW) 2013, 2013)
  Kost, Martin; Dzikowski, Raffael; Freytag, Johann-Christoph; Markl, Volker; Saake, Gunter; Sattler, Kai-Uwe; Hackenbroich, Gregor; Mitschang, Bernhard; Härder, Theo; Köppen, Veit
  In the domain of Intelligent Transportation Systems (ITS), manufacturers and service providers are starting to implement and deploy many (new) applications that run on a vehicle. These applications involve the user and external services. Therefore, we must incorporate mechanisms that allow individuals to control their privacy. Existing approaches only control the event of data access, using a central instance. In contrast, we implement individual privacy requirements for the complete data flow of distributed systems. The Privacy-enforcing Runtime Architecture (PeRA) provides a holistic privacy protection approach, which implements user-defined privacy policies. A data-centric protection chain ensures that ITS components process data according to the attached privacy policies. PeRA instances constitute a distributed privacy middleware, which evaluates privacy policies to mediate data access by applications. The PeRA architecture includes an integrity protection layer to create a distributed policy enforcement perimeter between ITS nodes, which prevents the circumvention of policies. We implemented the PeRA architecture as a proof-of-concept prototype.
- Edited book: Datenbanksysteme für Business, Technologie und Web (BTW) 2013 (2013)
  Markl, Volker; Saake, Gunter; Sattler, Kai-Uwe; Hackenbroich, Gregor; Mitschang, Bernhard; Härder, Theo; Köppen, Veit
- Conference paper: MR-DSJ: distance-based self-join for large-scale vector data analysis with mapreduce (Datenbanksysteme für Business, Technologie und Web (BTW) 2013, 2013)
  Seidl, Thomas; Fries, Sergej; Boden, Brigitte; Markl, Volker; Saake, Gunter; Sattler, Kai-Uwe; Hackenbroich, Gregor; Mitschang, Bernhard; Härder, Theo; Köppen, Veit
  Data analytics is faced with huge and rapidly increasing amounts of data, for which MapReduce provides a very convenient and effective distributed programming model. Various algorithms already support massive data analysis on computer clusters, but distance-based similarity self-joins in particular lack efficient solutions for large vector data sets, although they are fundamental to many data mining tasks, including clustering, near-duplicate detection, and outlier analysis. Our novel distance-based self-join algorithm for MapReduce, MR-DSJ, is based on grid partitioning and delivers correct, complete, and inherently duplicate-free results in a single iteration. Additionally, we propose several filter techniques which reduce the runtime and communication of the MR-DSJ algorithm. Analytical and experimental evaluations demonstrate its superiority over other join algorithms for MapReduce.
- Conference paper: A mutual pruning approach for RkNN join processing (Datenbanksysteme für Business, Technologie und Web (BTW) 2013, 2013)
  Emrich, Tobias; Kröger, Peer; Niedermayer, Johannes; Renz, Matthias; Züfle, Andreas; Markl, Volker; Saake, Gunter; Sattler, Kai-Uwe; Hackenbroich, Gregor; Mitschang, Bernhard; Härder, Theo; Köppen, Veit
  A reverse k-nearest neighbor (RkNN) query determines the objects from a database that have the query as one of their k-nearest neighbors. Processing such a query has received plenty of attention in research. However, the effect of running multiple RkNN queries at once (join) or within a short time interval (bulk/group query) has, to the best of our knowledge, not been addressed so far. In this paper, we analyze RkNN joins and discuss possible solutions to this problem. In our performance analysis, we provide evaluation results showing the I/O and CPU performance of the compared algorithms for a variety of different setups.
- Conference paper: EvenPers: event-based person exploration and correlation (Datenbanksysteme für Business, Technologie und Web (BTW) 2013, 2013)
  Kapp, Christian; Strötgen, Jannik; Gertz, Michael; Markl, Volker; Saake, Gunter; Sattler, Kai-Uwe; Hackenbroich, Gregor; Mitschang, Bernhard; Härder, Theo; Köppen, Veit
  Searching for people on the Internet is one of the most frequent search activities. In this paper, we present EvenPers, a system for the event-based exploration of persons and person similarities. We address challenges such as cross-document person name normalization and present a novel approach to calculate person similarities based on their event information. In our demonstration, we show several exploration scenarios illustrating the usefulness of EvenPers and its exciting functionality.
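
The RkNN definition quoted in the abstract of "A mutual pruning approach for RkNN join processing" (an object is a result if the query is among its k nearest neighbors) can be made concrete with a small sketch. This is only a brute-force illustration of the query semantics, not the mutual pruning approach of the paper; the function name `rknn`, the use of Euclidean distance on tuples, and the tie handling (ties count in favor of the query) are assumptions made for this example.

```python
from math import dist  # Euclidean distance, available since Python 3.8


def rknn(database, query, k):
    """Brute-force reverse k-nearest-neighbor query (illustrative only).

    Returns every object in `database` that has `query` among its
    k nearest neighbors, where the neighbor candidates of an object are
    all other database objects plus the query itself.
    """
    result = []
    for i, obj in enumerate(database):
        d_query = dist(obj, query)
        # Count the database objects strictly closer to obj than the query.
        # Assumption: on equal distance, the query wins the tie.
        closer = sum(
            1
            for j, other in enumerate(database)
            if j != i and dist(obj, other) < d_query
        )
        # The query is one of obj's k nearest neighbors if fewer than
        # k other objects are strictly closer.
        if closer < k:
            result.append(obj)
    return result


points = [(0, 0), (3, 0), (10, 0), (11, 0)]
print(rknn(points, (4, 0), 1))
print(rknn(points, (4, 0), 2))
```

With k = 1, only (3, 0) has the query point as its nearest neighbor; with k = 2, all four points qualify. The sketch compares every pair of objects, so its cost grows quadratically with the database size.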