Browsing by author "Sauer, Caetano"
1 - 5 of 5
- Journal article: Compilation of Query Languages into MapReduce (Datenbank-Spektrum: Vol. 13, No. 1, 2013)
  Sauer, Caetano; Härder, Theo
  The introduction of MapReduce as a tool for Big Data analytics, combined with the new requirements of emerging application scenarios such as Web 2.0 and scientific computing, has motivated the development of data processing languages that are more flexible and more widely applicable than SQL. In the Big Data context, we discuss the points on which SQL is considered too restrictive. Furthermore, we provide a qualitative evaluation of how recent query languages overcome these restrictions. Having established the desired characteristics of a query language, we provide an abstract description of its compilation into the MapReduce programming model, which, up to minor variations, is essentially the same in all approaches. Given the requirements of query processing, we introduce simple generalizations of the model, which allow the reuse of well-established query evaluation techniques, and discuss strategies for generating optimized MapReduce plans.
- Journal article: Instant recovery with write-ahead logging (Datenbank-Spektrum: Vol. 15, No. 3, 2015)
  Härder, Theo; Sauer, Caetano; Graefe, Goetz; Guy, Wey
  Instant recovery improves system availability by reducing the mean time to repair, i.e., the interval during which a database is not available for queries and updates due to recovery activities. Variants of instant recovery pertain to system failures, media failures, node failures, and combinations of multiple failures. After a system failure, instant restart permits new transactions immediately after log analysis, before and concurrently with “redo” and “undo” recovery actions. After a media failure, instant restore permits new transactions immediately after allocation of a replacement device, before and concurrently with restoring backups and replaying the recovery log.
  Write-ahead logging is already ubiquitous in data management software. The recent definition of single-page failures and techniques for log-based single-page recovery enable immediate, lossless repair after a localized wear-out in novel or traditional storage hardware. In addition, they form the backbone of on-demand “redo” in instant restart, instant restore, and eventually instant failover. Thus, they complement on-demand invocation of traditional single-transaction “undo” or rollback.
  In addition to these instant recovery techniques, the discussion introduces self-repairing indexes and much faster offline restore operations, which impose no slowdown on backup operations and hardly any slowdown on log archiving operations. The new restore techniques also render differential and incremental backups obsolete, complete backup commands on a database server practically instantly, and even permit taking full up-to-date backups without imposing any load on the database server.
- Text document: Modern techniques for transaction-oriented database recovery (BTW 2019, 2019)
  Sauer, Caetano
  Transaction-oriented database recovery has been a “solved problem” for at least 25 years, since the introduction of the ARIES methods for logging and recovery. However, recent technological developments have created the need for new software architectures that can better exploit the efficiency of modern hardware. In the context of recovery, new algorithms are required to effectively accommodate the exponential decrease in main-memory cost, the advent of flash memory, the rapid expansion into many-core CPUs, the ever-increasing capacity of magnetic disks, and, in the long term, the potential adoption of non-volatile memory. In our research, we evaluated a variety of new software techniques for efficient transaction-oriented database recovery, focusing on availability and architectural simplicity. The techniques presented here differ from most recent work in the field in that they aim to be hardware-agnostic, supporting different memory and storage configurations with the same software, as well as fully functional in comparison with traditional database systems, e.g., by supporting media recovery, index management, larger-than-memory datasets, and arbitrary access structures with structural modifications.
- Conference paper: Single-pass restore after a media failure (Datenbanksysteme für Business, Technologie und Web (BTW 2015), 2015)
  Sauer, Caetano; Graefe, Goetz; Härder, Theo
  When persistent storage fails, traditional media recovery first restores an old backup image and then replays the recovery log written since the last backup operation. Restoring a backup can take hours, but log replay often takes much longer due to its random access pattern. We introduce single-pass restore, a technique in which restoration of all backups and log replay are performed in a single operation. This allows hiding log replay within the initial restore of the backup, thus substantially reducing the time and cost of media recovery and, incidentally, rendering incremental backup techniques unnecessary. Single-pass restore is enabled by a new organization of the log archive, created by a continuous process that is easily incorporated into the traditional log archiving process. Our empirical analysis shows that the imposed overhead is negligible in comparison with the substantial benefits provided.
- Journal article: Unleashing XQuery for Data-Independent Programming (Datenbank-Spektrum: Vol. 14, No. 2, 2014)
  Bächle, Sebastian; Sauer, Caetano
  The XQuery language was initially developed as an SQL equivalent for XML data, but its roots in functional programming also make it a perfect choice for processing almost any kind of structured and semi-structured data. Apart from standard XML processing, however, advanced language features make it hard to efficiently implement the complete language for large data volumes. This work proposes a novel compilation strategy that provides both flexibility and efficiency to unleash XQuery’s potential as a data programming language. It combines the simplicity and versatility of a storage-independent data abstraction with the scalability advantages of set-oriented processing. Expensive iterative sections in a query are unrolled into a pipeline of relational-style operators, which is open to optimized join processing, index use, and parallelization. The remaining aspects of the language are processed in a standard fashion, yet can be compiled at any time to more efficient native operations of the actual runtime environment. This hybrid compilation mechanism yields an efficient and highly flexible query engine that is able to drive any computation from simple XML transformation to complex data analysis, even on non-XML data. Experiments with our prototype and state-of-the-art competitors in classic XML query processing and business analytics over relational data attest to the generality and efficiency of the design.