Priority queues for database query processing
Vorschaubild nicht verfügbar
ISSN der Zeitschrift
Gesellschaft für Informatik e.V.
Interesting orderings let sort-based query processing out-perform hash-based algorithms, but only tree-of-losers priority queues and offset-value coding permit competing in all cases including large unsorted inputs with large or complex keys. As long as this competition persists, alternative algorithms with equivalent functionality will plague query execution, e.g., in software maintenance and in query plan scheduling; and mistaken algorithm choices will plague query optimization, e.g., for joins, intersection, and grouping.After explaining tree-of-losers priority queues and offset-value coding, our work introduces necessary extensions for efficient run generation (in external merge sort) with variable-size records. The required changes in tree-of-losers priority queues support increasing and decreasing any key value at any time in logarithmic time, including incremental maintenance of offset-value codes, and with the expected time for key value increases independent of the size of the priority queue. As priority queues are widely used in all kinds of scheduling applications, our contributions go beyond database query processing. Double-ended priority queues are discussed in detail as they nicely illustrate the concepts.To the best of our knowledge, this is the first time that tree-of-losers priority queues have been extended to addressable priority queues and to non-monotonic sequences of input keys; and that offset-value coding has been extended to non-monotonic sequences of input keys. The proposed solutions and the included code snippets are simple, small, and fast, in contrast to the time and effort spent on bringing them to this state.