Text document
A learning optimizer for a federated database management system
Date
2005
Abstract
Optimizers in modern DBMSs use a cost model to choose an efficient query execution plan (QEP) from all possible plans for a given query. The accuracy of the cost estimates depends heavily on accurate statistics about the underlying data. Outdated statistics or wrong assumptions in the underlying statistical model frequently lead to suboptimal QEP selection and thus to poor query performance. Federated systems must keep additional statistics on remote data at the federated DBMS in order to choose the most efficient execution plan when joining data from different data sources. Wrong statistics in a federated DBMS can cause not only suboptimal data access strategies but also unbalanced workload distribution, unnecessarily high network traffic, and communication overhead. Maintaining statistics in a federated DBMS is difficult because the remote DBMSs are independent: they might not expose their statistics, might use different statistical models, or might not collect all the statistics the federated DBMS needs. We present an approach that extends DB2's learning optimizer to automatically find flaws in statistics on remote data by extending its query feedback loop to the federated architecture. We discuss several approaches to obtaining feedback from remote queries and present our solution, which uses both local and remote query feedback and can also trigger and drive iterative sampling of remote data sources to retrieve the information needed to compute statistics profiles. We provide a detailed performance study and analysis of our approach and demonstrate in a case study a potential query execution speedup of orders of magnitude at only a moderate overhead during query execution.
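The feedback loop described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's or DB2's implementation: the `Feedback` record, the `error_factor` measure, the `tables_to_resample` helper, the threshold of 2.0, and the sample numbers are all assumptions made for illustration. The idea shown is only the general one of comparing the optimizer's estimated cardinalities with the cardinalities observed at run time and flagging remote tables whose statistics look wrong enough to justify resampling.

```python
from dataclasses import dataclass


@dataclass
class Feedback:
    """One observation from an executed plan operator (hypothetical record)."""
    table: str             # remote table the predicate was applied to
    predicate: str         # predicate text, kept only for readability
    estimated_rows: float  # cardinality the optimizer estimated
    actual_rows: int       # cardinality observed at run time


def error_factor(fb: Feedback) -> float:
    """Symmetric estimation error: 1.0 is exact, 10.0 is off by 10x either way."""
    est = max(fb.estimated_rows, 1.0)   # clamp to avoid division by zero
    act = max(float(fb.actual_rows), 1.0)
    return max(est / act, act / est)


def tables_to_resample(feedback, threshold=2.0):
    """Return (table, worst error factor) pairs above the threshold, worst first."""
    worst = {}
    for fb in feedback:
        worst[fb.table] = max(worst.get(fb.table, 1.0), error_factor(fb))
    flagged = [(t, e) for t, e in worst.items() if e > threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Made-up observations: one badly underestimated predicate on "orders".
observations = [
    Feedback("orders", "status = 'OPEN'", 1000, 120000),
    Feedback("orders", "region = 'EU'", 5000, 4800),
    Feedback("customers", "country = 'DE'", 2000, 2100),
]
print(tables_to_resample(observations))  # [('orders', 120.0)]
```

In this sketch only `orders` is flagged, because one of its estimates is off by a factor of 120, while the other observations are within about 5% of the actual values. A real system would additionally have to decide how to obtain the actual cardinalities from a remote source, which is the question the abstract says the paper addresses.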