Integrating Query-Feedback Based Statistics into Informix Dynamic Server

dc.contributor.author: Behm, Alexander
dc.contributor.author: Markl, Volker
dc.contributor.author: Haas, Peter
dc.contributor.author: Murthy, Keshava
dc.contributor.editor: Kemper, Alfons
dc.contributor.editor: Schöning, Harald
dc.contributor.editor: Rose, Thomas
dc.contributor.editor: Jarke, Matthias
dc.contributor.editor: Seidl, Thomas
dc.contributor.editor: Quix, Christoph
dc.contributor.editor: Brochhaus, Christoph
dc.date.accessioned: 2020-02-11T13:22:13Z
dc.date.available: 2020-02-11T13:22:13Z
dc.date.issued: 2007
dc.description.abstract: Statistics that accurately describe the distribution of data values in the columns of relational tables are essential for effective query optimization in a database management system. Manually maintaining such statistics in the face of changing data is difficult and can lead to suboptimal query performance and high administration costs. In this paper, we describe a method and prototype implementation for automatically maintaining high-quality single-column statistics, as used by the optimizer in IBM Informix Dynamic Server (IDS). Our method both refines and extends the ISOMER algorithm of Srivastava et al. for maintaining a multidimensional histogram based on query feedback (QF). Like ISOMER, our new method is based on the maximum entropy (ME) principle, and therefore incorporates information about the data distribution in a principled and consistent manner. However, because IDS only needs to maintain one-dimensional histograms, we can simplify the ISOMER algorithm in several ways, significantly improving performance. First, we replace the expensive STHoles data structure used by ISOMER with a simple binning scheme, using a sweep-line algorithm to determine bin boundaries. Next, we use an efficient method for incorporating new QF into the histogram; the idea is to aggregate, prior to the ME computation, those bins that do not overlap with the new feedback records. Finally, we introduce a fast pruning method to ensure that the number of bins in the frequency distribution stays below a specified upper bound. Besides refining ISOMER to deal efficiently with one-dimensional histograms, we extend previous work by combining the reactive QF approach with a proactive sampling approach. Sampling is triggered whenever (as determined from QF records) actual and estimated selectivities diverge to an unacceptably large degree.
Our combined proactive/reactive approach greatly improves the robustness of the estimation mechanism, ensuring very high-quality selectivity estimates for queries falling inside the range of available feedback while guaranteeing reasonably good estimates for queries outside of the range. Automatically updating statistics improves query execution through better selectivity estimates and reduces the total cost of ownership (TCO), since the database administrator need not update statistics manually for monitored columns. (en)
dc.identifier.isbn: 978-3-88579-197-3
dc.identifier.pissn: 1617-5468
dc.identifier.uri: https://dl.gi.de/handle/20.500.12116/31826
dc.language.iso: en
dc.publisher: Gesellschaft für Informatik e. V.
dc.relation.ispartof: Datenbanksysteme in Business, Technologie und Web (BTW 2007) – 12. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme" (DBIS)
dc.relation.ispartofseries: Lecture Notes in Informatics (LNI) - Proceedings, Volume P-103
dc.title: Integrating Query-Feedback Based Statistics into Informix Dynamic Server (en)
dc.type: Text/Conference Paper
gi.citation.endPage: 600
gi.citation.publisherPlace: Bonn
gi.citation.startPage: 582
gi.conference.date: 07.-09.03.2007
gi.conference.location: Aachen
gi.conference.sessiontitle: Regular Research Papers
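
Illustrative sketch (not part of the repository record): the abstract outlines three ideas, namely deriving bin boundaries from query-feedback ranges, fitting bin masses under the maximum entropy principle, and triggering sampling when estimated and actual selectivities diverge. The Python below is a minimal sketch of those ideas under stated assumptions, not the paper's implementation. All names (`bin_boundaries`, `max_entropy_histogram`, `needs_resampling`), the feedback tuple layout, and the `max_ratio` and `floor` thresholds are hypothetical. The ME fit uses iterative proportional fitting from a uniform start, a standard technique for such constraints; the solver, the bin aggregation step, and the pruning method used in the paper may differ and are not shown.

```python
from typing import List, Tuple

# Hypothetical feedback record: (low, high, observed fraction of rows in [low, high)).
Feedback = Tuple[float, float, float]


def bin_boundaries(domain: Tuple[float, float], feedback: List[Feedback]) -> List[float]:
    """Collect the endpoints of all feedback ranges and sort them.

    Walking the sorted endpoints left to right yields the atomic bins:
    no feedback range starts or ends strictly inside any of them.
    """
    lo, hi = domain
    points = {lo, hi}
    for low, high, _ in feedback:
        points.add(min(max(low, lo), hi))   # clamp to the column domain
        points.add(max(min(high, hi), lo))
    return sorted(points)


def max_entropy_histogram(
    domain: Tuple[float, float], feedback: List[Feedback], iterations: int = 200
) -> Tuple[List[float], List[float]]:
    """Return (edges, mass) where mass[i] is the row fraction in bin i."""
    edges = bin_boundaries(domain, feedback)
    bins = list(zip(edges, edges[1:]))
    total_width = edges[-1] - edges[0]
    # With no feedback, the maximum entropy answer is the uniform distribution.
    mass = [(b - a) / total_width for a, b in bins]
    for _ in range(iterations):
        for low, high, target in feedback:
            inside = [low <= a and b <= high for a, b in bins]
            current = sum(m for m, flag in zip(mass, inside) if flag)
            if not 0.0 < current < 1.0:
                continue  # nothing to rescale for this constraint
            scale_in = target / current
            scale_out = (1.0 - target) / (1.0 - current)
            # Rescale so this constraint holds; total mass stays 1.
            mass = [m * (scale_in if flag else scale_out) for m, flag in zip(mass, inside)]
    return edges, mass


def needs_resampling(estimated: float, actual: float,
                     max_ratio: float = 2.0, floor: float = 1e-4) -> bool:
    """Flag a column for sampling when the two selectivities diverge too far.

    max_ratio and floor are assumed values chosen for illustration only.
    """
    e, a = max(estimated, floor), max(actual, floor)
    return max(e / a, a / e) > max_ratio
```

For example, with the domain `(0, 100)` and a single feedback record stating that 80% of rows fall in `[0, 50)`, the sketch produces two bins split at 50 that hold roughly 0.8 and 0.2 of the rows.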

Files

Original bundle
Name: 582.pdf
Size: 463.29 KB
Format: Adobe Portable Document Format