A Spline-based Cost Model for Metric Trees

Marcos V. N. Bedo, Agma J. M. Traina, Caetano Traina Jr.

Whenever two (or more) access methods are alternatives for the execution of a query, how does one choose the best one for the task? Such a decision is made by the DBMS optimizer module, which models query costs according to the distribution of the data space. Cost modeling of similarity searches, however, requires a representation of the distance distribution rather than the data distribution. In this paper, we propose the Stockpile model for cost estimation of similarity queries on metric trees, using pivot-based distance histograms that represent the local densities around the query elements. By combining the local densities with the probability of traversing the tree nodes, Stockpile provides a fair estimate of both disk accesses (I/O costs) and distance calculations (CPU costs). We compared Stockpile with two models from the literature on similarity queries over real-world data sources, and our model was up to 85% more precise than the competitors.
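
To make the idea of a pivot-based distance histogram concrete, the sketch below builds an equi-width histogram of distances from one pivot and uses it to estimate node traversal probabilities for a range query. It is not the Stockpile model itself, whose details are not given in this abstract; it follows the generic approach of summing, over all tree nodes, the estimated probability that the query ball intersects the node. All names (`pivot_histogram`, `cdf`, `estimate_costs`) and the node representation as `(covering_radius, n_entries)` pairs are assumptions made for illustration.

```python
import math

def pivot_histogram(pivot, data, dist, n_bins):
    """Equi-width histogram of the distances from `pivot` to every element."""
    dists = [dist(pivot, x) for x in data]
    width = max(dists) / n_bins or 1.0  # fall back to 1.0 if all distances are 0
    counts = [0] * n_bins
    for d in dists:
        # The maximum distance falls into the last bucket.
        counts[min(int(d / width), n_bins - 1)] += 1
    return width, counts

def cdf(hist, radius):
    """Estimated fraction of elements within `radius` of the pivot.

    Assumes distances are spread uniformly inside each bucket.
    """
    width, counts = hist
    full, frac = divmod(radius / width, 1.0)
    full = int(full)
    if full >= len(counts):
        return 1.0
    covered = sum(counts[:full]) + frac * counts[full]
    return covered / sum(counts)

def estimate_costs(hist, nodes, query_radius):
    """Estimate (disk accesses, distance calculations) for a range query.

    `nodes` is a hypothetical list of (covering_radius, n_entries) pairs.
    A node is assumed to be visited with probability
    cdf(covering_radius + query_radius); visiting it costs one disk access
    and one distance calculation per entry.
    """
    io = cpu = 0.0
    for covering_radius, n_entries in nodes:
        p = cdf(hist, covering_radius + query_radius)
        io += p
        cpu += p * n_entries
    return io, cpu

# Usage: ten points on a line, pivot at the origin, three buckets.
data = [(float(i),) for i in range(10)]
hist = pivot_histogram((0.0,), data, math.dist, n_bins=3)
io_cost, cpu_cost = estimate_costs(hist, [(1.5, 4), (100.0, 2)], query_radius=3.0)
```

The histogram here stands in for the local density around a query element placed at the pivot; a model with several pivots would pick or blend the histograms of the pivots closest to the query.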
