DBM-Tree: Trading Height-Balancing for Performance in Metric Access Methods

Marcos R. VieiraCaetano Traina Jr.Fabio J. T. ChinoAgma J. M. Traina

Metric Access Methods (MAM) are employed to acceleratethe processing of similarity queries, such as therange and the k-nearest neighbor queries. Current methods,such as the Slim-tree and the M-tree, improve thequery performance minimizing the number of disk accesses,keeping a constant height of the structures storedon disks (height-balanced trees). However, the overlappingbetween their nodes has a very high influence ontheir performance. This paper presents a new dynamicMAM called the DBM-tree (Density-Based Metric tree),which can minimize the overlap between high-densitynodes by relaxing the height-balancing of the structure.Thus, the height of the tree is larger in denser regions, inorder to keep a tradeoff between breadth-searching anddepth-searching. An underpinning for cost estimation ontree structures is their height, so we show a non-heightdependable cost model that can be applied for DBM-tree.Moreover, an optimization algorithm called Shrink is alsopresented, which improves the performance of an alreadybuilt DBM-tree by reorganizing the elements among theirnodes. Experiments performed over both synthetic andreal world datasets showed that the DBM-tree is, in average,50% faster than traditional MAM and reduces thenumber of distance calculations by up to 72% and diskaccesses by up to 66%. After performing the Shrink algorithm,the performance improves up to 40% regarding thenumber of disk accesses for range and k-nearest neighborqueries. In addition, the DBM-tree scales up well,exhibiting linear performance with growing number of elementsin the database.

Caso o link acima esteja inválido, faça uma busca pelo texto completo na Web: Buscar na Web

Biblioteca Digital Brasileira de Computação - Contato:
     Mantida por: