Exploring multi-core architectures for efficient query processing in Big Data management systems

Frank W. R. da Silva, Victor T. de Almeida, Vanessa Braganholo

Big Data Management Systems usually manage each machine as one node in a parallel query processing pipeline. In multi-core architectures, they leave aside several processor cores that could contribute to speeding up query processing. In this context, this paper explores the use of all available processor cores, assessing query processing performance in several scenarios. In particular, we use the concept of worker nodes (which are allocated to cores without disk access) and data nodes (which are allocated to cores with disk access) on the same machine, using the MyriaX engine as a base platform that supports this concept. We evaluate several cluster configurations, varying the number of data and worker nodes, to process two types of queries (self-join and triangle) on a Twitter dataset. The results show that increasing the I/O parallelism in terms of data nodes is not always the most effective strategy. This reinforces the idea of using worker nodes in the query processing pipeline. In the best scenario, we achieved a speed-up of 2.92 by simply adding worker nodes to the available processing cores.
