Using quality criteria for selective data materialization

Maria da Conceição Moraes Batista, Ana Carolina Salgado

Data integration systems are designed to offer uniform access to data from heterogeneous and distributed sources. Two classical data integration architectures have been proposed in the literature. In the materialized approach, data are stored, integrated and accessed directly from a data warehouse. In the virtual approach, the queries posed to the integration system are decomposed into queries addressed directly to the sources. This paper presents a data integration environment with a hybrid architecture. It is based on the virtual approach, with additional resources for selectively materializing data in a data warehouse. Another distinguishing feature of the environment is the use of a cache system to answer the most frequently asked queries. The materialization process involves analyzing a set of quality and cost criteria associated with the data in order to determine whether materialization will improve query response time while keeping the maintenance cost of the data warehouse low. The data selected for materialization are the more static and the less available data obtained from the sources. Into an existing virtual architecture for data integration we have inserted software modules for managing the data warehouse and the cache, and for processing queries in three ways: accessing the data sources (virtual), accessing the data warehouse (materialized) and, fastest of the three, accessing the cache contents. All these resources are put together with the goal of optimizing the overall query response time.
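
The two mechanisms summarized above, criteria-based selection of data for materialization and three-way query processing, can be illustrated with a minimal Python sketch. It is not the environment's actual implementation: the names `DataItem`, `should_materialize` and `answer_query`, the 0-to-1 scales for volatility and availability, the equal weights, the 0.6 threshold and the lookup order (cache, then warehouse, then sources) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class DataItem:
    """Hypothetical description of a piece of source data."""
    name: str
    volatility: float    # assumed scale: 0.0 = static, 1.0 = changes constantly
    availability: float  # assumed scale: fraction of time the source is reachable


def should_materialize(item: DataItem,
                       w_static: float = 0.5,
                       w_unavailable: float = 0.5,
                       threshold: float = 0.6) -> bool:
    """Favor data that are more static and less available.

    The weighted score and the threshold are illustrative assumptions,
    not the criteria defined in the paper.
    """
    score = (w_static * (1.0 - item.volatility)
             + w_unavailable * (1.0 - item.availability))
    return score >= threshold


def answer_query(query: str,
                 cache: Dict[str, str],
                 warehouse: Dict[str, str],
                 query_sources: Callable[[str], str]) -> Tuple[str, str]:
    """Answer a query from the cache, the warehouse, or the sources.

    Returns the result together with the path that produced it.
    """
    if query in cache:          # fastest path: frequently asked queries
        return cache[query], "cache"
    if query in warehouse:      # materialized path
        return warehouse[query], "warehouse"
    # virtual path: delegate to the (assumed) source-access function
    return query_sources(query), "sources"
```

For example, under these assumptions an item with volatility 0.1 and availability 0.4 would be selected for materialization, while one with volatility 0.9 and availability 0.95 would be left at the sources and queried virtually.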

