Using domain ontologies to help track data provenance

Renato Fileto, Cláudia Bauzer Medeiros, Ling Liu, Calton Pu, Eduardo Delgado Assad

Traditional techniques for tracking data provenance have difficulty adapting to the dynamics of the Web. This paper proposes a scheme for provenance estimation, based on domain ontologies. This scheme is part of the POESIA approach for multi-step integration of semi-structured data. The ontologies used for tracking provenance also help to describe, discover, reuse and integrate data and services. In contrast to traditional techniques, this scheme derives data provenance with fewer annotations at the extensional level and thus lower maintenance costs. Additionally, it promotes the use of ontologies to categorize and correlate scopes of data sets, thereby capturing the operational semantics of data integration processes.

