Wilson dos S. Batista Junior, Lucia H. M. Rino

A Top-Down Approach for Generating Correspondences between Semantic XML Schemas

Successfully retrieving a web document is a twofold problem: formulating a query that properly filters relevant documents out of huge collections, and presenting to the user those documents that will indeed fulfill his or her needs. In this paper, we focus on the first issue: the problem of a misleading user query. We aim to refine a query by using extracts instead of full documents. Extracts of the documents in a hitlist are built by GistSumm, an extractive automatic summarizer based on the gist of a document. Both single-document and multi-document automatic summarization are explored. Results on pseudo-relevance feedback for the Portuguese CHAVE collection show that gist-based extracts may improve information retrieval (IR).
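The idea of refining a query from extracts rather than full documents can be illustrated with a minimal sketch. It is not GistSumm and not the procedure evaluated in the paper: the summarizer is a simple word-frequency stand-in, and the function names, the stopword list, and the parameters `top_docs` and `new_terms` are all assumptions made for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; a real system would use a full one).
STOPWORDS = {"the", "a", "an", "of", "in", "and", "is", "to", "for", "on", "are", "by"}


def tokenize(text):
    """Return lowercase word tokens with stopwords removed."""
    return [w for w in re.findall(r"\w+", text.lower()) if w not in STOPWORDS]


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract(document, max_sentences=1):
    """Stand-in extractive summarizer (NOT GistSumm).

    Scores each sentence by the summed document frequency of its words and
    keeps the highest-scoring sentences in their original order.
    """
    sentences = split_sentences(document)
    freq = Counter(tokenize(document))
    ranked = sorted(
        sentences,
        key=lambda s: sum(freq[w] for w in tokenize(s)),
        reverse=True,
    )
    chosen = set(ranked[:max_sentences])
    return " ".join(s for s in sentences if s in chosen)


def expand_query(query, hitlist, top_docs=2, new_terms=2):
    """Pseudo-relevance feedback: add frequent terms from extracts of the top hits."""
    query_terms = tokenize(query)
    counts = Counter()
    for doc in hitlist[:top_docs]:
        # Only the extract, not the full document, contributes candidate terms.
        counts.update(w for w in tokenize(extract(doc)) if w not in query_terms)
    # Sort by descending count, then alphabetically, so ties resolve deterministically.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return query_terms + [term for term, _ in ranked[:new_terms]]


if __name__ == "__main__":
    hitlist = [
        "Jaguar cars are fast. Jaguar cars use a powerful engine. It rained.",
        "The engine of a jaguar car is loud. Drivers like it.",
    ]
    print(expand_query("jaguar", hitlist))
```

Because only the extract of each top-ranked document feeds the term counts, off-topic sentences such as "It rained." never reach the expanded query. A real system would also need stemming and a principled term-weighting scheme, both of which this sketch leaves out.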

