Misspelling Noise Treatment in Document Indexing Based on Knowledge and Paraconsistent Logic

Fabiano M. Hasegawa, Bráulio C. Ávila, Emerson L. dos Santos, Celso A. A. Kaestner

Although several Information Retrieval tasks are carried out manually on the basis of knowledge, automatic techniques ignore this advantage. Classic approaches to misspelling noise treatment consider only mathematical aspects. This paper presents a new knowledge-oriented indexing method. Noise is treated as early as the index generation stage, based on knowledge about term authentication that combines experts' opinions with behaviours acquired automatically from characteristics observed in the collection. Because the different opinions may be contradictory, the representation formalism of Paraconsistent Logic is used. The authentic terms to which noisy terms refer are recognized using a smooth lexical unification. As a result, the search engine, without any changes, becomes able to retrieve documents in which the query terms appear misspelled.

