Extracting and Searching Useful Information Available on Web FAQs

Edson Oliveira, Altigran S. da Silva, Edleno Silva de Moura, João M. B. Cavalcanti

This paper presents new methods for structuring and searching information stored in Web FAQs. These methods are based on the assumption that such pages are implicitly organized as a set of question-answer pairs (QAPs). The ultimate goal is to improve the retrieval of answers available in FAQs for queries that can be answered using the information they contain. More specifically, we propose modifications to three of the main tasks performed by search engines: crawling, indexing, and query processing. To evaluate the proposed methods, we used a large collection of documents from a real search engine, and we present the results of this evaluation for each of the three tasks.

