Definite Descriptions in an Information Extraction System

Manuel Palomar, Rafael Muñoz

This paper presents an algorithm based on heuristic rules for resolving Spanish definite description references. The algorithm is applied to an information extraction system for the Spanish language. The heuristic rules are derived from the study of an unrestricted corpus. The algorithm resolves identity co-reference produced by definite descriptions whose relation to their antecedents can be resolved with syntactic or semantic information. The module achieves a precision of 95.3% in the classification task (anaphoric or non-anaphoric) and an average precision of 78% in

