A Constraint Grammar Parser for Spanish

Eckhard Bick

In this paper we describe and evaluate a Constraint Grammar parser for Spanish, HISPAL. The parser adopts the modular architecture of the Portuguese PALAVRAS parser, and in a novel porting approach, the linguist-written Portuguese CG rules for morphological and syntactic disambiguation were "corrected" and appended for Spanish in a corpus-based fashion, rather than rewritten from scratch. As part of the 5-year project, a 74,000-lexeme lexicon was developed, as well as a morphological analyzer and a semantic ontology for Spanish. An evaluation of the system's tagger/parser modules indicated F-scores of 99% for part-of-speech tagging and 96% for syntactic function assignment. HISPAL has been used for the grammatical annotation of 52 million words of text, including the Europarl and Wikipedia text collections.

