LIHLA: A lexical aligner based on language-independent heuristics

Helena de Medeiros Caseli, Maria das Graças Volpe Nunes, Mikel L. Forcada

Alignment of words and multiword units plays an important role in many natural language processing applications, such as example-based machine translation, transfer rule learning for machine translation, bilingual lexicography and word sense disambiguation. In this paper we describe LIHLA, a lexical aligner that combines bilingual probabilistic lexicons, generated by a freely available set of tools (NATools), with language-independent heuristics to find links between single words and multiword units in Brazilian Portuguese, Spanish and English parallel texts. The method achieved a precision of 92.48% and a recall of 88.32% on Brazilian Portuguese–Spanish parallel texts, and a precision of 84.35% and a recall of 76.39% on Brazilian Portuguese–English parallel texts.
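
As a rough illustration of how precision and recall figures of this kind are commonly computed for alignment, the sketch below compares a set of proposed links with a reference set. The function name, the representation of links as (source position, target position) pairs, and the toy data are assumptions made for illustration; they are not taken from LIHLA or its evaluation setup, which may count links differently (for example, for multiword units).

```python
def precision_recall(proposed, reference):
    """Return (precision, recall) for two collections of alignment links.

    Each link is a hashable item, here a (source_position, target_position)
    pair. This representation is an assumption, not LIHLA's own format.
    """
    proposed, reference = set(proposed), set(reference)
    correct = len(proposed & reference)  # links found in both sets
    precision = correct / len(proposed) if proposed else 0.0
    recall = correct / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical toy data: a reference alignment and an aligner's output.
reference = {(0, 0), (1, 1), (2, 3), (3, 2)}
proposed = {(0, 0), (1, 1), (2, 2)}

p, r = precision_recall(proposed, reference)
print(f"precision={p:.2%} recall={r:.2%}")
```

Precision is the share of proposed links that also appear in the reference, and recall is the share of reference links that the aligner recovered, so an aligner that proposes few but safe links tends to score higher on precision than on recall.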
