Machine Learning Algorithms for Portuguese Named Entity Recognition

Ruy Luiz MilidiúJulio Cesar DuarteRoberto Cavalcante

Named Entity Recognition (NER) is an important task in Natural Language Processing. It provides key features that help on more elaborated document management and information extraction tasks. In thispaper, we propose seven machine learning approaches that use HMM, TBL and SVM to solve Portuguese NER. The performance of each modeling approach is empirically evaluated. The SVM-based extractor shows a 88.11% F-score, which is our best observed value, slightly better than TBL. This is very competitive when compared to state-of-the-art extractors for similar Portuguese NER problems. Our HMM has reasonable precision and accuracy and does not require any additional expert knowledge. Thisis an advantage for our HMM over the other approaches. The experimental results suggest that Machine Learning can be useful in Portuguese NER. They also indicate that HMM, TBL and SVM perform well in this natural language processing task.

Caso o link acima esteja inválido, faça uma busca pelo texto completo na Web: Buscar na Web

Biblioteca Digital Brasileira de Computação - Contato:
     Mantida por: