Arlindo Veiga, Sara Candeias, Fernando Perdigão.
This paper addresses the problem of grapheme-to-phoneme conversion in order to create a pronunciation dictionary from a vocabulary of the most frequent words in European Portuguese. It describes a system that takes a mixed approach, founded on a stochastic model with embedded rules for stressed vowel assignment. The model can generate pronunciations for unrestricted words; however, a dictionary of the 40k most frequent words was also constructed and corrected interactively. The vocabulary was defined using the CETEMPúblico corpus. The model and dictionary are publicly available.
http://www.lbd.dcc.ufmg.br/colecoes/stil/2011/0016.pdf