Parsing Expression Grammars for Structured Data

Fabio Mascarenhas, Sergio Medeiros, Roberto Ierusalimschy

Parsing Expression Grammars (PEGs) are a formalism for language recognition that has renewed academic interest in top-down parsing approaches. LPEG is an implementation of the PEG formalism that compiles PEGs to instructions of a virtual parsing machine while preserving PEG semantics. The LPEG parsing machine has a formal model, and the transformation of PEGs to this model has been proven correct. In this paper, we extend both the PEG formalism and LPEG's parsing machine so that they can match structured data instead of just strings of symbols. Our extensions are conservative, and we prove the correctness of the translation from extended PEGs to programs of the extended parsing machine. We also present benchmarks showing that the performance of the extended parsing machine for structured data is competitive with that of equivalent hand-written parsers.
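
To make the idea of matching structured data concrete, the following is a minimal sketch in Python of PEG-style combinators that operate on nested lists rather than on character strings. It is not the paper's formal definition and not LPEG's API: the names `atom`, `seq`, `choice`, `sub`, `is_int`, `expr`, and `matches` are hypothetical, and the rule that a nested list must be consumed entirely by its inner pattern is an assumption made for this illustration.

```python
from typing import Any, Callable, Optional, Sequence

# A pattern takes (subject, position) and returns the position after the
# match, or None on failure. The subject is any sequence: a string or a list.
Pattern = Callable[[Sequence[Any], int], Optional[int]]

def atom(value: Any) -> Pattern:
    """Match a single element equal to `value`."""
    def p(s, i):
        return i + 1 if i < len(s) and s[i] == value else None
    return p

def seq(*ps: Pattern) -> Pattern:
    """Match each pattern in order; fail if any of them fails."""
    def p(s, i):
        for q in ps:
            i = q(s, i)
            if i is None:
                return None
        return i
    return p

def choice(*ps: Pattern) -> Pattern:
    """PEG ordered choice: commit to the first alternative that succeeds."""
    def p(s, i):
        for q in ps:
            j = q(s, i)
            if j is not None:
                return j
        return None
    return p

def sub(q: Pattern) -> Pattern:
    """Match one element that is itself a list whose contents match `q`.

    Assumption of this sketch: `q` must consume the whole nested list.
    """
    def p(s, i):
        if i < len(s) and isinstance(s[i], list):
            inner = s[i]
            if q(inner, 0) == len(inner):
                return i + 1
        return None
    return p

def is_int(s, i):
    """Match a single integer element."""
    return i + 1 if i < len(s) and isinstance(s[i], int) else None

def expr(s, i):
    """expr <- int / ["add" expr expr]   (hypothetical example grammar)"""
    return choice(is_int, sub(seq(atom("add"), expr, expr)))(s, i)

def matches(p: Pattern, data: Sequence[Any]) -> bool:
    """True if `p` consumes all of `data`."""
    return p(data, 0) == len(data)

print(matches(expr, [["add", 1, ["add", 2, 3]]]))
print(matches(expr, [["add", 1]]))
```

The same `atom`, `seq`, and `choice` combinators also work on ordinary strings, since a string is just a flat sequence of symbols; only `sub` is specific to nested data. This is one way to read the claim that such an extension is conservative.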

