TSeg - A Text Segmenter for Corpus Annotation

Felipe Rodrigues, Richard Semolini, Norton Trevisan Roman, Ana Maria Monteiro

This paper describes TSeg, a Java application that allows for both manual and automatic segmentation of a source text into basic units of annotation. TSeg provides a straightforward way to approach this task through a clear point-and-click interface. Once the text segmentation is finished, the application outputs an XML file that may be used as input to more problem-specific annotation software. Hence, TSeg separates the identification of basic units of annotation from the task of annotating these units, making it possible for both problems to be analysed in isolation, thereby reducing the cognitive load on the user and preventing potential damage to the overall outcome of the annotation process.
