Finding Related Sentences in Multiple Documents for Multidocument Discourse Parsing of Brazilian Portuguese Texts

Priscila Aleixo, Thiago Alexandre Salgueiro Pardo

Based on Cross-document Structure Theory (CST), we investigate the problem of finding related sentences in multiple documents on the same topic. We test lexical similarity measures from the related literature and improve them with language-specific resources. We conclude that the best-performing measure for Portuguese differs from the best one for English, and that the knowledge resources we use affect the results in different ways.
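
As an illustration of what a lexical similarity measure between sentences can look like, the following is a minimal sketch, not the measures or resources evaluated in this work. It assumes two generic measures (Jaccard word overlap and cosine similarity over word counts), a simple regular-expression tokenizer, and a hypothetical threshold of 0.3; the names `tokenize`, `jaccard`, `cosine`, and `related_pairs` are introduced here for illustration only.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    # Lowercase and keep runs of word characters; in Python 3 this is
    # Unicode-aware, so accented Portuguese letters stay inside tokens.
    return re.findall(r"\w+", sentence.lower())


def jaccard(s1, s2):
    # Shared word types divided by all word types in either sentence.
    a, b = set(tokenize(s1)), set(tokenize(s2))
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def cosine(s1, s2):
    # Cosine of the angle between the raw word-count vectors.
    a, b = Counter(tokenize(s1)), Counter(tokenize(s2))
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def related_pairs(doc1, doc2, measure=jaccard, threshold=0.3):
    # Compare every sentence of doc1 with every sentence of doc2 and keep
    # the pairs whose score reaches the (assumed) threshold.
    pairs = []
    for i, s1 in enumerate(doc1):
        for j, s2 in enumerate(doc2):
            score = measure(s1, s2)
            if score >= threshold:
                pairs.append((i, j, score))
    return pairs


if __name__ == "__main__":
    doc_a = ["O gato dorme"]
    doc_b = ["O gato come", "Chove hoje"]
    print(related_pairs(doc_a, doc_b))
```

A language-specific resource (for example, a stemmer or a synonym list) would plug in at the `tokenize` step, by normalizing tokens before they are compared; which resource helps, and by how much, is an empirical question.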

