A Multi-view Approach for Semi-Supervised Scientific Paper Classification

Víctor A. Laguna, Alneu de Andrade Lopes

In this paper we show that combining information from a citation-based network with the traditional bag-of-words representation in a semi-supervised framework such as co-training significantly improves the accuracy of scientific paper classification. Our experiments show that the examples labeled by the classifier based on the citation network representation, which are then added to the training set, are mostly labeled correctly. This improves the overall accuracy of the co-training classifiers, even when the citation-based classifier on its own is less accurate than the bag-of-words classifier. The results suggest that citation network information significantly improves classifier performance, particularly when labeled instances are scarce.
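
The co-training idea summarized in the abstract can be illustrated with a minimal sketch. The abstract does not specify the base classifiers, the features, or the rule for selecting newly labeled examples, so everything below is an assumption made for illustration: a nearest-centroid classifier stands in for both view-specific classifiers, confidence is taken as the distance margin between the two closest centroids, and the names `centroid_fit`, `centroid_predict`, `co_train`, and the toy data are hypothetical.

```python
import math
from collections import defaultdict

def centroid_fit(X, y):
    """Fit a nearest-centroid model: one mean vector per class (stand-in classifier)."""
    sums, counts = {}, defaultdict(int)
    for x, label in zip(X, y):
        if label not in sums:
            sums[label] = [0.0] * len(x)
        sums[label] = [s + v for s, v in zip(sums[label], x)]
        counts[label] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}

def centroid_predict(model, x):
    """Return (label, margin); margin is the gap between the two nearest centroids."""
    dists = sorted((math.dist(x, mu), c) for c, mu in model.items())
    best_dist, best_label = dists[0]
    margin = dists[1][0] - best_dist if len(dists) > 1 else 0.0
    return best_label, margin

def co_train(view_a, view_b, seed_labels, rounds=5, per_round=1):
    """Co-training sketch: in each round, a classifier is trained on each view,
    and its most confident predictions on unlabeled items join the shared
    labeled set used by both views in the next round."""
    labels = dict(seed_labels)
    n = len(view_a)
    for _ in range(rounds):
        unlabeled = [i for i in range(n) if i not in labels]
        if not unlabeled:
            break
        new_labels = {}
        for view in (view_a, view_b):
            idx = list(labels)
            model = centroid_fit([view[i] for i in idx], [labels[i] for i in idx])
            scored = [(centroid_predict(model, view[i]), i) for i in unlabeled]
            scored.sort(key=lambda t: -t[0][1])  # most confident first
            for (label, _margin), i in scored[:per_round]:
                new_labels.setdefault(i, label)  # first view to claim an item wins
        labels.update(new_labels)
    return labels

# Toy data (hypothetical): six papers, two classes.
# View A: bag-of-words counts for two terms; View B: citation-derived counts.
bow = [[3, 0], [0, 3], [2, 1], [1, 2], [3, 1], [0, 2]]
cites = [[2, 0], [0, 2], [2, 0], [0, 1], [1, 0], [1, 2]]
seed = {0: "ml", 1: "db"}  # only two papers start out labeled

result = co_train(bow, cites, seed)
print(result)
```

In this toy run the citation view is the one that confidently labels paper 2 in the first round, which is the kind of contribution from the citation-based classifier that the abstract describes; the sketch makes no claim about the accuracy figures reported in the paper.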
