Multi-label Website Classification via MDL without Closed World Assumption

Rodrigo R. Ormonde, Marcelo Ladeira

The machine-learning approach to website classification is a multi-label problem: a single document can be assigned more than one category. Multi-label problems are harder and less studied than single-label ones. This article proposes a new algorithm, based on the Minimum Description Length (MDL) principle and on adaptive Huffman coding, that can perform multi-label classification of textual documents in general, with or without the closed world assumption. Documents can therefore be labeled with one, several, or no category. The results show the potential of this novel algorithm.

