Spam Detection Using Web Content Associated with Messages

Marco Túlio C. Ribeiro, Leonardo Vilela Teixeira, Pedro H. Calais Guerra, Adriano Veloso, Wagner Meira Jr., Dorgival Guedes, Cristine Hoepers, Klaus Steding-Jessen, Marcelo H. P. C. Chaves

In this paper we propose a spam classification strategy that exploits the content of the Web pages linked by e-mail messages. We describe a methodology for extracting pages linked by spam and characterize the relationship between those pages and the spam messages. We then use a machine learning algorithm to extract features found in the Web pages that are relevant to spam detection. We demonstrate that using information from linked pages can significantly outperform current spam classification techniques, as represented by SpamAssassin. Our study shows that the pages linked by spam are a very promising battleground, where spammers do not hide their identity, and that this battleground has not yet been used by spam filters.
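The first step mentioned in the abstract, finding the pages a message links to, could be sketched roughly as follows. This is only an illustration and not the authors' methodology: the function name `extract_links`, the regular expression, and the choice to scan only text parts are assumptions made here. Fetching the pages and training a classifier on their content are not shown.

```python
import re
from email import message_from_string

# Assumption: a simple pattern for http/https URLs; real spam often
# obfuscates links, which this sketch does not handle.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def extract_links(raw_message: str) -> list[str]:
    """Return the distinct URLs found in the text parts of an e-mail, in order."""
    msg = message_from_string(raw_message)
    links: list[str] = []
    for part in msg.walk():
        # Skip attachments, images, and multipart containers.
        if part.get_content_maintype() != "text":
            continue
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        try:
            text = payload.decode(charset, errors="replace")
        except LookupError:
            # Unknown charset name declared in the message: fall back to UTF-8.
            text = payload.decode("utf-8", errors="replace")
        for url in URL_PATTERN.findall(text):
            url = url.rstrip(".,;:)")  # drop trailing sentence punctuation
            if url not in links:
                links.append(url)
    return links
```

Each returned URL would then be a candidate page to download and pass to whatever feature extraction and classification step is used downstream.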

