The Influence of Noisy Patterns in the Performance of Learning Methods in the Splice Junction Recognition Problem

Ana Carolina Lorena, Gustavo E. A. P. A. Batista, André Carlos Ponce de Leon Ferreira de Carvalho, Maria Carolina Monard

Since the beginning of the Human Genome Project, which aims to sequence all human genetic information, a large amount of sequence data has been generated. Much attention is now given to the analysis of these data. A large part of this analysis is carried out with intelligent computational techniques. However, many genetic databases contain noisy data, which can degrade the performance of the computational techniques applied. This work studies the influence of noisy data on the training of three different learning methods: Decision Trees, Artificial Neural Networks and Support Vector Machines. The task investigated is the recognition of splice junctions in DNA sequences, which is part of the gene identification problem. Results indicate that eliminating noisy patterns from the dataset can improve the learning algorithms' performance, with no significant reduction in their generalization ability.
