Eduardo R. Hruschka, Estevam R. Hruschka Jr., Nelson F. F. Ebecken.
This paper presents a Nearest-Neighbor Method for substituting missing values in continuous datasets and shows that it can be useful for a Clustering Genetic Algorithm. The proposed method is evaluated through simulations on the Wisconsin Breast Cancer Dataset, a benchmark for data mining methods. Specifically, we verify the efficacy of the proposed method in the context of a Clustering Genetic Algorithm by comparing the average classification rates obtained on the original dataset with those obtained on a dataset formed by the substituted values. The simulation results show that the proposed method is promising.
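The abstract does not spell out the substitution procedure, so the following is only a minimal sketch of a generic nearest-neighbor substitution for continuous data, not the authors' exact method. The function name `impute_nearest_neighbor`, the use of NaN to mark missing values, and the distance (root mean squared difference over the attributes both rows have observed) are all assumptions made for illustration.

```python
import math


def impute_nearest_neighbor(rows):
    """Fill each missing value (NaN) with the value held by the closest row.

    Assumption: distance is the root mean squared difference over the
    attributes that both rows have observed. This is an illustrative
    choice, not necessarily the one used in the paper.
    """
    filled = [list(r) for r in rows]
    n_attrs = len(rows[0]) if rows else 0
    for i, row in enumerate(rows):
        for j in range(n_attrs):
            if not math.isnan(row[j]):
                continue  # value is present, nothing to substitute
            best_dist, best_val = math.inf, None
            for k, other in enumerate(rows):
                # A donor must be a different row with attribute j observed.
                if k == i or math.isnan(other[j]):
                    continue
                shared = [a for a in range(n_attrs)
                          if not math.isnan(row[a]) and not math.isnan(other[a])]
                if not shared:
                    continue  # no common attributes to compare on
                dist = math.sqrt(
                    sum((row[a] - other[a]) ** 2 for a in shared) / len(shared)
                )
                if dist < best_dist:
                    best_dist, best_val = dist, other[j]
            if best_val is not None:
                filled[i][j] = best_val  # copy the nearest neighbor's value
    return filled


# Hypothetical toy data: the second row is missing its third attribute.
data = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, math.nan],
    [9.0, 9.0, 9.0],
]
filled = impute_nearest_neighbor(data)
```

In this toy example the second row is closest to the first, so its missing attribute is filled with 3.0. Distances are computed on the original rows rather than on already substituted values, so earlier substitutions do not influence later ones.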
http://www.lbd.dcc.ufmg.br:8080/colecoes/sbbd/2003/paper024.pdf