Local Feature Selection for Generation of Ensembles in Text Clustering

Ribeiro, M.N.; Prudencio, R.B.C.

In text clustering, global feature selection tries to identify a single subset of features that is relevant to all clusters. However, the clustering process might be improved by considering different subsets of features that describe each cluster locally. In experiments with local feature selection, the resulting partitions were unstable, yet they contained cohesive groups that did not occur in every execution. Based on this result, local feature selection was proposed as a way to generate the partitions used in ensemble clustering. New experiments were performed to evaluate the generated ensembles, and a gain in precision was observed.
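
As a rough illustration of how unstable partitions can still be combined into a consensus, the sketch below uses a co-association (evidence accumulation) matrix. This is a common ensemble clustering technique and is not necessarily the combination method used by the authors. The function names `co_association` and `consensus_groups`, the 0.5 threshold, and the toy label vectors are all assumptions made for this example; in practice each label vector would come from one clustering run with local feature selection.

```python
from itertools import combinations


def co_association(partitions):
    """Return the fraction of partitions in which each pair of documents
    is assigned to the same cluster (hypothetical helper)."""
    n = len(partitions[0])
    counts = [[0.0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1.0
    runs = len(partitions)
    return [[c / runs for c in row] for row in counts]


def consensus_groups(matrix, threshold=0.5):
    """Merge documents whose co-association exceeds the threshold,
    using union-find to extract connected components (hypothetical helper)."""
    n = len(matrix)
    parent = list(range(n))

    def find(x):
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if matrix[i][j] > threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy data (assumed): three clustering runs over five documents.
# Cluster label values are arbitrary and need not match across runs.
partitions = [
    [0, 0, 1, 1, 2],
    [0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0],
]

matrix = co_association(partitions)
groups = consensus_groups(matrix, threshold=0.5)
print(groups)
```

In this toy data, documents 0 and 1 are grouped together in every run, while documents 2, 3 and 4 are only linked in some runs. The co-association matrix lets such partially recurring groups contribute to the final consensus even though no single partition is stable.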

