Making Fuzzy C-Means Scalable to Arbitrarily Large Databases

Thiago Cordeiro, Gerson Zaverucha

Domingos and Hulten developed a general framework for scaling up machine learning algorithms and applied it to K-Means clustering. In this work, we adapt their method to scale Fuzzy C-Means (FCM) up to arbitrarily large datasets. The adaptation is not straightforward because, unlike in K-Means, the FCM learner's error is not a function of the number of examples owned by each cluster; instead, every cluster is associated with every example through the membership matrix. The result is Very Fast Fuzzy C-Means (VFFCM), a clustering algorithm that uses, in each of its steps, the minimum number of examples (determined theoretically by the Hoeffding bound) needed to guarantee that the resulting model does not differ significantly from the one that would be obtained by running FCM on the entire dataset.
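The abstract does not state the formulas involved, so the following Python sketch only illustrates its two ingredients in their standard textbook forms: the one-sided Hoeffding bound, epsilon = sqrt(R^2 ln(1/delta) / (2n)), and the usual FCM membership of a point in each cluster. The function names, the fuzzifier default m = 2, and the Euclidean distance are assumptions made for illustration; this is not the VFFCM procedure itself.

```python
import math


def hoeffding_epsilon(value_range, delta, n):
    """One-sided Hoeffding bound: with probability at least 1 - delta, the mean
    of n independent samples of a variable with range `value_range` exceeds
    the true mean by no more than the returned epsilon."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def hoeffding_sample_size(value_range, delta, epsilon):
    """Smallest n for which hoeffding_epsilon(value_range, delta, n) <= epsilon."""
    return math.ceil(value_range ** 2 * math.log(1.0 / delta) / (2.0 * epsilon ** 2))


def fcm_memberships(point, centers, m=2.0):
    """Standard FCM membership of one point in every cluster (assumed form):
    u_j = d_j^(-2/(m-1)) / sum_k d_k^(-2/(m-1)), with Euclidean distances d."""
    dists = [math.dist(point, c) for c in centers]
    # If the point coincides with one or more centers, split membership among them.
    zero_count = sum(1 for d in dists if d == 0.0)
    if zero_count:
        return [1.0 / zero_count if d == 0.0 else 0.0 for d in dists]
    exponent = 2.0 / (m - 1.0)
    inverse = [d ** -exponent for d in dists]
    total = sum(inverse)
    return [v / total for v in inverse]


if __name__ == "__main__":
    # Examples needed so a quantity with range 1 is estimated to within 0.01
    # at 95% confidence (illustrative numbers, not taken from the paper).
    print(hoeffding_sample_size(1.0, 0.05, 0.01))
    # Every cluster receives a nonzero membership for this point.
    print(fcm_memberships((0.0, 0.0), [(1.0, 0.0), (3.0, 0.0)]))
```

The membership function shows why the K-Means analysis does not carry over directly: every example contributes to every cluster, so there is no per-cluster example count to plug into the bound.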

