A Clustering Strategy to Find Similarities in Mycoplasma Promoters

João Francisco Valiati, Paulo Martins Engel

This paper presents a neural network clustering strategy for identifying regularities in a dataset of Mycoplasma promoter sequences. The traditional method of identifying prokaryotic promoters has proven inadequate for the Mycoplasma family. Our clustering approach looks for regularities in the base-pair composition of the dataset sequences, which provide clues to the presence or absence of promoters. Several experiments using a leave-one-out strategy and a negative dataset revealed the best way to fit the model parameters. The preliminary results are promising for building a computational model capable of finding promoter regions in Mycoplasmas.

