Tests of Automatic Annotation Using KOG Proteins and ESTs From 4 Eukaryotic Organisms

Maurício de Alvarenga Mudado, Estevam Bravo-Neto, José Miguel Ortega

BLAST homology searches have been widely used to assign function to novel sequences. Secondary databases such as KOG can serve this purpose because their sequences carry functional classifications. We devised an experiment in which public ESTs from four eukaryotic organisms, whose protein sequences are present in the KOG database, are classified into functional KOG categories using tBLASTn. First we assigned the ESTs from one organism to KTL (KOG, TWOG and LSEs) proteins, and then we searched the database depleted of that organism's proteins to simulate a novel transcriptome. The data show that classification was correct (assignment equals annotation) in 87.2%, 96.8%, 92.0% and 88.7% of cases for A. thaliana (Ath), C. elegans (Cel), D. melanogaster (Dme) and H. sapiens (Hsa), respectively. We estimated identity cutoffs for each organism for use with tBLASTn. These cutoffs discard the same number of hits as a BLASTn search would, in order to minimize false positives caused by sequencing errors. We found amino-acid identity cutoffs of 80%, 78%, 78% and 84% for Hsa, Dme, Cel and Ath, respectively. We then evaluated our system by comparing the KTL categories of the assigned ESTs with the KTL categories into which the same ESTs were classified when the organism's own KTL proteins were absent from the database. Moreover, we show the annotation potential of the KOG database and of the ESTs used. Supplementary Information can be found at:
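The evaluation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Hit` record, the helper names (`best_hits`, `categories`, `agreement`), the choice of the highest bit score as the "best" hit, and the toy identifiers and category letters are all assumptions made for illustration. Only the four identity cutoffs are taken from the abstract, and applying them to the assignment step is itself an assumption.

```python
from collections import namedtuple

# Hypothetical record for one tBLASTn hit (EST matched against a KTL protein).
Hit = namedtuple("Hit", ["est_id", "protein_id", "identity", "bitscore"])

# Amino-acid identity cutoffs (%) reported in the abstract.
IDENTITY_CUTOFF = {"Hsa": 80.0, "Dme": 78.0, "Cel": 78.0, "Ath": 84.0}


def best_hits(hits, min_identity=0.0):
    """Keep, per EST, the highest-bitscore hit with identity >= min_identity.

    Picking the best hit by bit score is an assumption of this sketch.
    """
    best = {}
    for hit in hits:
        if hit.identity < min_identity:
            continue
        current = best.get(hit.est_id)
        if current is None or hit.bitscore > current.bitscore:
            best[hit.est_id] = hit
    return best


def categories(best, protein_category):
    """Map each EST to the functional category of its best-hit protein."""
    return {
        est_id: protein_category[hit.protein_id]
        for est_id, hit in best.items()
        if hit.protein_id in protein_category
    }


def agreement(assigned, classified):
    """Fraction of ESTs present in both maps whose categories are equal."""
    shared = assigned.keys() & classified.keys()
    if not shared:
        return 0.0
    matches = sum(assigned[e] == classified[e] for e in shared)
    return matches / len(shared)


# Toy data (hypothetical identifiers and category letters).
protein_category = {
    "Hs_p1": "J", "Hs_p2": "K", "Hs_p3": "O",
    "Dm_p1": "J", "Ce_p9": "K", "Dm_p7": "T",
}
# Step 1: hits against the organism's own proteins (assignment).
own_hits = [
    Hit("est1", "Hs_p1", 95.0, 200.0),
    Hit("est2", "Hs_p2", 70.0, 150.0),  # below the Hsa cutoff, discarded
    Hit("est3", "Hs_p3", 90.0, 180.0),
]
# Step 2: hits against the database depleted of that organism (classification).
depleted_hits = [
    Hit("est1", "Dm_p1", 60.0, 120.0),
    Hit("est1", "Ce_p9", 55.0, 90.0),
    Hit("est3", "Dm_p7", 58.0, 110.0),
]

assigned = categories(best_hits(own_hits, IDENTITY_CUTOFF["Hsa"]), protein_category)
classified = categories(best_hits(depleted_hits), protein_category)
print(f"agreement: {agreement(assigned, classified):.1%}")
```

In this toy run, one of the two ESTs that pass the cutoff keeps its category after depletion, so the agreement is one half. The percentages reported in the abstract correspond to this kind of ratio computed over the real EST sets.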

