Bioinformatics DOI:10.1093/bioinformatics/bti027

Improving genome annotations using phylogenetic profile anomaly detection.

Publication Type: Journal Article
Year of Publication: 2005
Authors: Mikkelsen, TS, Galagan, JE, Mesirov, JP
Date Published: 2005 Feb 15
Keywords: Algorithms; Bayes Theorem; Chromosome Mapping; DNA Mutational Analysis; Evolution, Molecular; Gene Expression Profiling; Genomics; Models, Genetic; Models, Statistical; Phylogeny; Sequence Alignment; Sequence Analysis, DNA

MOTIVATION: A promising strategy for refining genome annotations is to detect features that conflict with known functional or evolutionary relationships between groups of genes. Previous work in this area has been focused on investigating the absence of 'housekeeping' genes or components of well-studied pathways. We have sought to develop a method for improving new annotations that can automatically synthesize and use the information available in a database of other annotated genomes.

RESULTS: We show that a probabilistic model of phylogenetic profiles, trained from a database of curated genome annotations, can be used to reliably detect errors in new annotations. We use our method to identify 22 genes that were missed in previously published annotations of prokaryotic genomes.
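As a rough illustration of the general idea only, the sketch below flags gene families that are annotated as absent in a new genome even though the rest of its phylogenetic profile makes their presence likely. It is not the authors' model: the naive Bayes formulation, the Laplace smoothing, the 0.9 threshold, the toy data, and all function names (`train_family_model`, `presence_probability`, `flag_missing_families`) are assumptions made for this example.

```python
import math

def train_family_model(profiles, target, alpha=1.0):
    """Fit a naive Bayes model for one gene family (assumed model, not the paper's).

    profiles: list of binary rows, one per curated genome; row[j] is 1 if
    family j is annotated as present. alpha is a Laplace smoothing constant.
    """
    n_families = len(profiles[0])
    class_counts = {0: 0, 1: 0}
    present_counts = {0: [0] * n_families, 1: [0] * n_families}
    for row in profiles:
        cls = row[target]
        class_counts[cls] += 1
        for j, value in enumerate(row):
            present_counts[cls][j] += value
    total = len(profiles)
    prior = {c: (class_counts[c] + alpha) / (total + 2 * alpha) for c in (0, 1)}
    cond = {
        c: [(present_counts[c][j] + alpha) / (class_counts[c] + 2 * alpha)
            for j in range(n_families)]
        for c in (0, 1)
    }
    return prior, cond

def presence_probability(model, profile, target):
    """Posterior probability that the target family is present in this genome."""
    prior, cond = model
    log_post = {}
    for c in (0, 1):
        lp = math.log(prior[c])
        for j, value in enumerate(profile):
            if j == target:
                continue  # the target family is what we are predicting
            p = cond[c][j]
            lp += math.log(p if value else 1.0 - p)
        log_post[c] = lp
    top = max(log_post.values())
    norm = sum(math.exp(lp - top) for lp in log_post.values())
    return math.exp(log_post[1] - top) / norm

def flag_missing_families(profiles, new_profile, threshold=0.9):
    """Return (family index, probability) for absent families that look anomalous."""
    flagged = []
    for target, present in enumerate(new_profile):
        if present:
            continue
        model = train_family_model(profiles, target)
        p = presence_probability(model, new_profile, target)
        if p >= threshold:
            flagged.append((target, p))
    return flagged

# Toy data: families 0, 1 and 2 always co-occur; family 3 is unrelated.
TRAINING_PROFILES = [
    [1, 1, 1, 0], [1, 1, 1, 1], [1, 1, 1, 0], [0, 0, 0, 1],
    [0, 0, 0, 0], [0, 0, 0, 1], [1, 1, 1, 1], [0, 0, 0, 0],
]
NEW_PROFILE = [1, 1, 0, 0]  # family 2 is absent although 0 and 1 are present

if __name__ == "__main__":
    for family, prob in flag_missing_families(TRAINING_PROFILES, NEW_PROFILE):
        print(f"family {family}: P(present) = {prob:.3f}")
```

On this toy data only family 2 is flagged, because it co-occurs with families 0 and 1 in every training genome. In practice a flagged family would be a candidate for a targeted search of the genome sequence, not proof of a missed gene.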

AVAILABILITY: The method was evaluated using MATLAB and open source software referenced in this work. Scripts and datasets are available from the authors upon request.



Alternate Journal: Bioinformatics
PubMed ID: 15374867