Improving genome annotations using phylogenetic profile anomaly detection.

Bioinformatics
Abstract

MOTIVATION: A promising strategy for refining genome annotations is to detect features that conflict with known functional or evolutionary relationships between groups of genes. Previous work in this area has focused on investigating the absence of 'housekeeping' genes or components of well-studied pathways. We have sought to develop a method for improving new annotations that can automatically synthesize and use the information available in a database of other annotated genomes.

RESULTS: We show that a probabilistic model of phylogenetic profiles, trained from a database of curated genome annotations, can be used to reliably detect errors in new annotations. We use our method to identify 22 genes that were missed in previously published annotations of prokaryotic genomes.
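
As a rough illustration of the general idea only, the sketch below flags gene families that are absent from a new annotation even though families already present in it usually co-occur with them in a reference database. It is not the probabilistic model from the paper: the smoothed pairwise co-occurrence estimate, the names `train` and `flag_missing`, the toy families A to D, and the 0.8 threshold are all assumptions made for this example.

```python
# Hypothetical toy sketch of phylogenetic profile anomaly detection.
# NOT the authors' model: a simple pairwise co-occurrence estimate is assumed.

def train(genomes, families):
    """Estimate P(target present | partner present) with add-one smoothing."""
    cond = {}
    for partner in families:
        with_partner = [g for g in genomes if partner in g]
        for target in families:
            if target == partner:
                continue
            both = sum(1 for g in with_partner if target in g)
            cond[(target, partner)] = (both + 1) / (len(with_partner) + 2)
    return cond


def flag_missing(new_genome, families, cond, threshold=0.8):
    """Return absent families whose best supporting partner meets the threshold."""
    flagged = []
    for target in families:
        if target in new_genome:
            continue
        scores = [cond[(target, p)] for p in new_genome if p in families]
        if scores and max(scores) >= threshold:
            flagged.append((target, max(scores)))
    # Strongest anomalies first.
    return sorted(flagged, key=lambda item: -item[1])


# Toy reference database: each genome is the set of gene families annotated in it.
FAMILIES = ["A", "B", "C", "D"]
DATABASE = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "B", "D"},
    {"C", "D"},
    {"A", "B", "C"},
    {"D"},
]

cond = train(DATABASE, FAMILIES)

# New annotation in which family B may have been missed.
flagged = flag_missing({"A", "C"}, FAMILIES, cond)
print(flagged)
```

In this toy database, family B accompanies family A in every reference genome, so its absence from a new annotation that contains A is reported as a candidate missed gene. The threshold is arbitrary here and would need to be calibrated against curated annotations.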

AVAILABILITY: The method was evaluated using MATLAB and open source software referenced in this work. Scripts and datasets are available from the authors upon request.

CONTACT: tarjei@broad.mit.edu.

Year of Publication
2005
Journal
Bioinformatics
Volume
21
Issue
4
Pages
464-70
Date Published
2005 Feb 15
ISSN
1367-4803
DOI
10.1093/bioinformatics/bti027
PubMed ID
15374867