Nucleic Acids Res DOI:10.1093/nar/gkt1078

MetaRef: a pan-genomic database for comparative and community microbial genomics.

Publication Type: Journal Article
Year of Publication: 2014
Authors: Huang, K; Brady, A; Mahurkar, A; White, O; Gevers, D; Huttenhower, C; Segata, N
Journal: Nucleic Acids Res
Issue: Database issue
Date Published: 2014 Jan
Keywords: Archaea; Bacteria; Databases, Genetic; Genome, Archaeal; Genome, Bacterial; Genomics; Internet; Metagenomics; Microbiota; Molecular Sequence Annotation; Multigene Family; Phylogeny

Microbial genome sequencing is one of the longest-standing areas of biological database development, but high-throughput, low-cost technologies have increased its throughput to an unprecedented number of new genomes per year. Several thousand microbial genomes are now available, necessitating new approaches to organizing information on gene function, phylogeny and microbial taxonomy to facilitate downstream biological interpretation. MetaRef, available at, is a novel online resource systematically cataloguing a comprehensive pan-genome of all microbial clades with sequenced isolates. It organizes currently available draft and finished bacterial and archaeal genomes into quality-controlled clades, reports all core and pan gene families at multiple levels in the resulting taxonomy, and annotates the families' conservation, phylogeny and consensus functional information. MetaRef also provides a comprehensive non-redundant reference gene catalogue for metagenomic studies, including the abundance and prevalence of all gene families in the >700 shotgun metagenomic samples of the Human Microbiome Project. This constitutes a systematic mapping of clade-specific microbial functions within the healthy human microbiome across multiple body sites and can be used as a reference for identifying potential functional biomarkers in disease-associated microbiomes. MetaRef provides all information both as an online browsable resource and as downloadable sequences and tabular data files that can be used for subsequent offline studies.
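
As a rough illustration of the terms used in the abstract, the sketch below computes core and pan gene families for a single clade and the prevalence of a family across metagenomic samples. It uses the textbook definitions (core = families present in every genome of the clade, pan = families present in at least one genome, prevalence = fraction of samples with non-zero abundance); these are assumptions, and MetaRef's own thresholds and file formats may differ. The function names and toy data are hypothetical and do not come from MetaRef.

```python
from typing import Dict, Set, Tuple


def core_and_pan(genomes: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (core, pan) gene-family sets for one clade.

    Assumed definitions: core = intersection of all genomes' families,
    pan = union of all genomes' families.
    """
    family_sets = list(genomes.values())
    if not family_sets:
        return set(), set()
    core = set.intersection(*family_sets)
    pan = set.union(*family_sets)
    return core, pan


def prevalence(family: str, samples: Dict[str, Dict[str, float]]) -> float:
    """Fraction of samples in which `family` has non-zero abundance."""
    if not samples:
        return 0.0
    present = sum(1 for abund in samples.values() if abund.get(family, 0.0) > 0.0)
    return present / len(samples)


# Hypothetical toy data: three genomes in one clade, four metagenomic samples.
clade = {
    "genome_A": {"fam1", "fam2", "fam3"},
    "genome_B": {"fam1", "fam2"},
    "genome_C": {"fam1", "fam4"},
}
samples = {
    "s1": {"fam1": 0.5},
    "s2": {"fam1": 0.1, "fam2": 0.2},
    "s3": {},
    "s4": {"fam2": 0.3},
}

core, pan = core_and_pan(clade)
print("core:", sorted(core))
print("pan:", sorted(pan))
print("prevalence of fam1:", prevalence("fam1", samples))
```

In practice a strict intersection is sensitive to draft-genome gaps, so a relaxed cutoff (for example, families present in most genomes of a clade) is a common alternative; whether MetaRef applies such a cutoff is not stated in this record.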


Alternate Journal: Nucleic Acids Res.
PubMed ID: 24203705
PubMed Central ID: PMC3964974
Grant List:
P30 DK043351 / DK / NIDDK NIH HHS / United States
HHSN272200900018C / / PHS HHS / United States
R01HG005969 / HG / NHGRI NIH HHS / United States
U54HG004969 / HG / NHGRI NIH HHS / United States