Nucleic Acids Res DOI:10.1093/nar/gkt654

High-resolution definition of the Vibrio cholerae essential gene set with hidden Markov model-based analyses of transposon-insertion sequencing data.

Publication Type: Journal Article
Year of Publication: 2013
Authors: Chao MC, Pritchard JR, Zhang YJ, Rubin EJ, Livny J, Davis BM, Waldor MK
Journal: Nucleic Acids Res
Date Published: 2013 Oct
Keywords: 5' Untranslated Regions; DNA Transposable Elements; Escherichia coli; Gene Library; Genes, Bacterial; Genes, Essential; Genetic Loci; High-Throughput Nucleotide Sequencing; Markov Chains; RNA, Untranslated; Sequence Analysis, DNA; Vibrio cholerae

The coupling of high-density transposon mutagenesis to high-throughput DNA sequencing (transposon-insertion sequencing) enables simultaneous and genome-wide assessment of the contributions of individual loci to bacterial growth and survival. We have refined analysis of transposon-insertion sequencing data by normalizing for the effect of DNA replication on sequencing output and using a hidden Markov model (HMM)-based filter to exploit heretofore unappreciated information inherent in all transposon-insertion sequencing data sets. The HMM can smooth variations in read abundance and thereby reduce the effects of read noise, as well as permit fine scale mapping that is independent of genomic annotation and enable classification of loci into several functional categories (e.g. essential, domain essential or 'sick'). We generated a high-resolution map of genomic loci (encompassing both intra- and intergenic sequences) that are required or beneficial for in vitro growth of the cholera pathogen, Vibrio cholerae. This work uncovered new metabolic and physiologic requirements for V. cholerae survival, and by combining transposon-insertion sequencing and transcriptomic data sets, we also identified several novel noncoding RNA species that contribute to V. cholerae growth. Our findings suggest that HMM-based approaches will enhance extraction of biological meaning from transposon-insertion sequencing genomic data.
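
The HMM-based smoothing described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the three state names follow the categories mentioned above, but the Poisson emission model, the mean read counts in MEAN_READS, the STAY probability, and the function names are all hypothetical choices made for illustration. The sketch runs Viterbi decoding over per-site read counts, so that each insertion site is labelled in the context of its neighbours rather than in isolation.

```python
import math

# Hypothetical states, loosely following the categories named in the abstract.
STATES = ("essential", "sick", "neutral")

# Assumed mean read count per insertion site for each state (Poisson emissions).
# These values are illustrative only and are not taken from the paper.
MEAN_READS = {"essential": 0.1, "sick": 5.0, "neutral": 50.0}

# Assumed probability that adjacent sites share the same state.
STAY = 0.9


def log_poisson(k, lam):
    """Log-probability of observing k reads under a Poisson with mean lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def viterbi(counts):
    """Return the most likely state for each site, given its read count."""
    if not counts:
        return []
    log_stay = math.log(STAY)
    log_switch = math.log((1.0 - STAY) / (len(STATES) - 1))

    def trans(prev, cur):
        return log_stay if prev == cur else log_switch

    # Uniform prior over states at the first site.
    score = {s: -math.log(len(STATES)) + log_poisson(counts[0], MEAN_READS[s])
             for s in STATES}
    backpointers = []
    for k in counts[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            best_prev = max(STATES, key=lambda p: score[p] + trans(p, s))
            pointers[s] = best_prev
            new_score[s] = (score[best_prev] + trans(best_prev, s)
                            + log_poisson(k, MEAN_READS[s]))
        score = new_score
        backpointers.append(pointers)

    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    reads = [50, 48, 5, 6, 4, 0, 0, 0, 0, 52]
    for count, label in zip(reads, viterbi(reads)):
        print(count, label)
```

Under these assumed parameters, a single site with 2 reads inside a run of zero-read sites is still labelled "essential", because the cost of switching state twice outweighs the slightly better fit of the "sick" emission at that one site. This is the sense in which an HMM can smooth read noise without reference to gene annotation.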


Alternate Journal: Nucleic Acids Res.
PubMed ID: 23901011
PubMed Central ID: PMC3799429
Grant List: AI R37-42347 / AI / NIAID NIH HHS / United States
T32 GM007753 / GM / NIGMS NIH HHS / United States
F32 GM108355 / GM / NIGMS NIH HHS / United States
T32 AI007638 / AI / NIAID NIH HHS / United States
R37 AI042347 / AI / NIAID NIH HHS / United States
HHSN272200900018C / / PHS HHS / United States
/ / Howard Hughes Medical Institute / United States
F32 GM108355-01 / GM / NIGMS NIH HHS / United States