BMC Genomics DOI:10.1186/1471-2164-13-734

How deep is deep enough for RNA-Seq profiling of bacterial transcriptomes?

Publication Type: Journal Article
Year of Publication: 2012
Authors: Haas, BJ, Chin, M, Nusbaum, C, Birren, BW, Livny, J
Journal: BMC Genomics
Date Published: 2012 Dec 27
Keywords: Escherichia coli, Gene Expression Profiling, Gene Library, High-Throughput Nucleotide Sequencing, Molecular Sequence Annotation, Open Reading Frames

BACKGROUND: High-throughput sequencing of cDNA libraries (RNA-Seq) has proven to be a highly effective approach for studying bacterial transcriptomes. A central challenge in designing RNA-Seq-based experiments is estimating a priori the number of reads per sample needed to detect and quantify thousands of individual transcripts with a large dynamic range of abundance.

RESULTS: We have conducted a systematic examination of how changes in the number of RNA-Seq reads per sample influence both profiling of a single bacterial transcriptome and the comparison of gene expression among samples. Our findings suggest that the number of reads typically produced in a single lane of the Illumina HiSeq sequencer far exceeds the number needed to saturate the annotated transcriptomes of diverse bacteria growing in monoculture. Moreover, as sequencing depth increases, so too does the detection of cDNAs that likely correspond to spurious transcripts or genomic DNA contamination. Finally, even when dozens of barcoded individual cDNA libraries are sequenced in a single lane, the vast majority of transcripts in each sample can be detected and numerous genes differentially expressed between samples can be identified.

CONCLUSIONS: Our analysis provides a guide for the many researchers seeking to determine the appropriate sequencing depth for RNA-Seq-based studies of diverse bacterial species.
Alternate Journal: BMC Genomics
PubMed ID: 23270466
PubMed Central ID: PMC3543199
Grant List: AI-076608 / AI / NIAID NIH HHS / United States
HHSN272200900018C / / PHS HHS / United States