Bacterial genomes consist of regions that are transcribed into RNA (collectively, the transcriptome) and regions that are not. Fully characterizing the transcriptome would give researchers a powerful way to pinpoint transcripts that relate to bacterial phenotypes, including those involved in pathogenic processes (e.g., in M. tuberculosis). RNA-seq, the high-throughput sequencing of cDNA libraries, allows us to build a map of bacterial transcriptomes by overlapping millions of individual reads. Currently there are no methods for normalizing variation between highly and lowly expressed genes and for suppressing noise from short read lengths, read error, coverage error and the dense nature of bacterial genomes. This makes it difficult to parse out all but the most highly expressed transcripts. Using E. coli RNA-seq data, we have developed an algorithm that scans RNA-seq expression data and identifies transcript locations. Our most recent algorithm has achieved low to moderate sensitivity in identifying E. coli start-stops and moderate precision, with a supermajority of identified start-stops accurate to within 200 base pairs. To increase precision and sensitivity, we aimed to create several more normalized variables using expression levels of the E. coli genome. We hope to significantly increase the precision of start-stop site identification, thereby lowering the percentage of false positives. The resulting algorithm might serve as a novel tool for deconstructing transcriptome data in a variety of bacterial pathogens and other microbes.
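As an illustration of how precision and sensitivity with a 200 base pair tolerance could be scored, here is a minimal Python sketch. It is not the project's algorithm: the function name `score_sites`, the matching rule (a site counts as correct if any reference site lies within the tolerance) and the example coordinates are all assumptions made for illustration.

```python
from bisect import bisect_left


def score_sites(predicted, annotated, tolerance=200):
    """Return (precision, sensitivity) for predicted start/stop positions.

    Assumed matching rule (hypothetical, not taken from the abstract):
    a predicted site is a true positive if some annotated site lies
    within `tolerance` base pairs of it, and an annotated site is
    recovered if some predicted site lies within `tolerance` of it.
    """
    annotated_sorted = sorted(annotated)
    predicted_sorted = sorted(predicted)

    def has_neighbor(pos, sorted_sites):
        # First site at or after (pos - tolerance); it is a match
        # if it is also no further than (pos + tolerance).
        i = bisect_left(sorted_sites, pos - tolerance)
        return i < len(sorted_sites) and sorted_sites[i] <= pos + tolerance

    true_pos = sum(has_neighbor(p, annotated_sorted) for p in predicted)
    recovered = sum(has_neighbor(a, predicted_sorted) for a in annotated)

    precision = true_pos / len(predicted) if predicted else 0.0
    sensitivity = recovered / len(annotated) if annotated else 0.0
    return precision, sensitivity


# Made-up genome coordinates, for illustration only.
predicted_sites = [100, 1500, 5000, 9000]
annotated_sites = [250, 1400, 7000]
precision, sensitivity = score_sites(predicted_sites, annotated_sites)
print(f"precision={precision:.2f}, sensitivity={sensitivity:.2f}")
```

Under this rule, precision falls as false positive calls accumulate, while sensitivity falls as annotated sites go undetected, so the two can be tracked separately as new normalized variables are added.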
PROJECT: Mapping Bacterial Transcriptomes Using RNA-seq Expression Data
Before coming to the Broad, I saw communication skills more or less as fluff. But if I’ve learned anything from my experience this summer, it’s that the only thing more important than doing the science you love is being able to explain to others why they should love it, too. The Broad has given me a toolkit for effective scientific communication and shown me how those skills can open doors I didn’t even know were there.