Evaluating how accurately 16S rRNA sequences are classified across different classifiers, sequence regions, and phylogenetic groups will support the Human Microbiome Project (HMP) in targeted sequence analysis of 16S rRNA genes using more cost-effective and efficient sequencing technologies. This will assist the HMP in discovering the roles that microbial communities play in human development, physiology, immunity, disease, and nutrition.
Christel and her Broad colleagues identified optimal targets for directed sequencing with new sequencing technologies by assessing the sequence-based taxonomic classification of localized regions within the 16S rRNA gene. They scanned a window along a diverse set of full-length sequences and classified each windowed sequence with three distinct classifiers: the RDP Classifier (naïve Bayes), kmerRank, and BLAST. The classification of each windowed sequence was then compared with the classification of the corresponding full-length sequence. Sequences spanning a wide breadth of the bacterial phylogeny were used to compare the consistency of the classifiers and the effectiveness of specific 16S regions. Additionally, accuracy across taxonomic groups was analyzed by partitioning the results by phylum, class, and so on, to determine whether certain taxa were more reliably classified than others.
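The windowed comparison described above can be illustrated with a minimal Python sketch. It is not the project's actual pipeline: the `classify` function is a toy shared-k-mer classifier standing in for the RDP Classifier, kmerRank, and BLAST, and the function names, the window size, the step, and the reference sequences are all hypothetical choices made for illustration.

```python
from collections import Counter


def kmers(seq, k=4):
    """Count the overlapping k-mers in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def classify(seq, references, k=4):
    """Toy stand-in classifier (an assumption, not RDP/kmerRank/BLAST):
    return the reference label that shares the most k-mers with seq."""
    query = kmers(seq, k)

    def shared(label):
        ref = kmers(references[label], k)
        return sum(min(count, ref[kmer]) for kmer, count in query.items())

    # Sorting the labels makes tie-breaking deterministic.
    return max(sorted(references), key=shared)


def windows(seq, size, step):
    """Yield (start, subsequence) for each full-size window along seq."""
    for start in range(0, len(seq) - size + 1, step):
        yield start, seq[start:start + size]


def window_agreement(queries, references, size, step, k=4):
    """For each window start, return the fraction of query sequences whose
    windowed label matches the label of the full-length sequence."""
    hits, totals = Counter(), Counter()
    for seq in queries:
        full_label = classify(seq, references, k)
        for start, sub in windows(seq, size, step):
            totals[start] += 1
            hits[start] += classify(sub, references, k) == full_label
    return {start: hits[start] / totals[start] for start in sorted(totals)}


if __name__ == "__main__":
    # Hypothetical toy data; real input would be full-length 16S sequences.
    refs = {"taxonA": "A" * 12, "taxonB": "C" * 12}
    print(window_agreement(["AAAAAAAACCCC"], refs, size=4, step=4))
```

Window positions with high agreement would be candidate regions for directed sequencing. Grouping the same per-window results by the phylum or class of each query would give the per-taxon breakdown mentioned above.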
PROJECT: Computational analysis of the taxonomic classification of short 16S rRNA sequences
My summer at the Broad Institute was wonderful. I was able to perform cutting-edge research in a unique and collaborative environment. My mentor taught me so much and helped me throughout the entire summer. After this summer, my interest in research, especially computational biology, has grown significantly. I now know that this is the career I wish to pursue in the future.