Mentor: Robert Riley, Genome Biology
Genomic sequencing allows researchers to better understand how organisms function at the molecular level and how genes interact with one another to produce phenotypic variation. This approach can be applied to organisms such as Mycobacterium tuberculosis (Mtb), the bacterium that causes tuberculosis (TB). The genomic sequence of Mtb has been available since 1998, yet almost 10 years later TB still causes over 1 million deaths a year. This drives researchers to delve further into the genomic data of Mtb to learn more about its biological pathways. Clustering of expression data yields groups of genes that exhibit patterns of co-expression or co-regulation; these groups may represent biological processes.
Luke and his Broad colleagues applied a biclustering algorithm (Bimax) and a traditional clustering algorithm (k-means) to a set of expression data. To benchmark the performance of each algorithm, they measured how often each cluster grouped together genes from the same operon. After evaluating the statistical significance of this operon recovery, they concluded that k-means clustering produced more biologically sound clusters than Bimax. Through this analysis, they hope to learn more about the global regulatory network of Mtb, which could lead to a better understanding of the molecular basis of TB and assist with drug and vaccine discovery.
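The operon-recovery benchmark described above can be illustrated with a small sketch. This is not the project's actual code: the gene names, operon names, cluster assignments, and function names are all hypothetical. The use of a label-shuffling permutation test is also an assumption, since the abstract does not say how statistical significance was evaluated. The sketch scores a clustering by the fraction of same-operon gene pairs that land in the same cluster, then estimates how often random cluster assignments would score at least as well.

```python
import random
from itertools import combinations


def operon_pair_recovery(cluster_of, operons):
    """Fraction of same-operon gene pairs assigned to the same cluster."""
    together = 0
    total = 0
    for genes in operons.values():
        for a, b in combinations(genes, 2):
            # Only score pairs where both genes were clustered.
            if a in cluster_of and b in cluster_of:
                total += 1
                if cluster_of[a] == cluster_of[b]:
                    together += 1
    return together / total if total else 0.0


def permutation_p_value(cluster_of, operons, n_perm=1000, seed=0):
    """Estimate how often shuffled cluster labels match or beat the observed score.

    Assumption: a permutation test stands in for whatever significance
    test the original analysis used.
    """
    rng = random.Random(seed)
    observed = operon_pair_recovery(cluster_of, operons)
    genes = list(cluster_of)
    labels = [cluster_of[g] for g in genes]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # keeps cluster sizes, breaks gene-to-cluster links
        shuffled = dict(zip(genes, labels))
        if operon_pair_recovery(shuffled, operons) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy data: eight genes in three clusters, three operons.
toy_clusters = {"g1": 0, "g2": 0, "g3": 0, "g4": 1, "g5": 1, "g6": 1, "g7": 2, "g8": 2}
toy_operons = {"opA": ["g1", "g2", "g3"], "opB": ["g4", "g5"], "opC": ["g6", "g7"]}

observed, p_value = permutation_p_value(toy_clusters, toy_operons)
print(f"operon pair recovery: {observed:.2f}, permutation p-value: {p_value:.3f}")
```

Running the same scoring on the output of two different clustering methods gives a like-for-like comparison, which is the spirit of the benchmark: the method whose clusters more often keep operon partners together is judged more biologically sound.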
"Being at the Broad has given me an in-depth view of what cutting-edge research involves. I enjoyed being here because not only did I play a vital role in my research project, but I also became a part of the Broad community. While my research was fun and exciting, becoming part of this larger family has been very crucial and beneficial in continuing my research career."
Luke Yancy, Jr., a computer science and bioinformatics sophomore at Morehouse College, probed the systems biology of Mycobacterium tuberculosis using computer algorithms and gene expression data.