Albert Xue, a senior at Duke University studying math and computer science, integrated multiple domains of experimental data to draw novel insight into genetic functionality.
The Broad taught me how to organize my mind – how to question scientific hypotheses with rigor, and how to codify my intuition into clear and concise lines of thought. Above all, however, what I took away from the Broad was a deep sense of community ingrained into the structure of my project, my team, and the institute as a whole.
Although the human genome was successfully sequenced almost two decades ago, the underlying function of many human genes is still unclear. This lack of clarity inhibits our understanding of human disease, especially cancer, which is often driven by the effects of genetic mutations on various functional pathways. One powerful paradigm for discerning novel gene functionality is “guilt by association,” whereby already well-understood genes drive new insights into the functions of related, but less studied, genes. To use guilt by association as an exploratory mechanism for gene functionality, we use measurements of gene function from different domains, including unperturbed gene expression, cancer cell viability after gene knockout, and other gene measurements available through the Broad Institute’s Dependency Map project. We then integrate these measurements into a low-dimensional joint embedding. This compact representation of diverse, high-dimensional experimental data supports downstream tasks such as classification, cluster analysis, and data visualization. We find that the synergistic effects of integrating multiple kinds of gene association improve embedding quality, drive new functional insight, and allow us to build a more comprehensive understanding of genetic functionality in cancer.
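The abstract does not say which embedding method the project used. As a rough illustration only, the sketch below standardizes two per-gene measurement matrices, concatenates them, and reduces them with PCA (computed by SVD), then looks up a gene's nearest neighbors in the embedding as a simple form of guilt by association. The function names, the synthetic data, the gene labels, and the choice of PCA are all assumptions made for this example; none of them come from the Dependency Map project.

```python
import numpy as np

def zscore(matrix):
    """Standardize each column to zero mean and unit variance."""
    mean = matrix.mean(axis=0)
    std = matrix.std(axis=0)
    std[std == 0] = 1.0  # avoid dividing by zero for constant columns
    return (matrix - mean) / std

def joint_embedding(domains, n_components=2):
    """Hypothetical helper: concatenate per-gene feature blocks, then apply PCA via SVD."""
    # Scale each block by 1/sqrt(number of features) so that a wide
    # domain does not dominate the embedding by column count alone.
    blocks = [zscore(d) / np.sqrt(d.shape[1]) for d in domains]
    stacked = np.hstack(blocks)
    stacked = stacked - stacked.mean(axis=0)
    u, s, _ = np.linalg.svd(stacked, full_matrices=False)
    # Rows are genes; columns are the leading principal components.
    return u[:, :n_components] * s[:n_components]

def nearest_genes(embedding, gene_names, query, k=3):
    """Hypothetical helper: return the k genes closest to `query` in the embedding."""
    idx = gene_names.index(query)
    dists = np.linalg.norm(embedding - embedding[idx], axis=1)
    order = np.argsort(dists)
    return [gene_names[i] for i in order if i != idx][:k]

# Synthetic stand-ins for two measurement domains (rows = genes).
rng = np.random.default_rng(0)
gene_names = [f"GENE_{i}" for i in range(20)]  # placeholder labels
expression = rng.normal(size=(20, 50))         # e.g. expression features
viability = rng.normal(size=(20, 30))          # e.g. knockout viability features

embedding = joint_embedding([expression, viability], n_components=5)
neighbors = nearest_genes(embedding, gene_names, "GENE_0")
```

In this sketch, a poorly characterized gene would be annotated by inspecting the known functions of its nearest neighbors. Real analyses would likely need careful handling of missing values and of the differing noise levels across domains, which this example ignores.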
Project: Mapping Gene Functionality with the Broad Institute’s Dependency Map
Mentor: Joshua Dempster, Cancer Data Science