Ipsa Mittra, a junior computer science major at the University of Maryland, College Park, used machine learning to examine structural variants.
Most commonly used cancer cell lines were derived from patient samples collected many years ago, but these cell lines rarely have paired normal samples. Without a matched normal, we cannot know which structural variants, or genetic rearrangements, in these samples are germline (inherited) or somatic (cancer-related). Our team has developed a computational approach, a linear support vector machine (SVM), to classify a structural variant as germline or somatic using training and testing data from a single structural variant caller. However, different structural variant callers rely on different assumptions and methods, and it is unknown how well our SVM performs on data from these alternate callers. To determine the performance of our SVM across structural variant callers, we evaluated it on data from multiple callers, including Snowman and Manta. We found that our classifier performs best for the Snowman caller. Understanding how well our SVM performs for each combination of training and testing callers will tell us which callers our tool can support. Future applications include distinguishing germline and somatic structural variants in commonly used cancer cell lines and in panel sequencing data (a common method to clinically assess patients), which often do not include matched normal data. These results will help us better understand the role of structural variants in cancer and therapeutic response.
My experience at the Broad has opened my eyes not only to current cutting-edge research but also to the scope of what science will be in the future. The Broad environment fosters creativity, independence, and community. Not only did I feel supported by my mentors and other scientists, but I was also able to connect with my peers, who are fueled by the same passions as I am. This opportunity has made me more enthusiastic about my career in biomedical science and the impact young scientists such as myself have on creating a more diverse and inclusive research community.
Project: Evaluating the efficiency of a germline structural variant classifier
Mentors: Simona Dalin and Sean Misek, Beroukhim Lab