Broad Summer Scholar wins prestigious Intel science prize

Each year, well over a thousand promising high school science students enter the Intel Science Talent Search, long considered the nation’s most prestigious science competition. When all is said and done, only three take home top honors and the accompanying $150,000 prize. This year, one of those exceptional students had close ties to the Broad: Andrew Jin, a 17-year-old senior from The Harker School in San Jose, CA, began his Intel journey while participating in the Broad Summer Scholars Program (BSSP).

Jin was placed in BSSP through its partnership with MIT’s Research Science Institute (RSI) and spent the summer in the lab of Broad Institute member Pardis Sabeti, applying machine learning to evolutionary biology in a research project while also attending classes and programs with his fellow summer scholars.

I spoke with Jin and his mentor, Joe Vitti, a graduate student in the Sabeti Lab, about Jin’s time at Broad and about the research that helped him take home the first-place Medal of Distinction from the Intel competition.

Veronica Meade-Kelly: How did you both end up involved in the Broad Summer Scholars Program?

Intel Science Talent Search winner and former Broad Summer Scholar Andrew Jin talked to the Wall Street Journal about his research.

Andrew Jin: I was always interested in both evolutionary genetics and artificial intelligence, so I indicated my interest and passion for that when I applied to the Research Science Institute program at MIT. When I was accepted into the program, RSI matched me up with the Sabeti lab at the Broad Institute. Because RSI placed me at the Broad, I also got to participate in the Broad Summer Scholars Program.

Joe Vitti: I got involved in the program after RSI and BSSP circulated emails looking for mentors. Some postdocs in my lab encouraged me to do it. They had participated and had great experiences as mentors, so I decided to give it a shot.

VMK: And was Andrew’s project something the Sabeti lab already had in the works?

JV: It was an idea that we had been discussing for a while. Our lab splits its time between evolutionary biology and infectious disease, and on the evolutionary biology side, we had floated the idea of using machine learning to identify specific genetic variants under natural selection. But until now, no one in our group had had the machine learning skill set or bandwidth to go all in on the project, so when we signed on with RSI, we mentioned our interest in someone with machine learning experience. They sent us Andrew, and his background was perfect for it.

VMK: What was the project, in a nutshell?

AJ: Like Joe said, the Sabeti lab is really interested in studying the recent evolution of humans and they wanted to try to pinpoint exactly which mutations are under natural selection – like the allele that allows you to digest milk or a variant that confers resistance to disease. They had already developed a very successful method for doing that, but that method still had its limitations, and the lab really wanted to try out a machine learning approach. Over the summer, my project was to create machine learning models – essentially training computers to analyze or look for hidden patterns in massive datasets. So, I was looking at these DNA sequences, each with three billion base pairs of DNA, from human beings living today – 179 people from Central Europe, East Asia, and West Africa – and I was looking for footprints left by natural selection. Then I trained machine learning models to do this job. Using these models, I ended up finding more than 100 really exciting mutations that may be adaptive.

VMK: And Joe said that the project required machine learning expertise that researchers at their Broad and Harvard labs didn’t possess… Andrew, how did you come by this experience as a high school student?

AJ: I have a pretty strong computer science background because I spent a lot of time in high school studying algorithms online and preparing and practicing for programming competitions. So, when I got interested in machine learning, the first thing I did was take a machine learning class online for free through Coursera. It’s a college-level course taught by Stanford professor Andrew Ng. I was able to pick up many of the basic concepts that way. Then I started doing research on my own – applying machine learning to real-life problems in biology and computational biology. A lot of the work I did was on cancer, like trying to predict whether or not a tumor will occur and how aggressive it might be from its gene expression data, or trying to predict what chemotherapy combinations might be most effective in treating patients. So I was able to apply these models to real problems, and in the process I read a lot of papers to pick up more advanced concepts that weren’t taught in the class.

VMK: When you were paired up to tackle this project last summer, what was the day-to-day process like and what were the keys to your success?

JV: The Sabeti lab splits time between the Broad and Harvard and I had to be at Harvard sometimes, so there would be some days when we didn’t check in at all. But we would touch base and he would let me know where he was and we would talk about the next steps. He would come to me with questions, but he was always incredibly prepared and had his questions organized and well thought out, so our meetings were always directed and efficient. He really made the most of the time that we had together.

AJ: I think one of the keys was setting milestones that I had to finish. For example, I knew a major part of the project was defining the features that might be predictive of mutations under natural selection, because those would be the variables the models looked at to determine whether a mutation was of interest or not. And I didn’t have much background on evolutionary genetics at the time, so I had to do a lot of research on my own and I had to rely on Joe to help me understand how the concepts worked. So I knew I had to define the features by a certain date, and had to finish training the models by a certain date, and then apply the models to empirical data by a certain date in order to finish this major part of my project over the summer. And I had to stay disciplined to make sure I adhered to the deadlines I set for myself.

VMK: And you continued the project after you left the Broad, right?

AJ: I think I got a good 60% done over the summer. I developed the whole machine learning tool, and then I applied those models to actually find new adaptive mutations in real-life DNA sequences. That was the main portion of the project and the most challenging part. Then when I got home I worked on the functional annotation – examining large, public datasets to try to figure out the meaning of these mutations that were predicted to be under selection and try to see what phenotypes and what evolutionary advantages they might confer.

JV: We continued to check in. He would give me an update and we would bounce ideas back and forth. But I feel like I can’t take too much credit for the project because Andrew really just ran with it. It was his initiative to gun it and keep going.

VMK: At what point did you decide to enter the Intel Science Talent Search?

AJ: When I started working on the project, I really just wanted to discover something that no one has ever known before, and also to develop a tool that was useful to the lab. But pretty much everyone in the summer program ends up submitting to Intel and the program organizers told us to discuss it with our mentor and the lab.

JV: I definitely remember him bringing it to my attention and I said “absolutely.” When he gave his final presentation to our group, everyone’s jaws were on the floor; the graduate student next to me commented that she had seen people defending their dissertations with fewer results than Andrew had. Just during that summer, his project exceeded our expectations, and the Intel competition seemed like a natural next step for him.

VMK: And what was the competition like?

AJ: The first round was almost like a big college application that I had to submit online. The main component was a 20-page research paper about the project that I did last summer. But I also had to write six essays – most of them were not related to my project – and submit my transcript and recommendations. From those applications, 300 of us were chosen as semi-finalists, and then those semi-finalists were further judged based on the quality of their application and research, and the top forty finalists were invited to Washington, DC, for one week earlier this month. During that part of the competition, we had a series of five interviews with judges who were leaders from a variety of disciplines – chemistry, biology, math, computer science – and these interviews had nothing to do with my project at all. They asked about stuff I may have learned in high school, or they wanted us to explain how things worked.

JV: One of the examples Andrew told me about later over the phone was they asked him why the flame of a candle has that particular shape. It’s far afield from his project but that’s the sort of thing the students are expected to think through off-the-cuff.

AJ: Yeah, they wanted us to explain the scientific concepts behind these things you’d come across in daily life … and I thought that was pretty cool of course. I guess they ask those questions to evaluate how you think as a scientist and how you respond to unexpected problems. And after that there was a poster presentation and the judges asked questions about my research. At the end of the week, there was a big gala and that’s where I found out I was one of the winners.

VMK: On behalf of all of us here at Broad: Congratulations, Andrew!


For more information on the Broad Summer Scholars Program or the MIT Research Science Institute, visit the program websites. And, if you are affiliated with the Broad and would like to learn more about hosting a summer scholar, contact Rachel Gesserman at the Broad Institute Office for Education and Outreach.