Andrew Cherniack mines tumor data for clues

In a #WhyIScience Q&A, the longtime group leader from the Broad’s Cancer Program recounts his move from molecular to computational biology and talks about the collaborative effort that went into The Cancer Genome Atlas

Andrew Cherniack.
Credit: Photo courtesy of Broad Communications.

Andrew Cherniack comes from a family of scientists. He married a scientist. And he’s been working in scientific research labs since he was still in high school.

Over the years, Cherniack has witnessed a sea change in his “home” field of genetics, and his own work has changed with the times. He spent the first 20 years of his career doing “wet lab” work, conducting experiments on cells and tissues in a traditional research laboratory. But his work also involved looking at genome sequences, a process that demanded ever more computational power as the amount of data available for analysis grew. In time, he found himself relying more and more on math- and computer-based tools in his biological investigations.

Cherniack now works as the group leader of computational biology for the Broad Institute’s Cancer Program, where he not only applies mathematical and computational methods to cancer genomics datasets, but has also been a key player in building some of the largest such datasets generated to date.

In a #WhyIScience Q&A, he tells us about his work:

Q: What do you do at the Broad?
A:
For most of the time I’ve been here, a big part of my job has involved working on The Cancer Genome Atlas (TCGA), which was a huge effort to map all of the genetic alterations in 33 different tumor types from more than 10,000 patients. In particular, our team was responsible for looking at the copy number alterations in genes in these tumors.

One of the things that happens in cancer is that some of the tumor cell’s DNA gets amplified, while other parts get deleted. Amplified regions often contain oncogenes, genes that drive the growth of cells, while deleted regions often contain tumor suppressors, genes that would have helped keep cell growth in check. As part of TCGA, we made a map of all of the copy number changes in all of these tumors, and then we did an analysis to understand what’s driving each particular tumor.
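He doesn’t go into the methods here, but as a rough illustration of the underlying idea, here is a minimal sketch in Python, using made-up log2 copy number ratios and arbitrary thresholds rather than the actual TCGA analysis pipeline, of how one might flag amplified and deleted genes in a single tumor sample:

# Hypothetical sketch: flag copy number gains and losses in one tumor sample.
# Thresholds and data are illustrative only, not the TCGA/Broad analysis.

AMP_THRESHOLD = 0.9    # log2(tumor/normal) at or above this -> candidate amplification
DEL_THRESHOLD = -0.9   # log2(tumor/normal) at or below this -> candidate deletion

# Made-up per-gene log2 copy number ratios for a single tumor
gene_log2_ratios = {
    "ERBB2": 2.1,    # strongly amplified: possible oncogene driver
    "MYC": 1.3,
    "TP53": -1.5,    # deleted: possible loss of a tumor suppressor
    "CDKN2A": -2.0,
    "GAPDH": 0.05,   # roughly normal copy number
}

def call_copy_number(log2_ratios, amp=AMP_THRESHOLD, deletion=DEL_THRESHOLD):
    """Label each gene as amplified, deleted, or neutral based on simple thresholds."""
    calls = {}
    for gene, ratio in log2_ratios.items():
        if ratio >= amp:
            calls[gene] = "amplified"
        elif ratio <= deletion:
            calls[gene] = "deleted"
        else:
            calls[gene] = "neutral"
    return calls

if __name__ == "__main__":
    for gene, call in call_copy_number(gene_log2_ratios).items():
        print(f"{gene}: {call}")

In real driver analyses, the signal comes from regions that are recurrently amplified or deleted across many tumors, not from a single sample or a simple cutoff like this.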

The second half of my job here at the Broad involves using this huge cancer genome atlas with one of our major pharmaceutical collaborators to discover new targets for cancer drugs, particularly genes we found through TCGA that may play a previously unappreciated role in cancer.

Q: How did you get into your line of work?
A:
I’m not a computational biologist by training. I have a PhD in molecular genetics, which isn’t the usual path; until recently, computational biologists tended to come from math, physics, or the computational sciences. I started out doing wet lab work, no different from what other biologists were doing, but I did a lot of sequencing and was also involved in some of the early days of microarray research. We were looking at whole transcriptomes, genome-wide, and through that I gradually became more and more involved in computational work. When I came to the Broad nine years ago, it was the first job I’d had where all of my work was computationally based. My associates and I have no pipetting responsibilities.

Q: Is there a cultural difference between wet lab work and computational work?
A:
There definitely is. I think many computational scientists approach the work we do from a mathematical perspective, where they’re used to dealing with equations, so it’s precise and can be easy to replicate. But biology is kind of messy. People who work in a wet lab know that. You do an experiment; sometimes it works, sometimes it doesn’t. There may be many things you need to account for to get experiments to work consistently.

To do computational biology you have to understand both aspects. You have to know, yes, there’s a lot to gain by applying algorithms and equations to biological data, but you also have to understand that biological research isn’t always clean and clear-cut.

It’s great to see more people coming into computational biology with both biology and computational backgrounds. It takes one set of skills to write an algorithm and run an analysis, but it’s equally important to be able to recognize what’s meaningful in the output. Having both backgrounds lets you understand the entire picture.

Andrew Cherniack, group leader of the Broad Cancer Program’s computational biology team, with colleagues Ashton Berger, Lindsay Westlake, and Liam Spurr.

Q: What’s the latest trend in your field?
A:
Over the last decade, we spent all this time mapping out all the alterations in cancer and we’ve learned a lot. We’ve discovered new oncogenes and tumor suppressors. We discovered all sorts of new mutational mechanisms. We discovered mechanisms of oncogenesis that we’d never even thought of before. But now the question is, how can we translate that into treatments? Or, can we better understand who responds best to which treatments? That’s the major thing that we’re moving toward.

There are groups inside and outside of the Broad working on that. The Metastatic Breast Cancer (MBC) Project is one example. They’re collecting genetic and clinical data from cancer patients and trying to figure out who’s going to respond to what sort of treatment. That’s the real future right now. Can we really figure out what sort of therapy is best for a patient based on their specific genetic alterations?

Q: What’s the biggest challenge for you professionally right now?
A:
Making the data widely available and easily accessible for researchers globally while, at the same time, protecting patient privacy. There are a lot of hurdles in doing that. But again, I think the MBC Project is a good example that we can get these things done.

Q: Is that a challenge because the practice of sharing this sort of data is relatively new?
A:
Yes, exactly. This idea that you might want to pool everybody’s data—to share it and make it available to researchers outside of your own institution—was not always what people were thinking about when they first created their research protocols. It’s partly a cultural thing.

I think most people who’ve been part of one of these big, collaborative efforts have seen the benefits, but it’s important to remember that the majority of people in science haven’t done it. For most people in the country and the scientific world, that’s not how it works. People work in their own lab, they maybe have another collaborator, but they don’t share their data with everybody.

Within the Broad, we already think this way: we work in big teams and across disciplines. But we don’t want to work just within the Broad. We want the bigger team of the entire scientific community sharing data to solve problems in health and disease. Bringing this culture to other people is a challenge and one of our main goals.

Q: What do you feel has been your biggest success?
A:
I’d say it’s being part of this big international team of scientists who created TCGA and made all of the data public. And even more gratifying is that thousands of other papers, outside of our own efforts, have used TCGA data to come up with discoveries. We created a resource that the entire scientific community is using. That’s huge.