Hayden Metsky helps keep viruses under surveillance

In a #WhyIScience Q&A, an MIT graduate student explains how he’s bringing computer science to bear on global health

Hayden Metsky

As the human population grows, and technological advances increase our mobility and put us in closer contact, the public health community’s concern about the risk of viral outbreaks and pandemics mounts. Understanding how microbes, such as viruses and bacteria, operate and evolve is a pressing need, and improving methods for preventing, diagnosing, tracking, and treating infectious disease is a global priority.

Enter researchers like Hayden Metsky, a graduate student in the lab of Broad Institute member Pardis Sabeti. He develops and applies computational tools to help analyze and track microbial genomes in order to better understand and fight infectious disease. In 2016, he helped lead, with colleagues from the Sabeti Lab and partners from across the Americas, an effort to sequence Zika virus genomes when an epidemic of the then largely unstudied disease spurred a multinational response. He has since built on lessons learned from that work and is currently developing new methods to address challenges in microbial sequencing and surveillance.

In a #WhyIScience Q&A, Metsky describes the path that led him to computational biology, to the Sabeti Lab, and to his current line of research:

Q: What do you do in the Sabeti Lab?
I work on developing and applying computational tools to solve problems in infectious disease genomics. For example, I worked alongside incredible scientists from the Sabeti Lab and throughout the Americas to sequence 110 genomes of Zika virus from 10 countries to reconstruct the evolution and spread of the virus during the 2015–2016 epidemic. What we found was that Zika was circulating undetected in multiple regions for months before cases were confirmed. But our findings also made completely clear to us that pathogens are often extremely difficult to surveil for, let alone sequence, and we need better tools and systems for this. To address this problem, I developed CATCH, a method to improve detection and sequencing of diverse microbial species. Together with Katie Siddle, a postdoc in the lab, we showed in a recent preprint that CATCH allowed us to extract much more information on viral genomes in patient samples than we could otherwise recover. Katie and I think that, going forward, this will be an important tool for effective diagnostics and surveillance.

Q: How did you first become interested in computational biology?
I became interested in computer science in middle school thanks in large part to encouragement from my uncle. At the same time, I was always interested in medicine. For an eighth grade science fair project I wrote a computer program called “Computer Doctor” that diagnosed a person’s illness from symptoms they entered, and then I tested it using real data from a pediatrician. Although it was simplistic, it somehow worked on that data.

Later, as an undergraduate, I started to understand all of the applications that computer science could have to biological questions. I started research in the field, first in transcriptomics and then in epigenomics. I enjoyed it a lot, but ended up wanting something that felt closer to medicine and public health. That led me to infectious disease. There are so many amazing opportunities to develop new computational approaches and apply existing ones in new ways that can have a major impact in this field. In some ways I’ve come full circle to Computer Doctor, but 13 years ago I had no idea about, and could never have imagined, the kinds of tools we’d have now and the work I’d be doing.

Q: What trend in your field is most intriguing to you right now?
Researchers are using genomics to enable diagnostics and surveillance of infectious disease in a comprehensive way, which is extremely exciting. For example, there’s increasing interest in applying metagenomics in clinical settings, and in making technological advances to enable that. The idea is to sequence all the genetic material directly from a patient sample to identify pathogens that are causing infection, while also mining more detailed information like evidence for antimicrobial resistance. Metagenomic sequencing provides this information without having to know beforehand what is in the sample, and also supports discovering previously unknown pathogens. Routine use of it in clinical applications can be transformative, but there are roadblocks to this, including high cost, lack of portability, and low sensitivity in detecting microbes among everything else that gets sequenced along with them. What’s most exciting to me is that the scientific community is developing new technologies, like CATCH, to help overcome these challenges, and there is a lot of interesting work in the Sabeti Lab and elsewhere along these lines.

Members of the Sabeti Lab’s Zika team at work, including (l-r) Steve Schaffner, Bronwyn MacInnis, Shirlee Wohl, and Hayden Metsky.

Q: What are the biggest challenges in your research field right now?
There are two pressing challenges on my mind lately. One is that many microbes we might be interested in detecting and studying are present at low titer, meaning there’s little of their genetic material in samples. Another is that microbes, especially viruses, have an incredible amount of diversity, and our knowledge of that diversity is growing but incomplete: a pathogen infecting someone might have a genome that we’re not actively looking for, or it might even be different from anything we’ve sequenced. Together, these challenges suggest that we’re often not only trying to find a needle in a haystack, but we may not always know what the needle looks like or whether there even is a needle at all.

In a separate, cultural realm, the field needs more willingness and better systems for making methods, tools, data, and findings freely available—and doing so quickly, especially when they involve public health. I’m far from the first person to mention this. I still see instances where we need to do a better job, but overall it’s moving in the right direction.

Q: What can computer science and machine learning bring to biomedical research that would otherwise be lacking?
Biomedical research and healthcare are seeing an explosion of data. It’s not just human genomes, but single cell profiles, microbiomes, continuously measured phenotypes, and more. This opens incredible opportunities for incorporating computer science. The field provides techniques, such as algorithms and systems, for working with and making sense of this data, which is often vast and noisy. Beyond data analytics, computer science offers rigorous ways of looking at problems. Thinking with a computational mindset can often help to better formulate the kinds of questions we can ask in biology and healthcare, determine what data we would need, and develop new approaches to address those questions. The synergy between all these fields is transforming how we look at problems in biology, medicine, and public health, and opens many doors for what we can do.

Q: What is the best advice you have for mentees or people new to your field?
It’s tremendously important to chase the problems that excite and motivate you, especially ones where you think your background or academic interests can help you make an impact. It then becomes natural and fun to come up with, test, and play around with your own ideas. It’s also critical to talk constantly with your peers about your ideas and research developments. I’ve become a much better researcher largely because of all my interactions with peers in the Sabeti Lab, as well as those in past labs.