Broad scientists join new consortium studying the function of genetic variation
In a Q&A, the researchers share their hopes and excitement about the Impact of Genomic Variation on Function program.
Early on in the genomics era, scientists discovered that only one percent of the genome codes for proteins, which carry out the vast array of activities in cells and tissues. Over the last 15 years, scientists have discovered that the other 99 percent, once considered “junk,” actually encodes vital information for normal cellular function. In the spaces between genes lie non-coding functional elements, which regulate gene expression, acting as knobs that control which genes are turned on, when, and by how much.
Projects like the Encyclopedia of DNA Elements (ENCODE), the Human Cell Atlas, GTEx, and others have catalogued millions of non-coding elements and revealed the complex and unique regulatory landscape of human tissues. Genome-wide association studies have shown that regulatory elements are also important in human disease: the majority of genetic variants associated with specific traits or diseases lie in non-coding regions of the genome.
Yet, deciphering the genome is challenging, and direct causal relationships between genetic variants, their effects on regulatory elements, and their overall function remain elusive.
A newly launched research effort, called the Impact of Genomic Variation on Function (IGVF), aims to delve more deeply into the effect of genetic variation on genome function. Funded by the National Human Genome Research Institute (NHGRI), this consortium includes researchers from the Broad Institute of MIT and Harvard and 25 other awardees from 30 other institutions. Broad scientists will map regulatory elements in a variety of cell types and assess where and when regulatory elements are active and genes are expressed. They aim to do this at the single-cell level, allowing for a richer, more in-depth understanding of how genetic variation in a diverse set of tissues and cell types affects genome function in human health and disease. All IGVF researchers will share their data openly.
Broad’s IGVF project aims to study a wide range of human tissue types and diseases and prioritizes donors of diverse ancestries, particularly those underrepresented in genomic studies. For example, the Broad team is proposing to study blood samples from people with lupus. “Lupus is a chronic autoimmune disease that affects more women than men and is more common among women of color. Studying genetic regulation related to lupus is not only important for our understanding of the disease, but will also help contribute to the diversity of human genomic information,” said Liz Gaskell, the associate director of the Gene Regulation Observatory and Epigenomics Program, who, together with project manager Nina Farrell, is managing the Broad’s IGVF efforts.
At Broad, the IGVF project is led by Bradley Bernstein, an institute member at Broad, director of the Epigenomics Program and the Gene Regulation Observatory, and chair of the Department of Cancer Biology at Dana-Farber Cancer Institute, and Broad associate member Jason Buenrostro, an assistant professor at Harvard University in the Department of Stem Cell and Regenerative Biology. Charles Epstein, an associate director of the Broad’s Epigenomics Program, will lead the data collection effort. The consortium also includes associate member Luca Pinello, an associate professor at Massachusetts General Hospital and Harvard Medical School, and Jesse Engreitz, an assistant professor at Stanford University who launched a Broad initiative to study the function of genetic variants as a research fellow at Broad.
"I am excited about the opportunities that the IGVF consortium will provide to understand the function of variants and regulatory elements at an unprecedented scale and eager to contribute computational methods that will empower this community," said Pinello.
“Our team studies variants that affect risk for heart diseases. Large-scale human genetic studies have discovered thousands of variants that could yield new approaches to treat disease, but it’s challenging to investigate these variants one at a time,” said Engreitz. “Using high-throughput tools we developed as part of the Variant-to-Function Initiative when I was a fellow at the Broad, our IGVF Functional Characterization Center at Stanford will connect variants to the cell types, genes, and pathways they control. I’m thrilled to continue this work with the Broad community.”
We spoke with Bernstein, Buenrostro, and Epstein about IGVF, what they hope to achieve, and how IGVF can contribute to the medical and research communities.
What are the key questions you hope to answer with IGVF?
BB: The overarching goal is to interpret the human DNA sequence, to understand the baseline sequence of the human genome, and then to understand the impact of variants between individuals or in disease. To do this, we need to understand regulatory elements.
Regulatory elements are switches that control gene expression, and these switches are really intricate: they turn on in specific tissues, specialized cells, and disease contexts.
We’re going to map regulatory elements and ask when and where they come on at single-cell resolution. This will give us the most fine-grained map of their activity across different tissues, across different cell types within a tissue, or even in transient or rare cell states that we’ve yet to observe but hope to capture in this project. And from this dense and dynamic information, we will infer function. We’ll seek to understand what controls each regulatory element, what genes they’re influencing, and their significance in health and disease.
JB: We all know that every cell in our bodies is different. But because each of the countless cell types expresses different genes, each cell type is also uniquely vulnerable to a genetic mutation, which can cause disease.
What we propose to do is to use single-cell technologies to measure the gene regulatory elements of our genome so that we can overlap these elements with our understanding of human genetic variation.
For us to do this, we have to first understand what genes each cell type is using and also how they’re regulated. What we’ve learned in genetics is that most human genetic variation actually happens at these non-coding regulatory elements, which change the expression of genes. In order for us to understand how a genetic mutation will affect an immune cell versus a skin cell, we first have to understand their respective regulatory landscapes. That's been explored before, but not at the scale that we are proposing.
How will this build on what’s already known about regulatory elements?
JB: We are now enabled by new technologies that didn't exist when we were trying to do this as a community before. With the advent of single-cell tools, we can finally look at each individual cell within a tissue and understand its regulatory profile. That wasn't possible even just a few years ago. We're hoping that this new effort will give us a full body map of all the human regulatory elements within every single cell.
CE: We're going to go from working with tissues to working with the individual cells that make up a tissue. We’re mapping both chromosomal structure and gene expression simultaneously at the single-cell level in these tissues, which will let us draw conclusions about causality directly, that is, about whether a regulatory region or gene is the source of disease. Using both of these mapping strategies on single cells at the same time greatly strengthens our ability to form hypotheses about gene regulatory networks.
What are you most excited about achieving with IGVF?
JB: What we're most excited to do is enable new opportunities in genetic medicine. This is the goal of the consortium and we're so excited to be part of it. We're excited to work with others to build new statistical tools, integrate these data types, and build models that predict the function of genetic variants, and more generally to enable the community to do genetic research.
BB: What I’m excited about is the ultimate impact a dataset like this will have on how the community does science. Since all the data will be made public, scientists will have boundless information about the genome and its regulation in different cell types. For any sequence element or nucleotide, they’ll be able to look up the predicted function, and the cellular contexts in which it manifests. I'm not sure we'll fully get there in five years, but that's the long-term trajectory. I think it will see massive use by the Broad community and by basic scientists, medical geneticists, and disease researchers across the world. It's a big vision.
CE: I hope it will contribute to the development of pharmaceutical approaches to improve human health. Researchers are doing genetic analyses of diseases like diabetes, Alzheimer's disease, and cardiovascular disease. While genetics is a very powerful paradigm for studying causality in biology, it’s not a holistic method of tackling these sorts of problems. To realize the full potential of genetics and genomics for characterizing human biology, we need to keep understanding functional regulation and epigenetics. What excites and motivates me is a sense of optimism about the improvement of the health of the population through our discoveries of biological mechanisms and how those can ultimately lead to the development of new medicines.