Mapping the effects of genetic variation, one letter at a time

A global project seeks to generate and analyze all possible variants of hundreds of protein-coding genes and learn how they might influence health and disease.

JT Neal, founding member of the AVE Alliance executive committee
Credit: Allison Dougherty, Broad Communications

In the two decades since the human genome was first sequenced, scientists have learned much about how the genome is organized and how widely it varies between people around the globe. A better view of how that genetic variation influences traits or disease risk could lead to new ways of diagnosing and treating disease, but getting there will take a lot of data on how those DNA differences alter the workings of proteins and cells.

Through a project known as the Atlas of Variant Effects (AVE), launched in late 2020, researchers around the world plan to systematically study the impact of variation across the genome using large-scale methods. They aim to generate a high-resolution map of all possible gene variants and their effects on protein function and physiology. The map could help improve the diagnosis and treatment of human disease, especially through precision medicine approaches, and support studies of genes, gene products, and their regulation.

To learn more about AVE, we spoke with JT Neal, a senior group leader at the Broad Institute of MIT and Harvard who leads the Variant-to-Function project in the Broad’s Cancer Program. His lab at the Broad focuses on developing technology to study genetic variation, including single-cell technologies and imaging. Neal is a founding member of the executive committee of the AVE Alliance, whose members include nearly 200 scientists and clinicians from more than 10 countries. Committee members are working together to launch the project and guide its collaborators toward their ambitious goals.

What is the Atlas of Variant Effects Alliance?

The AVE Alliance is an international community of researchers, analysts, clinicians, patients, and funders who want to learn more about how genetic diversity determines who gets sick and who stays healthy. We also want to learn how it does so under certain conditions, such as exposure to pathogens or in the development of cancer. The AVE Alliance will help like-minded labs come together, engage with stakeholders and funders, and decide which genes are the highest priority to go after.

The alliance is bringing together a community of genomics researchers who use powerful new techniques, known as saturation mutagenesis or deep mutational scanning, to create every possible amino acid change at each position in a protein. The researchers will then use both laboratory assays and computational modeling to observe the effects of those variants on protein function. We’ll integrate this data into an atlas describing the landscape of these variant effects, which will be shared freely with the research community.

We want to do all this with carefully vetted data. One of the alliance’s main goals is to standardize methods for performing experiments and analyzing and sharing data, so that we can all benefit from work done by community members around the globe.
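To make the scale of saturation mutagenesis concrete, here is a minimal Python sketch that enumerates every single amino acid substitution of a short sequence. It only illustrates the combinatorics (19 alternatives at each position); the function name `single_substitutions` and the three-residue example sequence are assumptions made for illustration, not part of any AVE tool or laboratory protocol.

```python
# The 20 standard amino acids, as one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_substitutions(protein):
    """Yield (position, wild_type, mutant, variant_sequence) for every
    single amino acid change. Positions are 1-based.

    Hypothetical helper for illustration only.
    """
    for i, wild_type in enumerate(protein):
        for mutant in AMINO_ACIDS:
            if mutant != wild_type:
                variant = protein[:i] + mutant + protein[i + 1:]
                yield (i + 1, wild_type, mutant, variant)


# Toy example: a made-up three-residue sequence.
variants = list(single_substitutions("MKT"))
print(len(variants))  # 3 positions x 19 alternatives = 57
```

A protein of a few hundred residues therefore yields thousands of single-substitution variants, which is why scalable assays matter so much.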

Is the AVE Alliance focused on studying any specific diseases?

The alliance is disease-agnostic. While we’d love to study all variants in both the protein-coding and non-protein-coding portions of the genome, AVE was initiated by scientists interested in saturation mutagenesis techniques for generating protein-coding variants. For this reason, we will first focus on diseases caused by mutations in the regions of the genome that encode proteins, especially cancer and rare Mendelian diseases that are caused by a single gene defect with large effects on physiology.

What are the goals of AVE?

Our ultimate, most ambitious goal is a nucleotide-resolution map of the human genome, with information on the impact of all possible changes at every letter of DNA. On the way to that goal, we hope to produce a nucleotide-resolution map that includes all protein-coding genes.

We hope to have the maximum impact on human health as quickly as possible, so in the shorter term, say in the next five to ten years, we’ll restrict ourselves to genes already associated with disease.

One of our first tasks is producing and publishing a white paper (currently available as a pre-print manuscript) describing a set of recommendations for alliance researchers. We’ll choose 100 or so priority disease-related genes to be tackled first, and identify ways of benchmarking functional assays developed by different groups. Our hope is that technologies continue to improve so that individual labs can conduct these experiments and contribute to the atlas, not just large institutes.

And we’ll continue expanding our community, bringing in clinical partners such as physicians, healthcare providers, and industry partners. As we build the atlas, develop assays, and accumulate data, we want to have end users who can trust and benefit from what we’ve built.

What are some challenges the AVE Alliance will face?

Generating a library of all possible variants of a protein used to be the really hard part, but we’ve seen huge advances in techniques such as MITE-seq, developed at the Broad Technology Labs, that now enable labs to build these libraries themselves, or even to buy them directly from a company. Now, the hard part is developing assays to read out the function of proteins encoded by those libraries.

Assays that measure whether a cell dies or grows more quickly or slowly are easy to run, scalable, and cheap. But we need lab tests for protein function that are more relevant to human physiology and disease.

Take gout as an example. We know there are genetic variants that can affect purine metabolism, which can increase the risk of developing the disease, and one could imagine building a functional lab assay to find these variants. But it’s less clear how one would design an assay to link these variants to a disease symptom, such as swelling of the joint of your big toe. Connecting genetic variants to organism-level functions is one of the biggest challenges we face. Through the alliance, we’ll tackle questions like this and rely on our collective expertise to come up with scalable solutions.

Once the atlas is built, how can others use it?

A researcher would be able to pull up a gene, coding region, or protein sequence that they’re interested in and see, at every site in this sequence, what the functional consequence of different alterations would be.
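As a rough picture of what such a lookup might look like, here is a small Python sketch. Everything in it is hypothetical: the gene name `GENE1`, the layout of the `ATLAS` dictionary, the function `effects_at_site`, and the scores themselves are invented placeholders, not the atlas's actual format or data.

```python
# Hypothetical atlas: (gene, position, mutant amino acid) -> functional score.
# All scores are invented placeholders (here 1.0 means wild-type-like
# function and 0.0 means complete loss of function).
ATLAS = {
    ("GENE1", 12, "D"): 0.95,
    ("GENE1", 12, "V"): 0.10,
    ("GENE1", 13, "D"): 0.40,
}


def effects_at_site(atlas, gene, position):
    """Return {mutant amino acid: score} for one site of one gene.

    Hypothetical helper; returns an empty dict if nothing is recorded.
    """
    return {
        mutant: score
        for (g, pos, mutant), score in atlas.items()
        if g == gene and pos == position
    }


print(effects_at_site(ATLAS, "GENE1", 12))
```

A real resource would cover every site and every alteration and would carry assay metadata alongside each score; the sketch only shows the "pull up a gene and inspect one site" idea.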

The atlas may be useful in future precision medicine applications. It could help us identify a cancer drug that would help cure a patient’s tumor, because we’ll know how the patient’s particular mutations help cancer cells grow and spread. The atlas can help clinicians decide which drug to give to each patient, or which drug to avoid to prevent unnecessary side effects or wasted time.

If we can generate an atlas of incredibly high quality, we may be able to use it to help predict a patient’s risk for disease based on their genetic alterations, with potential applications in genetic and prenatal counseling. However, the burden of proof for preventative intervention is often very high.

What goals has the AVE Alliance achieved so far, and where is it headed next?

Since our kick-off meeting last summer, the AVE Alliance has assembled a set of parallel workstreams that will be responsible for realizing our goals in key areas by setting standards, providing tools, and disseminating information.

We have created the Outreach, Diversity and Inclusion Committee to facilitate partnerships between the alliance and stakeholder communities. We co-hosted the annual Mutational Scanning Symposium, which was virtual this year, and which will (hopefully) be held in person in Toronto next year. We have also engaged in community-building activities by setting up a series of alliance Slack channels and launching an AVE journal club for trainees to share their work.

Moving forward, we hope to continue growing our community, to expand our engagement with other large genomics communities such as the Global Alliance for Genomics and Health (GA4GH) and the International Common Disease Alliance (ICDA), and to identify new funding opportunities to support AVE-aligned projects. Our white paper is currently in revision, and will hopefully be published early next year.