Massive international study uncovers genes involved in heart disease

Scientists link dozens of new genome sites to coronary artery disease risk and pioneer a powerful method for illuminating the biological roots of common disease.

Zayna Sheikh
Credit: Zayna Sheikh

Over the past 15 years, more than 200 sites in the human genome have been linked to risk for coronary artery disease (CAD), the leading cause of death worldwide. Still, researchers don’t fully understand how those genomic variations alter the function of proteins, cells, or tissues to cause illness — knowledge that could inform the development of new treatments.

In a massive study, scientists with the international CARDIoGRAMplusC4D consortium compiled and analyzed DNA data from more than 1 million people, including over 200,000 with coronary artery disease. The researchers discovered 68 new genome regions, or loci, associated with increased risk for CAD, bringing the total to more than 250. They also developed a sweeping approach that incorporates eight diverse lines of evidence and used it to systematically pinpoint 220 candidate causal genes that underlie the associated loci. They verified the role of one of these putative causal genes through genome-editing and cell-based experiments, showing the power of their method to reveal how specific genes might be involved in development of CAD.

The work, published in Nature Genetics, provides a more complete picture of the genetic roots of CAD, outlines a list of genes and genetic variants for future study, and demonstrates an analytical framework for identifying causal genes that could strengthen genome-wide association studies (GWAS) of other diseases.

The researchers have shared their findings openly in the Cardiovascular Disease Knowledge Portal, developed by scientists at the Broad.

“This collaborative effort represents a substantial advance in the field of coronary artery disease genetics,” said Krishna Aragam, co-first author of the study, a scientist in the Cardiovascular Disease Initiative at the Broad Institute of MIT and Harvard, and a cardiologist at Massachusetts General Hospital. “We hope that our approach encourages groups engaged in GWAS of other traits and diseases to systematically interrogate genetic loci with several orthogonal lines of evidence, and to make resources widely available for others to query. Such studies don’t end with the publication of gene lists – rather, they pave the way for new mechanistic inquiries.”

“We’ve shown that a systematic and disease-tailored approach can effectively point to the true genetic roots of disease and offer sharper insights into disease mechanism, which will be critical to translating statistical insights into biological meaning, and ultimately finding innovative treatments for dangerous illnesses like coronary artery disease,” said Adam Butterworth, co-senior author of the study and a professor of molecular epidemiology at the University of Cambridge.

Causal clues

With the advent of large biobanks and cohorts over the past few years, the research community has been able to mine ever-larger datasets for genetic associations with disease. In the current study, the researchers wanted to expand the search for genetic links to heart disease and show that their approach could uncover the functional implications of disease-related loci. “The current era of discovery genetics is not just about discovery, but also about asking what links each discovered genetic locus to the disease in question,” said Aragam.

In the new study, consortium scientists gathered genetic and medical data on more than 1 million people of predominantly European ancestry from UK Biobank, the CARDIoGRAMplusC4D Consortium, prospective cohorts, hospital biobanks and clinical trials, including nearly 200,000 people with coronary artery disease. They performed a GWAS meta-analysis of the whole dataset, scanning DNA sites across each person’s genome to identify genetic variants that are more likely to be found in those with the disease. They found 241 sites in the genome that were associated with CAD risk, including 30 that had never been linked to the disease.

Most of the new genome sites were linked to very small changes in CAD risk, suggesting that there are few, if any, common genetic variants with large effects on CAD risk left to be found by studying populations primarily of European ancestry.

To increase their discovery power, the researchers combined their large dataset with data from tens of thousands of individuals of East Asian ancestry from Biobank Japan, including 29,000 with CAD. The combined analysis revealed an additional 38 genome sites linked to CAD risk. “Future GWAS that are more inclusive of ancestrally diverse populations are likely to yield more insights than those that are limited to European ancestry participants,” said Butterworth.

The team wanted to go further: not just find these GWAS “hits,” but also link them to the nearby genes that cause CAD when they are disrupted. A variety of methods exist for figuring out which gene near a GWAS hit is likely to have a causal role in disease, so the researchers pioneered a systematic approach that incorporates evidence from eight of these methods. Some of the methods look for the closest or potentially most disruptive variants, while others look for genes known to be altered in people with the disease. In addition, the study is one of the first to use an approach developed in the lab of Broad researcher Hilary Finucane called the Polygenic Priority Score (PoPS), which ranks genes near a GWAS hit based on their involvement in biological pathways at play in CAD.

The researchers applied their framework to all 279 genome sites associated with CAD to systematically look for causal genes in a consistent way. Those that were prioritized by three or more of the eight measures were deemed highly likely to be the causal genes underlying the GWAS hits. The team verified one of these causal genes, MYO9B, using genome-editing and cell-based experiments, finding that it appears to mediate risk for CAD by regulating vascular cell motility.
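The "three or more of the eight measures" rule can be pictured as a simple vote count. The Python sketch below is illustrative only and is not the consortium's code: the predictor names, the placeholder gene GENE_B, and the evidence shown for this example locus are all invented, and it assumes each measure simply nominates a set of genes at a locus.

```python
from collections import Counter


def prioritize_genes(evidence, min_support=3):
    """Return genes nominated by at least `min_support` predictors.

    `evidence` maps a predictor name to the set of genes that predictor
    nominates at one locus. The result maps each retained gene to the
    number of predictors that supported it.
    """
    votes = Counter()
    for genes in evidence.values():
        votes.update(set(genes))  # each predictor votes at most once per gene
    return {gene: n for gene, n in votes.items() if n >= min_support}


# Hypothetical evidence for one locus: eight predictors, invented nominations.
example_locus = {
    "nearest_gene": {"MYO9B"},
    "pops": {"MYO9B"},
    "coding_variant": {"GENE_B"},
    "expression": {"MYO9B", "GENE_B"},
    "predictor_5": set(),
    "predictor_6": set(),
    "predictor_7": set(),
    "predictor_8": set(),
}

print(prioritize_genes(example_locus))
```

In this made-up example, MYO9B is nominated by three predictors and passes the threshold, while GENE_B, with two nominations, does not.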

Predictive power

To explore the potential clinical use of their findings, the researchers generated a new polygenic risk score that incorporates more than 2 million variants in the genome and predicts the risk of both incident and recurrent CAD. The score was based on data from roughly three times as many individuals as the pre-existing risk score for CAD. While the team’s score better predicted an individual’s risk for new and recurrent CAD, the improvement was surprisingly modest given the large increase in GWAS sample size. This suggests that more ancestral diversity and advances in polygenic scoring methods may be more likely to lead to substantive improvements in polygenic risk score performance than can be achieved through increasingly large, single-ancestry GWAS.
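For readers unfamiliar with polygenic risk scores, the usual form is a weighted sum: for each variant, the number of risk alleles a person carries is multiplied by a per-allele weight, and the products are added up. The Python sketch below shows only that arithmetic. The variant IDs, weights, and genotype are invented for illustration, and the study's own score, which spans more than 2 million variants, may have been built and calibrated differently.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of risk-allele counts.

    `dosages` maps a variant ID to the number of risk alleles carried
    (0, 1, or 2). `weights` maps a variant ID to its per-allele weight.
    Variants without a weight are ignored.
    """
    return sum(weights[v] * d for v, d in dosages.items() if v in weights)


# Hypothetical per-allele weights and one person's genotype (all invented).
weights = {"variant_a": 0.10, "variant_b": -0.05, "variant_c": 0.02}
person = {"variant_a": 2, "variant_b": 1, "variant_c": 0, "variant_x": 2}

score = polygenic_score(person, weights)
print(round(score, 4))
```

A raw score like this only becomes meaningful when compared against the distribution of scores in a reference population.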

The team hopes that other researchers will use their findings to further explore the functional impacts of the likely causal genes.

“The current study demonstrates the importance of the ‘variant-to-function’ approach to improve our understanding of disease biology,” said Aragam. “We hope that our results will lead others to decipher novel disease mechanisms so that we can find new ways to treat CAD, a condition that continues to affect so many around the world.”

The work involved nearly a hundred researchers from more than 20 countries, with key roles played by Tao Jiang, Anuj Goel, Stavroula Kanoni, Brooke Wolford, Deepak Atri, Rajat M Gupta, Jeanette Erdmann, Nilesh J Samani, Heribert Schunkert, Hugh Watkins, Cristen J Willer, Panos Deloukas, and Sekar Kathiresan. Collaborating institutions include the University of Oxford, Queen Mary University of London, the University of Michigan, the University of Leicester, the University of Munich, and the University of Lübeck.

The work was funded in part by the National Heart, Lung, and Blood Institute, National Institutes of Health, U.S. Department of Health and Human Services, National Human Genome Research Institute, American Heart Association, the UK Medical Research Council, Health Data Research UK, the British Heart Foundation, the UK National Institute for Health and Care Research, and Deutsches Zentrum für Herz-Kreislauf-Forschung (DZHK).