In depth: Polygenic risk scoring
Tests based on millions of common DNA variants reveal hidden disease risk; researchers work to bring them to the clinic
Cardiologist Amit V. Khera knows that the tried-and-true methods he uses to assess his patients’ health don’t tell the whole story. Someone who smokes, is diabetic, and has high blood pressure and high cholesterol is clearly at high risk for a heart attack and may leave Khera’s office with a prescription for a cholesterol-lowering statin drug and advice to start exercising. On the other hand, he may tell a patient with impeccable blood work who runs marathons to “keep up the good work.”
But it’s these “low-risk” patients who have Khera worried. A small number of them, he knows, will have a life-threatening cardiac event despite their stellar clinical charts. And he doesn’t know which ones they are.
“Many young people who have heart attacks would not have been predicted to be at high risk using traditional clinical factors,” explained Khera, a physician-scientist who studies the genetic roots of heart disease in the lab of Sekar Kathiresan at the Broad Institute and Massachusetts General Hospital (MGH). For decades, doctors have been telling these patients that it must be something “in their genes” that led to this heart attack. Khera has confirmed that this is often the case by measuring unseen risk factors in their DNA that, if tested, would have put them in the “high-risk” category.
Khera, Kathiresan, and other scientists are taking advantage of large data sets mapping the common genetic variants that are spread throughout the genome, using them as clues for predicting disease. Recent advances have made it possible to scan a person’s DNA for these variants and calculate a previously hidden source of disease risk, resulting in what they term a “polygenic risk score.”
As researchers develop this method for clinical use, physicians could soon use a simple DNA test to see if their patients harbor a hidden genetic risk for common illnesses like heart disease, type 2 diabetes, or breast cancer, and to intervene before disease sets in. A high polygenic risk score doesn’t guarantee a patient will develop the disease, but it signals that they are much more likely to become ill and may benefit from preventive measures.
Several hurdles remain, however, before polygenic risk screening makes it to the clinic. Scientists at the Broad Institute and elsewhere are working to hone risk score algorithms, expand the databases that feed the algorithms, and plan how to best integrate these scores into clinical care.
A health forecast, built on more than a decade of data
To date, most genetic testing focuses on single, rare mutations that can lead to conditions like Huntington’s disease or certain breast cancers. But now, scientists can capture in a single measurement how millions of sites across the genome can impact one patient’s health.
The data enabling this feat come from experiments known as genome-wide association studies, or GWAS: large-scale genetic analyses aimed at uncovering which common DNA differences among people influence different diseases and physical traits. In the early days of these studies, many scientists thought GWAS data wouldn’t be clinically useful for gauging risk, since each of these “hits” individually raises the threat of disease only slightly.
“Ten years ago, people said the biggest value of GWAS findings would be in revealing new biology, and less in disease prediction,” said Kathiresan, who is also co-director of the Broad’s Medical and Population Genetics Program, director of the Center for Genomic Medicine at MGH, and professor of medicine at Harvard Medical School. Still, he envisioned a future in which polygenic risk scores could help him stratify his cardiology patients, identifying those at very high or low risk of heart attack based on unseen, inherited factors.
“Each single variant has a modest effect, but we wondered if we could do better at gauging risk by summing them up,” said Kathiresan.
In 2008, Kathiresan and his team created a polygenic risk score for cardiovascular disease based on nine DNA variants. In 2010, they built a score based on 12 variants, followed by one based on 50 variants in 2015.
Around that time, genome-wide studies were generating larger and more rigorous datasets. Along with new computational advances and global efforts to create massive population cohorts, the rich troves of GWAS data at hand inspired an ambitious question: Would it be possible to create an even better predictive score by including all the millions of variants tested in a GWAS, even those that were well below the statistical threshold to be called risk factors?
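The idea of "summing up" many modest effects can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' actual algorithm: it assumes each variant contributes a weight multiplied by the number of risk alleles a person carries (0, 1, or 2), and the function name, weights, and genotypes below are all made up for the example.

```python
import math


def polygenic_score(allele_counts, effect_weights):
    """Return the weighted sum of risk-allele counts across variants.

    allele_counts: copies of the risk allele at each variant (0, 1, or 2).
    effect_weights: hypothetical per-variant weights, e.g. derived from GWAS.
    """
    if len(allele_counts) != len(effect_weights):
        raise ValueError("need exactly one weight per variant")
    for count in allele_counts:
        if count not in (0, 1, 2):
            raise ValueError("allele counts must be 0, 1, or 2")
    # fsum keeps rounding error small when adding many tiny contributions.
    return math.fsum(w * c for w, c in zip(effect_weights, allele_counts))


# Made-up weights for three variants; real scores use thousands to millions.
weights = [0.10, 0.05, -0.02]
person_a = [2, 1, 0]  # carries more of the risk-raising alleles
person_b = [0, 0, 2]  # carries only the risk-lowering allele

score_a = polygenic_score(person_a, weights)
score_b = polygenic_score(person_b, weights)
print(score_a, score_b)
```

No single term dominates the total, which is why adding more variants, even weak ones, can in principle sharpen the overall estimate.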
Risk scores go large-scale
This month, Kathiresan and Khera answered that question with a resounding “yes.” Describing their work in Nature Genetics, the two of them, along with Broad computational biologist Mark Chaffin, developed a so-called “genome-wide polygenic risk score” (GPS) for coronary heart disease based on 6.6 million variants across the human genome. They found a spectrum of risk across the population, with most people at average risk, some at very low risk, and some at very high risk. The eight percent of people with the highest risk scores were at a more than three-fold increased risk for heart disease, a hazardous level similar to that conferred by a rare, single-gene mutation that is found in far fewer people and warrants aggressive treatment to decrease cholesterol levels.
The researchers also applied the score to four other common diseases — breast cancer, inflammatory bowel disease, type 2 diabetes, and atrial fibrillation — and noted remarkably consistent results. They identified between 1.5% and 6.1% of the population that was at more than three-fold increased risk for these four diseases. In the case of coronary heart disease, there are 20 times more people with high polygenic risk hiding in their genomes than those carrying a dangerous single-gene mutation. Unlike many of the rare mutations, polygenic risk for heart disease can be high without elevated cholesterol levels, making it an especially insidious threat.
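Picking out the people at the top of the risk spectrum amounts to ranking one person's score against a reference group. The Python sketch below shows one simple way that could be done. It is an assumption-laden illustration rather than the published method: the function names and the toy reference scores are invented, and the default 8 percent cutoff is borrowed from the coronary heart disease figure cited above.

```python
from bisect import bisect_left


def percentile_rank(score, reference_scores):
    """Percent of the reference cohort scoring strictly below `score`."""
    if not reference_scores:
        raise ValueError("reference cohort must not be empty")
    ordered = sorted(reference_scores)
    return 100.0 * bisect_left(ordered, score) / len(ordered)


def in_top_fraction(score, reference_scores, top_fraction=0.08):
    """True if `score` falls in the top `top_fraction` of the cohort."""
    cutoff = 100.0 * (1.0 - top_fraction)
    return percentile_rank(score, reference_scores) >= cutoff


# Toy reference cohort: 100 evenly spaced, made-up scores.
reference = [float(i) for i in range(100)]

print(percentile_rank(95.0, reference))
print(in_top_fraction(95.0, reference))
print(in_top_fraction(50.0, reference))
```

A real cutoff would be chosen per disease and validated against outcomes; the ranking step itself stays this simple.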
“We always hoped that polygenic risk scores could one day be useful,” said Khera. “Now we’re getting to the point of actually being able to stratify patients by risk in a clinically meaningful way.”
Kathiresan anticipates that polygenic risk scoring for heart attack will be available for the clinic in about a year, and as common in medical care as LDL cholesterol testing in less than a decade. Based on recent findings, he also anticipates that predictive scores for obesity or breast cancer could reach the clinic within ten years.
“The majority of women who have breast cancer at a young age don’t have a mutation in either the BRCA1 or BRCA2 genes,” he said. “Women with a high polygenic risk for breast cancer might benefit from more frequent mammograms starting at an earlier age,” he added, although more work would be needed to prove the efficacy of those interventions in catching early cases of breast cancer, and then to develop a clinical test to do so.
According to Kathiresan, scientists will need to continue improving the scoring algorithms as clinicians determine the best ways to package the data for both doctors and patients. Meanwhile, as with any new promising technology, the researchers fear that fly-by-night companies might peddle unproven and inaccurate risk-scoring services — and costly, ineffective treatments — to people eager to gaze into their genetic crystal balls. For this reason, said Kathiresan, scientifically rigorous scoring methods, along with thoughtful approaches to integrating them into medical care, will be critical to the safe and effective use of predicting risk to improve patients’ lives.*
In search of risk scores that work for all
Another significant hurdle is a problem faced throughout the field of genomics: lack of diversity. As of 2009, 96% of GWAS participants were of European descent; that number fell to 81% in 2016, mostly due to the inclusion of participants of East Asian descent, but there is still a lack of African and Hispanic representation.
An individual’s polygenic risk prediction is most accurate when calculated from GWAS datasets of similar ethnic background, so at present the scores work best for people of European ancestry. In the case of heart attack, the scores may identify some high-risk people of East Asian, South Asian, Hispanic, or African-American ancestries, but risk scores are less clinically useful for those populations as the scores are far less accurate due to lack of diversity in GWAS cohorts.
According to postdoctoral researcher Alicia Martin, “Lack of diversity is the primary scientific and ethical limitation we currently face in translating these polygenic risk scores into the clinical space.”
Because of this, she and her colleagues in the lab of Broad Institute member Mark Daly, also co-director of the Broad’s Medical and Population Genetics Program, are working to improve the situation by conducting GWAS with more globally diverse groups. According to her, this will lead to more accurate polygenic risk prediction for all populations.
Martin is also helping to lead a collaborative working group with members of Daly’s lab and the lab of Broad Institute member Ben Neale, also an associate professor at Harvard Medical School and MGH and director of population genetics for the Broad’s Stanley Center for Psychiatric Research. The group aims to make statistical methods for genetic risk prediction more generalizable across diverse populations.
“People of European background already have better health care and outcomes overall than other groups,” she said. “There should be genuine efforts to be equitable in genomic research and use it to improve health for all.”
Polygenic risk sheds light on the nature of illness
Polygenic risk analysis can also help researchers better understand the basic mechanisms of certain diseases. Institute member Jose Florez, who is an endocrinologist at MGH, professor at Harvard Medical School, and co-director of the Broad’s Metabolism Program, leads the Broad’s efforts to study polygenic risk in type 2 diabetes. Recently, scientists in the Florez lab found that examining common variants through polygenic risk analysis revealed subgroups of type 2 diabetes that were informed by physiologic variables such as insulin levels, but driven by genetic differences. Classifying patients based on DNA variants, which are present from birth and don’t change with developmental stage or disease severity, could offer a more precise and individualized way to manage the disease.
“Our vision is to use this information to stratify preventive or therapeutic interventions in a more rational manner, so they can one day be deployed to the group most likely to benefit,” he said. For now, type 2 diabetes can be predicted with standard clinical markers alone, although polygenic risk scores could be particularly useful in certain subpopulations.
Even though genetic risk scores aren’t yet strong enough to be clinically useful for psychiatric diseases, researchers can use them to better understand certain mental disorders.
“In our work, we also use risk scores as instruments to better understand how different types of genetic risk for conditions like autism are working together or creating risk in different ways,” said Elise Robinson, an associated scientist at Broad and assistant professor of epidemiology at the Harvard T.H. Chan School of Public Health. Last year, Robinson and colleagues discovered how rare mutations can exacerbate an individual’s polygenic risk for autism.
Polygenic risk scores can also help reduce stigma surrounding mental illness. Unlike rare monogenic mutations, which are either present or absent, risk scores are distributed like a bell curve in the population, so everyone ends up somewhere along the spectrum of risk. Every individual, it turns out, has some degree of polygenic risk for many common disorders, including psychiatric conditions, based on their inherited DNA.
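Why a bell curve? When a score adds up many small, roughly independent contributions, the totals tend to cluster around an average and thin out toward both extremes. The short Python simulation below illustrates that tendency under simplifying assumptions: the variant frequencies, the weights, and the function name are all invented, and real genomes have correlations between variants that this toy model ignores.

```python
import random
import statistics


def simulate_scores(n_people=1000, n_variants=200, seed=0):
    """Simulate additive scores for a made-up population."""
    rng = random.Random(seed)
    # Hypothetical risk-allele frequencies and small per-variant weights.
    freqs = [rng.uniform(0.05, 0.5) for _ in range(n_variants)]
    weights = [rng.uniform(-0.05, 0.05) for _ in range(n_variants)]
    scores = []
    for _ in range(n_people):
        score = 0.0
        for freq, weight in zip(freqs, weights):
            # Each person inherits two copies, each a risk allele with
            # probability `freq`, giving a count of 0, 1, or 2.
            count = (rng.random() < freq) + (rng.random() < freq)
            score += weight * count
        scores.append(score)
    return scores


scores = simulate_scores()
mean = statistics.mean(scores)
spread = statistics.stdev(scores)
# For a bell-shaped distribution, roughly two-thirds of values
# fall within one standard deviation of the mean.
within_one_sd = sum(abs(s - mean) <= spread for s in scores) / len(scores)
print(round(within_one_sd, 2))
```

Every simulated person lands somewhere on the curve; none has a score of "absent," which is the point the comparison with single-gene mutations is making.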
DNA is not destiny
Florez and fellow scientists in the Diabetes Prevention Program have found that lifestyle interventions aimed at improving diet and exercise habits are effective in preventing type 2 diabetes, regardless of the genetic burden a person carries. “This raises great hope that patients can overcome the genetic risk conferred by inheritance through easily adoptable preventive measures,” he said. “The public should be aware that genetic findings are not deterministic.”
Genetic information, in other words, is one more risk factor to be considered alongside others, and in some cases genetic risk can be offset by other interventions.
In their research, Khera and Kathiresan have shown that, even for those in the high-risk category, genetic risk for coronary heart disease can be overcome by interventions like lifestyle changes or medication. “High genetic risk doesn’t mean you’re doomed,” said Khera. “You actually have a lot of power to modify the risk encoded in your DNA.”
*Khera and Kathiresan are listed as co-inventors on a patent application for the use of genetic risk scores to determine risk and guide therapy.