Study highlights need to increase diversity within genetic data sets
Diversifying population-level genetic data beyond Europeans will expand the power of polygenic scores.
By Namrata Sengupta
Credit: Susanna M. Hamilton, Broad Communications
Polygenic scores can predict a person’s risk for conditions like coronary artery disease, breast cancer, and type 2 diabetes (T2D) with great accuracy, even in patients who lack common warning signs. This new genome analysis tool holds promise for physicians, who may be able to intervene earlier to help prevent common diseases in at-risk individuals.
According to a new study, however, polygenic scores developed by studying Europeans do a better job at predicting disease risk for people of European ancestry than for those of other ancestries.
Researchers from the Broad Institute of MIT and Harvard and Massachusetts General Hospital (MGH) led a team that used large-scale genetic data from UK Biobank to develop prediction scores for height, body mass index, T2D, and certain other traits and diseases.
The researchers found that polygenic scores calculated from UK Biobank data had 4.5 times higher prediction accuracy for people of European ancestry than for those of African ancestry, and two times higher accuracy than for those of East Asian ancestry.
“From a clinical context, this means that current polygenic scores are significantly better in predicting the risk of common diseases for people of European ancestry than those of African ancestry,” said Alicia Martin, the lead author of the study and an affiliate of the Program in Medical and Population Genetics (MPG) and the Stanley Center for Psychiatric Research at the Broad Institute.
Martin, who is now an instructor in investigation at MGH, started this work while she was a postdoctoral researcher in the lab of Mark Daly, institute member and MPG co-director.
With advances in genome sequencing technology, studies in people of European ancestry have grown rapidly in the last few years, while the proportion of non-Europeans in these genomic studies has stagnated since 2014, the authors report. As of 2016, 80 percent of participants in genetic studies were of European descent, even though Europeans constitute only 16 percent of the world population.
UK Biobank is one of the largest publicly available genetic data sets. It contains information for half a million people, about 94 percent of whom are of European ancestry. Fewer than 10 percent are of African, South Asian, East Asian, and Hispanic or Latino ancestry.
Martin and her team also developed separate polygenic scores using data from the BioBank Japan Project, an East Asian data set, and found that scores calculated from this data set were almost 50 percent more accurate in predicting disease risk for East Asians than scores based on UK Biobank data.
“This further confirms that risk predictors are more precise if they are drawn from genetic data derived from a similar ancestry,” Martin said. “It is crucial that researchers should recruit more minority populations in future genetic studies and also make data from such studies accessible and open. Failure to do this will lead to further inequities in our healthcare system.”
In recent years, Sekar Kathiresan, an institute member and director of the Cardiovascular Disease Initiative at the Broad Institute, and his colleagues have advanced research in polygenic scoring, greatly increasing the scores' predictive power, and they are working to implement clinically meaningful risk predictors.
“Though health disparities are currently related to social determinants of health rather than genetic testing, it will be important for the biomedical community to ensure that all ethnic groups have access to genetic risk prediction of comparable quality,” said Kathiresan, who is also director of the Center for Genomic Medicine at MGH and a professor of medicine at Harvard Medical School. “This will require undertaking or expanding large genomic studies in non-European ethnic groups.”
This study was funded by the National Institutes of Health.