Largest genome-wide association study ever uncovers nearly all genetic variants linked to height

By analyzing data from nearly 5.4 million people, Broad researchers have identified more than 12,000 genetic variants that influence height.

Colette A. Zylstra, Broad Communications

Scientists have long known that height is mostly hereditary, but even the geneticists who set out to study height two decades ago weren’t certain they’d ever be able to find the common genetic factors influencing this trait. 

Now, by studying the DNA of more than 5 million people, a team led by those same geneticists has done what they thought years ago would be impossible. In the largest study of its kind to date, researchers from the Broad Institute of MIT and Harvard and the GIANT consortium have found more than 12,000 genetic variants that influence height. These variants explain 10 to 40 percent of all variation in height depending on a person’s ancestry, and cluster around parts of the genome involved in skeletal growth. The team said this study, because of its unprecedented size, has uncovered the vast majority of the genetic variants linked to height, and is also the capstone of a 20-year-long effort.

“We feel that this is really a milestone,” said Joel Hirschhorn, senior author on the study, an institute member at the Broad, and also a professor of genetics and Concordia Professor of Pediatrics at Boston Children’s Hospital and Harvard Medical School. “We’d been studying height for a while but had only identified a small fraction of common variant heritability. We’re now basically done mapping this heritability to specific genomic regions, and that highlights what increasing sample size can tell us about traits controlled by multiple genes.”

He added that the findings, published in Nature, could one day help physicians identify individuals who aren’t reaching their genetically predicted height and might have a hidden disease or deficiency affecting their growth and health. 

The results also illustrate the power of genome-wide association studies (GWAS) to uncover the biological basis and, in larger studies, the heritability of disease. 

“In 2010 we predicted that we ought to be able to explain 40 percent of individual differences in height, but we had no idea that it would take 5 million people and 12,000 DNA variants, and that it would be achieved so fast,” said Peter Visscher, a co-senior author on the study and a professor and chair of quantitative genetics at the University of Queensland. “GWAS, when combined with very large sample sizes, is an amazingly powerful experimental design.”

Heritable height

In 2000, as a pediatric endocrinology fellow at Boston Children’s Hospital, Hirschhorn saw many children who had been referred by their pediatricians for unusually short stature. He’d often tell parents their child was growing slowly because of the genes the child inherited. Though scientists estimated that genes contributed to 80 percent of variation in height, they didn’t know what those genes were.

“After about the 20th patient, I thought, ‘Hey, I’m working in a place that can figure this out,’” he said. As a postdoctoral researcher at Broad, Hirschhorn decided then to study height, though everyone told him height was too polygenic — there were too many genes involved to be able to find them.

Undeterred, Hirschhorn turned to GWAS — studies in which scientists scan whole genomes in a population to identify associations between genetic variants and traits. Over the next two decades, the GIANT consortium found variants associated with height in increasingly large GWAS. Now, together with 23andMe, the consortium has assembled data from nearly 5.4 million individuals — over seven times more than previous studies. Their analysis revealed 12,111 common single nucleotide polymorphisms (SNPs), or places in the genome where a single letter varies, that were associated with height. 
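To make the idea of an association concrete, here is a minimal sketch of the per-variant comparison at the heart of a GWAS. It is an illustration only, not the GIANT team's pipeline: the data are made up, the SNP names and the `snp_effect` helper are hypothetical, and a real analysis would also compute statistical significance and adjust for factors such as ancestry, age, and sex.

```python
from statistics import mean


def snp_effect(allele_counts, heights_cm):
    """Least-squares slope of height on allele count (0, 1, or 2 copies).

    The slope estimates how many centimeters of height go with each
    extra copy of the variant in this sample.
    """
    g_mean = mean(allele_counts)
    h_mean = mean(heights_cm)
    covariance = sum((g - g_mean) * (h - h_mean)
                     for g, h in zip(allele_counts, heights_cm))
    variance = sum((g - g_mean) ** 2 for g in allele_counts)
    return covariance / variance


# Toy, invented data: six people and two hypothetical SNPs.
heights_cm = [160.0, 162.0, 170.0, 172.0, 180.0, 182.0]
snp_a = [0, 0, 1, 1, 2, 2]  # allele count rises with height
snp_b = [1, 2, 0, 0, 2, 1]  # allele count unrelated to height here

effects = {
    "snp_a": snp_effect(snp_a, heights_cm),
    "snp_b": snp_effect(snp_b, heights_cm),
}
```

In this toy sample, `snp_a` shows a clear per-copy effect while `snp_b` shows none. A genome-wide study repeats this kind of test across millions of variants, which is why very large sample sizes are needed to separate small real effects from noise.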

Together, the SNPs account for 40 percent of all variation in height for individuals of European ancestry, and 10 to 20 percent of variation for people of non-European ancestry. This difference is due to the composition of the GIANT study cohort, which is mostly of European ancestry. This lack of diversity is a known and common problem in genetic studies. But in the GIANT study, more than one million participants were of East Asian, Hispanic, African, or South Asian ancestry, a number the team says is higher than in other GWAS.

The researchers say that across the different populations, so far they have found that the same regions of the genome influence height. They emphasize, however, that including more people of non-European ancestry will be critical to increasing prediction accuracy and could help identify genetic variants specific to certain groups. "This study covers a wide range of human ancestry and is the largest GWAS to date, but we again found that further involvement of diverse ancestries is necessary," said Yukinori Okada, one of the corresponding authors and a professor at Osaka University Graduate School of Medicine and the University of Tokyo as well as a team leader at RIKEN Center for Integrative Medical Sciences in Japan.

These SNPs could help researchers develop better height-prediction tools for use in clinics. Pediatricians currently predict how tall a child will be based on their family history, but these estimates aren’t perfect; they don’t, for example, predict different heights for a pair of siblings. A prediction based on SNPs could potentially be more accurate. If a physician noticed that a child’s height didn’t match a prediction, that could be a clue to test the child for rare hidden conditions that affect growth, such as celiac disease and hormone deficiencies.
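A SNP-based prediction of this kind is often built as a simple weighted sum: each variant's estimated effect is multiplied by the number of copies a person carries, and the results are added to a baseline. The sketch below assumes that additive form; the baseline, effect sizes, SNP names, and the `predicted_height_cm` helper are all hypothetical and are not values from the study.

```python
def predicted_height_cm(baseline_cm, effects_cm, allele_counts):
    """Additive score: baseline plus (per-copy effect x copies) for each SNP."""
    return baseline_cm + sum(
        effects_cm[snp] * copies for snp, copies in allele_counts.items()
    )


# Invented per-copy effects, in centimeters, for three hypothetical SNPs.
effects_cm = {"snp_1": 0.5, "snp_2": -0.25, "snp_3": 1.0}
baseline_cm = 165.0

# Two siblings share parents but inherit different combinations of variants.
sibling_a = {"snp_1": 2, "snp_2": 0, "snp_3": 1}
sibling_b = {"snp_1": 0, "snp_2": 2, "snp_3": 0}

height_a = predicted_height_cm(baseline_cm, effects_cm, sibling_a)
height_b = predicted_height_cm(baseline_cm, effects_cm, sibling_b)
```

Unlike an estimate based on family history alone, this approach gives the two siblings different predictions (167.0 and 164.5 centimeters in this toy case), because it uses the variants each child actually inherited.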

Genome architecture

As the largest GWAS, the study also provides insight for scientists into how to learn from the genome. Adding participants to a GWAS makes it a more reliable and powerful tool, but scientists didn’t know if there was a point at which a study could be “saturated” — when additional data would not provide any new insight. GIANT researchers found that the SNPs they pinpointed ultimately explained more than 90 percent of SNP-based variation, which indicated a point of saturation. They also discovered that discerning the broad brushstrokes of biological pathways that are relevant to height required fewer samples than finding the precise genomic regions. 

“We’ve been able to address this long-standing question in GWAS research using empirical data rather than just theoretical models, which had been used previously,” said Loic Yengo, first and corresponding author on the study and head of the Statistical Genomics Laboratory of the Institute for Molecular Bioscience at the University of Queensland in Australia.

Geneticists also wondered if, as a GWAS grew in size, the SNPs it revealed would be spread out across more and more of the genome. GIANT’s findings showed instead that SNPs influencing height clustered within regions covering just over 20 percent of the genome. In particular, the SNPs were near genes previously associated with skeletal growth disorders; 25, for instance, clustered near the ACAN gene, which is mutated in patients with short stature and a condition called skeletal dysplasia. Several SNPs also implicate signaling pathways that impact skeletal growth plates — cartilage near the ends of long bones that expands and hardens into solid bone as a child grows. 

The researchers believe this clustering of genetic variants likely applies to other traits and could inform the study of other common conditions, such as high blood pressure or asthma, that are influenced by multiple genes. 

Now that they know which genomic regions influence height, the GIANT team can also begin tracing how individual variants impact height using fine-mapping methods. Rarer and more complex variants likely account for the heritability not explained by SNPs and will also be a target of future studies.

A big leap for GWAS

When Hirschhorn was getting started on height genetics research in the mid-2000s, conducting a GWAS with even just 1,000 participants took years of effort, he said. So to study 5 million was unfathomable at the time. “Even the most optimistic among us didn’t think we’d get this big this fast,” Hirschhorn said.

People doubted the utility of GWAS, too, when these studies failed to yield real predictive power. Over the arc of his career, Hirschhorn has watched that change.

“When it became apparent that GWAS would be possible, I used to make sure in every talk I said, ‘there’s no way we’re going to get enough information that this will add to what we can do clinically to predict adult height,’” he said. “But we succeeded beyond our wildest dreams. So now I get the chance to prove myself wrong.”


Additional reporting was contributed by Tom Ulrich.

This work was supported in part by the National Institutes of Health, Wellcome Trust, the UK Medical Research Council, Cancer Research UK, the Australian Research Council, the Australian National Health and Medical Research Council, the UK National Institute for Health Research Centres, the European Union, the European Regional Development Fund, the Netherlands Heart Foundation, the British Heart Foundation, the US Department of Veterans Affairs, the American Heart Association, the Leducq Foundation, the Netherlands Organization for Scientific Research NWO, the European Research Council, the Swedish Research Council, the Novo Nordisk Foundation, the Academy of Finland, and the German Federal Ministry of Education and Research.

Paper(s) cited

Yengo L, et al. A saturated map of common genetic variants associated with human height from 5.4 million individuals of diverse ancestries. Nature. Online October 12, 2022. DOI:10.1038/s41586-022-05275-y