Proc Natl Acad Sci U S A DOI:10.1073/pnas.1419064111

Measuring missing heritability: inferring the contribution of common variants.

Publication Type: Journal Article
Year of Publication: 2014
Authors: Golan, D, Lander, ES, Rosset, S
Journal: Proc Natl Acad Sci U S A
Date Published: 2014 Dec 09
Keywords: Alleles; Case-Control Studies; Computer Simulation; Gene Frequency; Genetic Association Studies; Genetic Diseases, Inborn; Genetic Variation; Genome-Wide Association Study; Genomics; Genotype; Humans; Models, Genetic; Models, Statistical; Phenotype; Polymorphism, Single Nucleotide; Regression Analysis; Research Design

Genome-wide association studies (GWASs), also called common variant association studies (CVASs), have uncovered thousands of genetic variants associated with hundreds of diseases. However, the variants that reach statistical significance typically explain only a small fraction of the heritability. One explanation for the "missing heritability" is that there are many additional disease-associated common variants whose effects are too small to detect with current sample sizes. It therefore is useful to have methods to quantify the heritability due to common variation, without having to identify all causal variants. Recent studies applied restricted maximum likelihood (REML) estimation to case-control studies for diseases. Here, we show that REML considerably underestimates the fraction of heritability due to common variation in this setting. The degree of underestimation increases with the rarity of disease, the heritability of the disease, and the size of the sample. Instead, we develop a general framework for heritability estimation, called phenotype correlation-genotype correlation (PCGC) regression, which generalizes the well-known Haseman-Elston regression method. We show that PCGC regression yields unbiased estimates. Applying PCGC regression to six diseases, we estimate the proportion of the phenotypic variance due to common variants to range from 25% to 56% and the proportion of heritability due to common variants from 41% to 68% (mean 60%). These results suggest that common variants may explain at least half the heritability for many diseases. PCGC regression also is readily applicable to other settings, including analyzing extreme-phenotype studies and adjusting for covariates such as sex, age, and population structure.
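
For readers unfamiliar with the Haseman-Elston regression that PCGC regression is said to generalize, the following is a minimal sketch of the classical quantitative-trait form: the products of standardized phenotypes for each pair of individuals are regressed on the pairs' genotype correlations, and the slope is taken as the heritability estimate. It is not the authors' PCGC implementation and applies no case-control ascertainment correction. The function names `haseman_elston` and `simulate`, the simulation settings, and the assumption that every SNP is causal are illustrative choices and do not come from the paper.

```python
import numpy as np


def haseman_elston(genotypes, phenotype):
    """Classical Haseman-Elston estimate of heritability (illustrative sketch).

    genotypes: (n, m) array of allele counts (0, 1, 2), no monomorphic SNPs.
    phenotype: length-n array of a quantitative trait.
    """
    n, m = genotypes.shape
    # Standardize each SNP column to mean 0 and variance 1.
    z = (genotypes - genotypes.mean(axis=0)) / genotypes.std(axis=0)
    # Genotype correlation (genetic relatedness) matrix.
    grm = z @ z.T / m
    # Standardize the phenotype.
    y = (phenotype - phenotype.mean()) / phenotype.std()
    # Use each distinct pair of individuals once (upper triangle, no diagonal).
    iu = np.triu_indices(n, k=1)
    g = grm[iu]
    yy = np.outer(y, y)[iu]
    # Least-squares slope of a regression through the origin.
    return float(g @ yy / (g @ g))


def simulate(n=1000, m=500, h2=0.5, seed=0):
    """Simulate a quantitative trait in which all m SNPs are causal (assumption)."""
    rng = np.random.default_rng(seed)
    freqs = rng.uniform(0.1, 0.5, size=m)
    genotypes = rng.binomial(2, freqs, size=(n, m)).astype(float)
    z = (genotypes - genotypes.mean(axis=0)) / genotypes.std(axis=0)
    # Per-SNP effects scaled so that total genetic variance is about h2.
    beta = rng.normal(0.0, np.sqrt(h2 / m), size=m)
    y = z @ beta + rng.normal(0.0, np.sqrt(1.0 - h2), size=n)
    return genotypes, y


if __name__ == "__main__":
    genotypes, y = simulate()
    print("Haseman-Elston estimate of h2:", haseman_elston(genotypes, y))
```

With the default settings the estimate should land near the simulated value of 0.5, up to sampling noise. The sketch only illustrates the regression of phenotype correlations on genotype correlations in a randomly sampled quantitative-trait setting; the case-control setting described in the abstract requires the adjustments the paper develops.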


Alternate Journal: Proc. Natl. Acad. Sci. U.S.A.
PubMed ID: 25422463
PubMed Central ID: PMC4267399
Grant List: U54 HG003067 / HG / NHGRI NIH HHS / United States
NIH HG003067 / HG / NHGRI NIH HHS / United States