PLoS Comput Biol DOI:10.1371/journal.pcbi.1002604

Efficiency and power as a function of sequence coverage, SNP array density, and imputation.

Publication Type: Journal Article
Year of Publication: 2012
Authors: Flannick, J, Korn, JM, Fontanillas, P, Grant, GB, Banks, E, DePristo, MA, Altshuler, D
Journal: PLoS Comput Biol
Date Published: 2012
Keywords: Algorithms; Cluster Analysis; Databases, Genetic; European Continental Ancestry Group; Genome-Wide Association Study; Genomics; Genotype; Humans; Oligonucleotide Array Sequence Analysis; Polymorphism, Single Nucleotide; Sensitivity and Specificity; Sequence Analysis, DNA

High coverage whole genome sequencing provides near complete information about genetic variation. However, other technologies can be more efficient in some settings by (a) reducing redundant coverage within samples and (b) exploiting patterns of genetic variation across samples. To characterize as many samples as possible, many genetic studies therefore employ lower coverage sequencing or SNP array genotyping coupled to statistical imputation. To compare these approaches individually and in conjunction, we developed a statistical framework to estimate genotypes jointly from sequence reads, array intensities, and imputation. In European samples, we find similar sensitivity (89%) and specificity (99.6%) from imputation with either 1× sequencing or 1 M SNP arrays. Sensitivity is increased, particularly for low-frequency polymorphisms (MAF


Alternate Journal: PLoS Comput. Biol.
PubMed ID: 22807667
PubMed Central ID: PMC3395607
Grant List: T32 GM007748 / GM / NIGMS NIH HHS / United States
U01 HG005208 / HG / NHGRI NIH HHS / United States
5-T32-GM007748-33 / GM / NIGMS NIH HHS / United States
U01HG005208 / HG / NHGRI NIH HHS / United States