Nature DOI:10.1038/s41586-020-2371-0

Mapping and characterization of structural variation in 17,795 human genomes.

Publication Type: Journal Article
Year of Publication: 2020
Authors: Abel, HJ, Larson, DE, Regier, AA, Chiang, C, Das, I, Kanchi, KL, Layer, RM, Neale, BM, Salerno, WJ, Reeves, C, Buyske, S, Matise, TC, Muzny, DM, Zody, MC, Lander, ES, Dutcher, SK, Stitziel, NO, Hall, IM
Corporate Authors: NHGRI Centers for Common Disease Genomics
Date Published: 2020 May 27

A key goal of whole-genome sequencing (WGS) for human genetics studies is to interrogate all forms of variation, including single nucleotide variants (SNV), small insertion/deletion (indel) variants and structural variants (SV). However, tools and resources for the study of SV have lagged behind those for smaller variants. Here, we used a scalable pipeline to map and characterize SV in 17,795 deeply sequenced human genomes. We publicly release site-frequency data to create the largest WGS-based SV resource to date. On average, individuals carry 2.9 rare SVs that alter coding regions, affecting the dosage or structure of 4.2 genes and accounting for 4.0-11.2% of rare high-impact coding alleles. Based on a computational model, we estimate that SVs account for 17.2% of rare alleles genome-wide with predicted deleterious effects equivalent to loss-of-function coding alleles; approximately 90% of such SVs are non-coding deletions (mean 19.1 per genome). We report 158,991 ultra-rare SVs and show that around 2% of individuals carry ultra-rare megabase-scale SVs, nearly half of which are balanced or complex rearrangements. Finally, we infer the dosage sensitivity of genes and non-coding elements, revealing trends related to element class and conservation. This work will help guide SV analysis and interpretation in the era of WGS.


Alternate Journal: Nature
PubMed ID: 32460305