Burden Testing of Rare Variants Identified through Exome Sequencing via Publicly Available Control Data.
The genetic causes of many Mendelian disorders remain undefined. Factors such as the lack of large multiplex families, locus heterogeneity, and incomplete penetrance hamper gene discovery for many disorders. Previous work suggests that gene-based burden testing, in which the aggregate burden of rare, protein-altering variants in each gene is compared between case and control subjects, might overcome some of these limitations. The increasing availability of large-scale public sequencing databases such as the Genome Aggregation Database (gnomAD) can enable burden testing using these databases as controls, obviating the need for additional control sequencing for each study. However, using public databases as controls poses several challenges, including lack of individual-level data, differences in ancestry, and differences in sequencing platforms and data processing. To illustrate the approach of using public data as controls, we analyzed whole-exome sequencing data from 393 individuals with idiopathic hypogonadotropic hypogonadism (IHH), a rare disorder with significant locus heterogeneity and incomplete penetrance, against control subjects from gnomAD (n = 123,136). We leveraged presumably benign synonymous variants to calibrate our approach. Through iterative analyses, we systematically addressed and overcame various sources of artifact that can arise when using public control data. In particular, we introduce an approach for highly adaptable variant quality filtering that leads to well-calibrated results. Our approach "re-discovered" genes previously implicated in IHH (FGFR1, TACR3, GNRHR). Furthermore, we identified a significant burden in TYRO3, a gene implicated in hypogonadotropic hypogonadism in mice. Finally, we developed a user-friendly software package, TRAPD (Test Rare vAriants with Public Data), for performing gene-based burden testing against public databases.
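To make the idea of a gene-based burden comparison concrete, the following is a minimal sketch of one common way such a test can be set up: a one-sided Fisher's exact test on the number of individuals carrying at least one qualifying variant in a gene. It is not taken from TRAPD and may differ from the method used in the study. The function name `burden_p_value` and all counts are hypothetical, and the sketch assumes that a carrier count for the public controls is available or has been approximated from summary allele counts, since individual-level data are lacking.

```python
import math


def burden_p_value(case_carriers, n_cases, control_carriers, n_controls):
    """One-sided Fisher's exact test for enrichment of carriers among cases.

    Returns P(X >= case_carriers), where X follows the hypergeometric
    distribution implied by the fixed margins of the 2x2 carrier table.
    """
    total = n_cases + n_controls
    carriers = case_carriers + control_carriers
    # Number of ways to choose which individuals are the cases.
    denom = math.comb(total, n_cases)
    upper = min(carriers, n_cases)
    # Sum the exact integer counts for every table at least as extreme.
    tail = sum(
        math.comb(carriers, k) * math.comb(total - carriers, n_cases - k)
        for k in range(case_carriers, upper + 1)
    )
    return tail / denom


# Hypothetical counts for a single gene (illustrative only, not study results).
p = burden_p_value(case_carriers=6, n_cases=393,
                   control_carriers=150, n_controls=123136)
print(f"one-sided burden p-value: {p:.3g}")
```

Exact integer arithmetic is used here so the sketch needs only the standard library (Python 3.8 or later for `math.comb`). In practice the test would be repeated per gene and corrected for multiple testing, and the same calculation run on synonymous variants can serve as a check that the results are not inflated.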
Year of Publication: 2018
Journal: Am J Hum Genet
Date Published: 2018 10 04
Grants:
P50 HD028138 / HD / NICHD NIH HHS / United States
R01 DK075787 / DK / NIDDK NIH HHS / United States
R01 HD090071 / HD / NICHD NIH HHS / United States
UL1 TR001102 / TR / NCATS NIH HHS / United States