novoCaller: a Bayesian network approach for de novo variant calling from pedigree and population sequence data.

Bioinformatics
Abstract

MOTIVATION: De novo mutations (i.e. newly occurring mutations) are a predominant cause of sporadic dominant monogenic diseases and play a significant role in the genetics of complex disorders. De novo mutation studies also inform population genetics models and shed light on the biology of DNA replication and repair. Despite this broad interest, the accuracy of de novo mutation calling still has room for improvement.

RESULTS: We designed novoCaller, a Bayesian variant calling algorithm that uses information from read-level data both in the pedigree and in unrelated samples. The method was extensively tested using large trio-sequencing studies, and it consistently achieved over 97% sensitivity. We applied the algorithm to 48 trio cases of suspected rare Mendelian disorders as part of the Brigham Genomic Medicine gene discovery initiative. Its application resulted in a significant reduction in the resources required for manual inspection and experimental validation of the calls. Three de novo variants were found in known genes associated with rare disorders, leading to rapid genetic diagnosis of the probands. Another 14 variants were found in genes that are likely to explain the phenotype, and could lead to novel disease-gene discovery.
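To make the idea of Bayesian de novo calling in a trio concrete, the following is a minimal sketch of a generic textbook-style model. It is not novoCaller's actual algorithm, and it omits the unrelated-sample information that the abstract describes. All function names, the binomial read model, the Hardy-Weinberg prior and the default values (`err`, `alt_freq`, `mu`) are illustrative assumptions.

```python
from itertools import product
from math import comb

# Genotypes are coded as the number of alternate alleles: 0, 1 or 2.
# Every name and default below is an illustrative assumption.

def read_likelihood(alt, depth, genotype, err=0.01):
    """P(alt reads out of depth | genotype) under a simple binomial model."""
    p_alt = {0: err, 1: 0.5, 2: 1.0 - err}[genotype]
    return comb(depth, alt) * p_alt**alt * (1.0 - p_alt) ** (depth - alt)

def genotype_prior(genotype, alt_freq):
    """Hardy-Weinberg prior on a parental genotype."""
    q = alt_freq
    return {0: (1 - q) ** 2, 1: 2 * q * (1 - q), 2: q**2}[genotype]

def transmission(child, father, mother, mu=1e-8):
    """P(child genotype | parental genotypes) with ref-to-alt mutation rate mu."""
    def p_transmit_alt(parent):
        base = parent / 2.0              # Mendelian chance of passing an alt allele
        return base + (1.0 - base) * mu  # a transmitted ref allele may mutate to alt
    pf, pm = p_transmit_alt(father), p_transmit_alt(mother)
    return {0: (1 - pf) * (1 - pm),
            1: pf * (1 - pm) + (1 - pf) * pm,
            2: pf * pm}[child]

def de_novo_posterior(child_reads, father_reads, mother_reads,
                      alt_freq=1e-4, mu=1e-8, err=0.01):
    """Posterior probability that the child carries an alt allele absent in both parents.

    Each *_reads argument is an (alt_count, depth) pair.
    """
    total = 0.0
    de_novo = 0.0
    for c, f, m in product(range(3), repeat=3):
        joint = (genotype_prior(f, alt_freq) * genotype_prior(m, alt_freq)
                 * transmission(c, f, m, mu)
                 * read_likelihood(*child_reads, c, err)
                 * read_likelihood(*father_reads, f, err)
                 * read_likelihood(*mother_reads, m, err))
        total += joint
        if f == 0 and m == 0 and c > 0:
            de_novo += joint
    return de_novo / total
```

Under these assumptions, a call such as `de_novo_posterior((15, 30), (0, 30), (0, 30))` yields a posterior close to 1, whereas the same child reads with a father at `(14, 30)` yield a posterior close to 0, because inheritance then explains the data far better than a new mutation.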

AVAILABILITY AND IMPLEMENTATION: Source code implemented in C++ and Python can be downloaded from https://github.com/bgm-cwg/novoCaller.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Year of Publication
2019
Journal
Bioinformatics
Volume
35
Issue
7
Pages
1174-1180
Date Published
2019-04-01
ISSN
1367-4811
DOI
10.1093/bioinformatics/bty749
PubMed ID
30169785
PubMed Central ID
PMC6449753
Grant list
R01 GM078598 / GM / NIGMS NIH HHS / United States
U01 HG007690 / HG / NHGRI NIH HHS / United States
K99 HG007229 / HG / NHGRI NIH HHS / United States
R00 HG007229 / HG / NHGRI NIH HHS / United States
U01 DE024443 / DE / NIDCR NIH HHS / United States
R01 HG010372 / HG / NHGRI NIH HHS / United States