Nat Methods DOI:10.1038/s41592-018-0054-7

A synthetic-diploid benchmark for accurate variant-calling evaluation.

Publication Type: Journal Article
Year of Publication: 2018
Authors: Li, H; Bloom, JM; Farjoun, Y; Fleharty, M; Gauthier, L; Neale, B; MacArthur, D
Journal: Nat Methods
Date Published: 2018 08
Keywords: Algorithms; Benchmarking; Cell Line, Tumor; Databases, Genetic; Diploidy; Female; Genetic Variation; Genome, Human; Homozygote; Humans; Hydatidiform Mole; Pregnancy; Synthetic Biology; Uterine Neoplasms; Whole Genome Sequencing

Existing benchmark datasets for use in evaluating variant-calling accuracy are constructed from a consensus of known short-variant callers, and they are thus biased toward easy regions that are accessible by these algorithms. We derived a new benchmark dataset from the de novo PacBio assemblies of two fully homozygous human cell lines, which provides a relatively more accurate and less biased estimate of small-variant-calling error rates in a realistic context.


Alternate Journal: Nat Methods
PubMed ID: 30013044
PubMed Central ID: PMC6341484
Grant List: R01 HG010040 / HG / NHGRI NIH HHS / United States
U01 HG009088 / HG / NHGRI NIH HHS / United States
U54 DK105566 / DK / NIDDK NIH HHS / United States