New method improves accuracy of DNA sequencing 1000-fold to detect rare genetic mutations

CODEC could help scientists find early-stage cancer from blood samples, other disease-causing mutations, and more.

DNA showing a mutation inside cells
Credit: Sonja Vasiljeva, Broad Communications

A team of researchers at the Broad Institute of MIT and Harvard has developed a new approach to next-generation sequencing that detects genetic mutations within single molecules of DNA. The method, called Concatenating Original Duplex for Error Correction (CODEC), makes next-generation sequencing about 1000 times more accurate. That opens up a range of applications at relatively low cost, including detecting tiny numbers of cancer mutations in blood samples, monitoring cancer during and after treatment, and identifying mutations underlying rare diseases. The study appears today in Nature Genetics.

“The beauty of this approach is that it's not an overhaul of how sequencing is done,” said Viktor Adalsteinsson, senior author on the study and director of the Gerstner Center for Cancer Diagnostics and leader of the Blood Biopsy Team at the Broad. “It's not something that requires new instrumentation or capital investment — it's a simple set of steps added into existing sample preparation workflows to improve the accuracy of DNA sequencing.”

Jin Bae, a research scientist, and Ruolin Liu, a computational scientist, both in Adalsteinsson’s lab, are co-first authors on the study.

Sequencing challenges

CODEC combines the advantages of two existing methods: next-generation sequencing and third-generation sequencing. 

Next-generation sequencing is a high-throughput process in which the two strands of a DNA double helix are separated and sequenced individually. This process is fast, but can’t tell the difference between mutations in the DNA and errors introduced by the sequencing itself, which reduces its ability to detect rare mutations accurately. A sample preparation method called duplex sequencing, which involves tagging individual strands of DNA, can distinguish between true mutations and errors, but is highly inefficient because it sequences each strand of the double helix independently. 

Third-generation sequencing can pinpoint rare mutations by sequencing DNA without separating the two strands, but can also be inefficient and inaccurate.

To overcome these limitations, CODEC uses a specially designed adapter sequence to link one strand of the double helix with the reverse complement of the second strand. The two linked strands are then sequenced together using next-generation sequencing. Because both strands of the same original molecule are read together, a true mutation shows up in both, while a sequencing error typically appears in only one. This allows scientists to distinguish between sequencing-induced errors and mutations and generate highly accurate sequence data at low cost.
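The idea of linking both strands and checking them against each other can be illustrated with a toy Python sketch. This is not the CODEC software or the authors' pipeline: the function names, the "TTTT" placeholder adapter, and the simplifications (equal-length strands, no insertions or deletions, no alignment or base-quality scores) are all assumptions made for illustration.

```python
# Toy model of strand linking and duplex error correction.
# All names and the adapter sequence are hypothetical, not from CODEC itself.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def link_duplex(top: str, bottom: str, adapter: str) -> str:
    """Join the top strand to the reverse complement of the bottom strand.

    Both halves of the result are in top-strand orientation, so they
    should be identical unless one strand carries an error.
    """
    return top + adapter + reverse_complement(bottom)


def duplex_consensus(read: str, insert_len: int, adapter_len: int) -> str:
    """Keep a base only where both halves of the linked read agree."""
    first = read[:insert_len]
    start = insert_len + adapter_len
    second = read[start:start + insert_len]
    return "".join(a if a == b else "N" for a, b in zip(first, second))


def call_mutations(consensus: str, reference: str):
    """List (position, reference base, observed base) for confident mismatches."""
    return [
        (i, ref, alt)
        for i, (ref, alt) in enumerate(zip(reference, consensus))
        if alt != "N" and alt != ref
    ]


reference = "ACGTTGCA"
top = "ACGATGCA"                  # true T->A mutation at position 3
bottom = reverse_complement(top)  # the mutation is present on both strands
adapter = "TTTT"                  # placeholder, not the real CODEC adapter

read = link_duplex(top, bottom, adapter)
# Simulate a sequencing error that hits only the first half (position 6).
noisy_read = read[:6] + "T" + read[7:]

consensus = duplex_consensus(noisy_read, len(top), len(adapter))
mutations = call_mutations(consensus, reference)
print(consensus)  # ACGATGNA
print(mutations)  # [(3, 'T', 'A')]
```

In this sketch the real mutation is reported because both halves of the linked read agree on it, while the simulated sequencing error is masked as "N" because it appears in only one half.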

Mutation detection

The researchers used CODEC to measure mutation frequencies in sperm and to look for age-related mutations in blood cells, as well as mutations in single molecules of DNA from tumors and other patient samples. They tested CODEC with next-generation sequencing of either the entire genome or only a targeted panel of genes. They found that CODEC distinguished between real mutations and errors with accuracy similar to that of duplex sequencing while needing less DNA to analyze, which improved efficiency.

Adalsteinsson’s team has filed a patent on the technology and is working on ways to make CODEC even more efficient. Adalsteinsson says that since the team described the method in a June 2021 preprint, they have been contacted by an array of researchers hoping to use CODEC.

“This technology is enabling us to see things that we could never see before with DNA sequencing, and that’s tremendously exciting,” Adalsteinsson said.


This work was supported by the Gerstner Family Foundation.

Paper cited

Bae JH, Liu R et al. Single duplex DNA sequencing with CODEC detects mutations with high sensitivity. Nature Genetics. Online April 27, 2023. DOI: 10.1038/s41588-023-01376-0.