MIA Talks

CellBender droplet-time-machine: Reversing mixed-template cDNA amplification artifacts in droplet-based 3' scRNA-seq assays

November 20, 2019
Data Sciences Platform, Broad Institute

In recent years, high-throughput droplet-based single-cell RNA sequencing (scRNA-seq) methods such as 10x Chromium, Drop-seq, and inDrops have replaced low-throughput plate-based methods such as Smart-seq2 and CEL-Seq2 in many applications. High-throughput methods are significantly less labor-intensive and more cost-effective, allowing us to map the transcriptional landscape of tens of thousands of cells in a single experiment. In addition to the previously known caveats of high-throughput scRNA-seq methods compared to plate-based methods (e.g., reduced mappability and fewer discovered genes), we show that the increased throughput comes at another surprising and paradoxical cost: when gene expression is quantified using existing methods, deeper sequencing of the library can lead to noisier and less reliable results. We trace this problem back to artifacts of mixed-template cDNA amplification, in particular the formation of chimeric molecules. We propose a probabilistic method, called "droplet-time-machine," for removing these artifacts and estimating the true pre-PCR cDNA abundance, and present it as part of the CellBender suite of tools. Using a variety of existing benchmarking datasets, we show that RNA quantification with the proposed method is significantly more robust to variation in sequencing depth and produces results comparable to gold-standard low-throughput plate-based assays.