Experimental design for maximizing cell-type discovery in single-cell data

Engelhardt Group, Depts. of Computer Science, Quantitative and Computational Biology, Princeton University

Bandit algorithms are often the tool of choice for recommendation engines and have recently been applied to health care data. Here, inspired by bandit ideas, we describe a novel application to iterative experimental design for multi-tissue single-cell RNA-seq (scRNA-seq) data. We present two algorithms: a Good-Toulmin-like estimator via Thompson sampling (joint work with Karen Feng and Barbara Engelhardt) and an extension involving a Pitman-Yor prior (joint work with Federico Ferrari and Stefano Favaro). Given a budget, both algorithms model cell-type information across tissues and estimate how many cells to sample from each tissue, with the goal of maximizing cell-type discovery across samples collected over multiple iterations. On both real and simulated data, we demonstrate the advantages these algorithms provide for planning data collection compared with a random sampling strategy that uses no experimental design.
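
To make the bandit framing concrete, the following is a minimal illustrative sketch and not the estimator presented in the talk. It treats each tissue as an arm and, as a simplifying assumption, places a Beta-Bernoulli posterior on the chance that the next cell from a tissue belongs to a previously unseen cell type; this stands in for the Good-Toulmin-like and Pitman-Yor models, which are not reproduced here. The tissue names, cell-type frequencies, and the functions `draw_toy_cell` and `thompson_allocate` are all hypothetical.

```python
import random

# Hypothetical toy data: cell-type frequencies per tissue (assumed, not from the talk).
TOY_TISSUES = {
    "tissue_a": {"type_1": 0.90, "type_2": 0.10},
    "tissue_b": {"type_2": 0.40, "type_3": 0.30, "type_4": 0.20, "type_5": 0.10},
    "tissue_c": {"type_1": 0.50, "type_6": 0.50},
}


def draw_toy_cell(tissue, rng):
    """Simulate sequencing one cell from a tissue and return its cell type."""
    types, weights = zip(*TOY_TISSUES[tissue].items())
    return rng.choices(types, weights=weights)[0]


def thompson_allocate(tissues, budget, batch_size, draw_cell, seed=0):
    """Spend `budget` cells in batches, choosing a tissue per batch by Thompson sampling.

    Simplifying assumption: each tissue has a Beta posterior over the
    probability that a sampled cell reveals a cell type not seen so far.
    """
    rng = random.Random(seed)
    seen = set()
    posterior = {t: [1.0, 1.0] for t in tissues}  # Beta(1, 1) prior per tissue
    allocation = {t: 0 for t in tissues}
    remaining = budget
    while remaining > 0:
        n = min(batch_size, remaining)
        # Draw one plausible discovery rate per tissue and pick the largest.
        sampled = {t: rng.betavariate(a, b) for t, (a, b) in posterior.items()}
        best = max(sampled, key=sampled.get)
        for _ in range(n):
            cell_type = draw_cell(best, rng)
            if cell_type in seen:
                posterior[best][1] += 1.0  # no discovery
            else:
                seen.add(cell_type)
                posterior[best][0] += 1.0  # new cell type discovered
        allocation[best] += n
        remaining -= n
    return allocation, seen


allocation, seen = thompson_allocate(
    list(TOY_TISSUES), budget=200, batch_size=20, draw_cell=draw_toy_cell
)
print(allocation, sorted(seen))
```

In this toy setup the per-tissue allocation plays the role of the "how many cells from each tissue" estimate; a random baseline would instead pick the tissue for each batch uniformly at random.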
