Regev Lab, Lander Lab, Broad Institute; Dept. of Computational and Systems Biology, Massachusetts Institute of Technology

Comprehensive RNA profiling provides an excellent phenotype of cellular responses and tissue states, but can be prohibitively expensive to generate at the massive scale required for studies of regulatory circuits, genetic states or perturbation screens. However, because expression profiles may reflect a limited number of degrees of freedom, a smaller number of measurements might suffice to capture most of the information. Here, we use existing mathematical guarantees to demonstrate that gene expression information can be preserved in a random low-dimensional space. We propose that samples can be directly observed in low dimension through a fundamentally new type of measurement that distributes a single readout across many genes. We show by simulation that as few as 100 of these randomly composed measurements are needed to accurately estimate the global similarity between any pair of samples. Furthermore, we show that methods of compressive sensing can be used to recover gene abundances from drastically under-sampled measurements, even in the absence of any prior knowledge of gene-to-gene correlations. Finally, we propose an experimental scheme for such composite measurements. Thus, compressive sensing and composite measurements can become the basis of a massive scale-up in the number of samples that can be profiled, opening new opportunities in the study of single cells, complex tissues, perturbation screens and expression-based diagnostics.
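
The claim that about 100 randomly composed measurements can preserve sample-to-sample similarity can be illustrated with a small simulation. The sketch below is not the simulation from the talk: it assumes Gaussian random weights for each composite measurement, purely synthetic expression profiles driven by a handful of hypothetical gene modules, and Euclidean distance as the similarity measure. All function names and parameter values are illustrative.

```python
import numpy as np

def composite_measurements(X, m, rng):
    """Project a genes-by-samples matrix X onto m random composite measurements.

    Each row of A is one composite measurement: a random weighted sum over all
    genes. Gaussian weights scaled by 1/sqrt(m) are an assumption made here so
    that squared distances are preserved in expectation.
    """
    n_genes = X.shape[0]
    A = rng.standard_normal((m, n_genes)) / np.sqrt(m)
    return A @ X

def pairwise_distances(M):
    """Euclidean distances between all pairs of columns (samples) of M."""
    sq = np.sum(M ** 2, axis=0)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (M.T @ M)
    return np.sqrt(np.maximum(d2, 0.0))  # clip tiny negatives from round-off

rng = np.random.default_rng(0)
n_genes, n_samples, n_modules = 5000, 50, 10  # hypothetical sizes

# Synthetic profiles with few degrees of freedom: each sample is a mixture of
# a small number of gene modules, plus a little measurement noise.
modules = rng.standard_normal((n_genes, n_modules))
activity = rng.standard_normal((n_modules, n_samples))
X = modules @ activity + 0.1 * rng.standard_normal((n_genes, n_samples))

# Observe every sample through only 100 composite measurements.
Y = composite_measurements(X, 100, rng)

# Compare pairwise distances in the full and compressed spaces.
d_full = pairwise_distances(X)
d_low = pairwise_distances(Y)
iu = np.triu_indices(n_samples, k=1)  # each unordered pair once
rel_err = np.abs(d_low[iu] - d_full[iu]) / d_full[iu]

print("median relative distance error:", np.median(rel_err))
```

In this toy setting the distortion depends on the number of composite measurements rather than on the number of genes, which is the property the random-projection guarantees describe. Recovering individual gene abundances from `Y` would require a separate sparse-recovery step that is not shown here.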
