Cumulus is a cloud-based data analysis framework for large-scale single-cell and single-nucleus RNA-seq.

Cumulus contains three modules:

(1) a platform to process sequence data and generate gene-count matrices. View Cumulus on GitHub and read the full documentation.

(2) Pegasus, an analysis package that supports common scRNA-seq analysis tasks, including quality filters, batch correction, dimension reduction (tSNE, UMAP, etc.), and differential expression analysis. View Pegasus on GitHub and see the complete documentation.

(3) Cirrocumulus, an interactive visualization application. View Cirrocumulus on GitHub.

Citation: Li B, Gould J, Yang Y, Sarkizova S, et al. (2020). Cumulus provides cloud-based data analysis for large-scale single-cell and single-nucleus RNA-seq. Nature Methods 17: 793–798; doi:10.1038/s41592-020-0905-x.



Tangram is a Python package to align single-cell RNA-seq data to spatial data from the same region. The method is compatible with any sc/snRNA-seq protocol and spatial method, provided that the datasets were generated from the same tissue/anatomical region and share a subset of common genes.

Tangram is available on GitHub; a tutorial can be found here.

Citation: Biancalani T, Scalia G, Buffoni L, Avasthi R, et al. (2021). Deep learning and alignment of spatially-resolved single-cell transcriptomes with Tangram. Nature Methods 18: 1352–1362; doi:10.1038/s41592-021-01264-7.



Our pipeline for data-driven quality control for scientific discovery in single-cell transcriptomics is available on GitHub. 

Citation: Subramanian A, Alperovich M, Yang Y, Li B (2021). Biology-inspired data-driven quality control for scientific discovery in single-cell transcriptomics. bioRxiv; doi:10.1101/2021.10.27.466176.


Power analysis for spatial omics

Our framework to generate in silico tissues to perform a spatial power analysis is available on GitHub. 

Citation: Baker EAG, Schapiro D, Dumitrascu B, Vickovic S, et al. (2022). Power analysis for spatial omics. bioRxiv; doi:10.1101/2022.01.26.477748.



ECLIPSER combines expression and alternative splicing QTL gene mapping and single-cell expression data to identify causal cell types and genes for complex traits. 

ECLIPSER is available on GitHub.

Citation: Rouhana JM, Wang J, Eraslan G, Anand S, et al. (2021). ECLIPSER: identifying causal cell types and genes for complex traits through single cell enrichment of e/sQTL-mapped genes in GWAS loci. bioRxiv; doi:10.1101/2021.11.24.469720.



scPhere is a dimensionality reduction tool for scRNA-seq data that embeds cells into low-dimensional hyperspherical or hyperbolic spaces. scPhere resolves cell crowding, corrects multilevel batch factors, and facilitates interactive visualization for exploratory data analysis.

scPhere is available on GitHub.

Citation: Ding J, Regev A (2021). Deep generative model embedding of single-cell RNA-Seq profiles on hyperspheres and hyperbolic spaces. Nature Communications 12: 2554; doi:10.1038/s41467-021-22851-4.



DIALOGUE is an R package to identify multi-cellular programs—sets of coregulated genes across different cell types—from scRNA-seq data.

DIALOGUE is available on GitHub; a tutorial can be found here.

Citation: Jerby-Arnon L, Regev A (2022). DIALOGUE maps multicellular programs in tissue from single-cell or spatial transcriptomics data. Nature Biotechnology; doi:10.1038/s41587-022-01288-0.



MAUDE is an R package to quantify the impact of guide RNAs on the expression of a target gene.

MAUDE is available on GitHub.

Citation: de Boer CG, Ray JP, Hacohen N, Regev A (2020). MAUDE: inferring expression changes in sorting-based CRISPR screens. Genome Biology 21: 134; doi:10.1186/s13059-020-02046-8.



single-cell Scalable Visualization and Analytics

scSVA is an R package for interactive visualization and exploratory analysis of single-cell omics datasets.

scSVA is available on GitHub.

Citation: Tabaka M, Gould J, Regev A (2019). scSVA: an interactive tool for big data visualization and exploration in single-cell omics. bioRxiv; doi:10.1101/512582.


Code & Computational Pipelines



The computational pipeline associated with Perturb-CITE-seq—a protocol to combine perturbation screening with multiplex antibody staining and scRNA-seq—is available on GitHub.

If you use this code in your work, please cite:

Frangieh CJ, Melms JC, Thakore PI, Geiger-Schuller KR, et al. (2021). Multimodal pooled Perturb-CITE-seq screens in patient models define mechanisms of cancer immune evasion. Nature Genetics 53: 332–341; doi:10.1038/s41588-021-00779-1.



inCITE-seq is a multi-omics method to quantify intranuclear protein levels alongside snRNA-seq.

We recently applied inCITE-seq to evaluate changes in transcription factor levels in the mouse hippocampus in response to pharmacological intervention; the associated code is available on GitHub.

If you use this code in your work, please cite:

Chung H, Parkhurst CN, Magee EM, Phillips D, et al. (2021). Joint single-cell measurements of nuclear proteins and RNA in vivo. Nature Methods 18: 1204–1212; doi:10.1038/s41592-021-01278-1.


COVID-19 Tissue Atlases

The code associated with our recent COVID-19 tissue atlases project is available on GitHub.

If you use this code in your work, please cite:

Delorey TM, Ziegler CGK, Heimberg G, Normand R, et al. (2021). COVID-19 tissue atlases reveal SARS-CoV-2 pathology and cellular targets. Nature 595: 107–113; doi:10.1038/s41586-021-03570-8.



SM-Omics is our high-throughput spatial transcriptomics protocol, which can be combined with antibody-based protein profiling methods for multimodal analyses. We recently validated SM-Omics on mouse olfactory bulb and cortex; the associated code is available on GitHub.

If you use this code in your work, please cite:

Vickovic S, Lötstedt B, Klughammer J, Segerstolpe Å, et al. (2022). SM-Omics is an automated platform for high-throughput spatial multi-omics. Nature Communications 13: 795; doi:10.1038/s41467-022-28445-y.