Single-cell measurement of chromatin accessibility (DNA), gene expression (RNA), and proteins has revealed rich cellular diversity across tissues, organisms, and disease states. However, single-cell data poses significant modeling challenges: datasets are high-dimensional in both observations and features, with complex sparsity; biological signals are mixed with donor and technical batch effects; and ground truth is scarce relative to other fields where machine learning has excelled. Here we leverage recent advances in multi-modal single-cell technologies, which, by simultaneously measuring two layers of cellular processing, provide ground truth analogous to language translation. We formalize tasks to predict one modality from another and to learn integrated representations of cellular state. We also generate a novel dataset of human bone marrow specifically designed for benchmarking methods. The dataset and tasks are accessible through an open-source framework that facilitates centralized evaluation of community-submitted methods, and they form the basis for a competition at NeurIPS 2021 (openproblems.bio/neurips).