Offerings and projects

The DSP develops software products and operates services that are widely used across the biomedical ecosystem, and plays a pivotal role in several national and international scientific initiatives.

Flagship DSP products and services

  • The Data Use Oversight System (DUOS): A suite of interfaces for managing interactions between data access committees and researchers seeking to access sensitive genomic datasets.
  • GATK: The leading open-source variant discovery package for analysis of high-throughput sequencing data.
  • GVS: Joint calling and variant filtering that scales to more than 500,000 genomes.
  • Imputation Service: Leveraging the All of Us + AnVIL reference panel to impute genomic data.
  • ML4H: Accelerating the real-world use and impact of artificial intelligence across all areas of medicine.
  • Picard: A popular set of open-source command-line tools for processing high-throughput sequencing data.
  • The Single Cell Portal: Visualize, explore, and publish interactive single-cell and multiomic data.
  • Terra: An open cloud-based platform for accessing data, performing analyses, and collaborating securely, developed in collaboration with Manifold.ai.
  • Through.bio: Visualize how basic scientific research publications translate into clinical impact.

Flagship scientific projects

  • All of Us Research Program: A National Institutes of Health (NIH)-funded initiative that will recruit 1 million or more U.S. citizens and collect their genomic and clinical data. Broad, in collaboration with Vanderbilt and Verily, has built a Workbench-based platform to store, share, and analyze all data generated as part of the program.
  • AnVIL: NHGRI's Genomic Data Science Analysis, Visualization, and Informatics Lab-Space enables biomedical researchers to access data, run analysis tools, and collaborate at scale. AnVIL is built on many DSP technologies, including Trusted Research Environments (Terra) and DUOS.
  • BICAN: An NIH BRAIN Initiative consortium that builds reference brain cell atlases for use throughout the research community, providing a foundational molecular and anatomical framework for the study of brain function and disorders. In partnership with NeMO, DSP builds workflows to uniformly process BICAN atlas data.
  • BioData Catalyst: A cloud-based ecosystem offering researchers data, analytic tools, applications, and workflows in secure workspaces.
  • DS-I Africa: The Data Science for Health Discovery and Innovation in Africa (DS-I Africa) Initiative aims to leverage data science technologies to transform biomedical and public health research and to develop solutions that improve health for individuals and populations.
  • The Human Cell Atlas (HCA): A global effort to comprehensively characterize a human reference of cell types and cell states. Broad DSP developed and operates parts of the HCA Data Portal, specifically focusing on ingest, storage, and access for managed and unmanaged multimodal sequencing data.
  • LungMAP: Builds an open-access reference resource of a comprehensive, dynamic, 3-D molecular atlas of the late-stage developing human lung with data and reagents available to the research community.
  • SCHARE: A cloud-based platform for population health research that focuses on improving data for, and the use of, AI, and provides data science training opportunities and cloud computing resources.
  • SCORCH: A data coordination, analysis, and scientific outreach center established to standardize and share single-cell molecular data that inform pathophysiological understanding of the CNS effects of substance use disorders and HIV.