Scientists in the Broad community have developed many critical software tools for the analysis of increasingly large genome-related datasets, and they make these tools openly available to the scientific community. For the conditions governing the use of Broad Institute software, please see the software use agreement associated with the tools you choose to download.

Use our search function, browse the complete software collection, or click on one of the software categories listed below:

  • Exome Enrichment Automation

    Automation of the Illumina TruSeq Exome Enrichment protocol on an Agilent Bravo liquid handling platform.

  • FRESCo

    FRESCo is a method for finding regions of overlapping function in viral genomes.

  • Gene Set Enrichment Analysis

    GSEA is a computational method that determines whether a given set of genes shows statistically significant differences between two biological states. It is useful for interpreting the results of gene expression studies.

  • GENE-E

    GENE-E is a matrix visualization and analysis platform designed to support visual data exploration. It includes heat map, clustering, filtering, charting, marker selection, and many other tools. In addition to supporting generic matrices, GENE-E also contains tools that are designed specifically for RNAi and gene expression data.

  • GeneCruiser

    GeneCruiser is an annotation tool that allows users to map genes from genomic databases to Affymetrix probes, find information about Affymetrix probes in genomic databases, and find where Affymetrix probes are located in the human genome.

  • GeneHunter

    GeneHunter is a tool for rapid, complete multipoint linkage analysis using both parametric and nonparametric approaches.

  • GenePattern

    GenePattern is a powerful genomic analysis platform that provides access to hundreds of tools for gene expression analysis, proteomics, SNP analysis, flow cytometry, RNA-seq analysis, and common data processing tasks. A web-based interface provides easy access to these tools and allows the creation of multi-step analysis pipelines that enable reproducible in silico research.

  • Genome Analysis Toolkit (GATK)

    The Genome Analysis Toolkit (GATK) is a structured programming framework designed to enable the rapid development of efficient and robust analysis tools for next-generation DNA sequencers. The GATK solves the data management challenge by separating data access patterns from analysis algorithms, using the functional programming philosophy of Map/Reduce. Since the GATK’s traversal engine encapsulates the complexity of efficiently accessing the next-generation sequencing data, researchers and developers are free to focus on their specific analysis algorithms. This not only vastly improves the productivity of developers, who can quickly write new analyses, but also results in tools that are efficient and robust, and can benefit from improvements to a common data management engine.

  • Genome Data Analysis Center (GDAC)

    On behalf of The Cancer Genome Atlas, the Broad Genome Data Analysis Center designs and operates scientific data and analysis pipelines which pump terabyte-scale genomic datasets through scores of quantitative algorithms, in the hope of accelerating the understanding of cancer.

  • GenomeSpace

    GenomeSpace is a cloud-based interoperability framework to support integrative genomics analysis through an easy-to-use Web interface. GenomeSpace provides access to a diverse range of bioinformatics tools and bridges the gaps between them, making it easy to leverage the analyses and visualizations available in each. The tools retain their native look and feel, with GenomeSpace providing frictionless conduits between them through a lightweight interoperability layer. GenomeSpace does not perform any analyses itself; these are done within the member tools wherever they live – desktop, Web service, cloud, in-house server, etc. Rather, GenomeSpace provides tool selection and launch capabilities, and acts as a data highway, automatically reformatting data as required when results move from the output of one tool to the input of the next.
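The GATK entry above describes separating data access patterns from analysis algorithms in the style of Map/Reduce. A toy sketch of that idea follows. It is not GATK code or its API: the `Read` type, the `traverse` function, and the sample reads are all hypothetical and exist only to illustrate the pattern. The "engine" owns how the data is ordered and iterated, while each analysis supplies only a map function and a reduce function.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")
A = TypeVar("A")


@dataclass(frozen=True)
class Read:
    """Hypothetical, highly simplified aligned read (not a GATK type)."""
    contig: str
    start: int
    bases: str


def traverse(
    reads: Iterable[Read],
    map_fn: Callable[[Read], T],
    reduce_fn: Callable[[A, T], A],
    initial: A,
) -> A:
    """Toy traversal engine: decides how reads are visited.

    The analysis never sees the iteration details; it only provides
    map_fn (per-read computation) and reduce_fn (how to combine results).
    """
    # Data access concern: visit reads in coordinate order.
    ordered = sorted(reads, key=lambda r: (r.contig, r.start))
    return reduce(reduce_fn, (map_fn(r) for r in ordered), initial)


def gc_count(read: Read) -> int:
    """Analysis concern: count G and C bases in a single read."""
    return sum(1 for base in read.bases if base in "GC")


def add(acc: int, value: int) -> int:
    """Analysis concern: combine per-read results by summing."""
    return acc + value


# Made-up sample data for illustration only.
reads = [
    Read("chr1", 100, "ACGT"),
    Read("chr1", 50, "GGCC"),
    Read("chr2", 10, "ATAT"),
]

# Two different analyses reuse the same traversal unchanged.
total_gc = traverse(reads, gc_count, add, 0)
total_bases = traverse(reads, lambda r: len(r.bases), add, 0)
print(total_gc, total_bases)
```

Under this split, an improvement to `traverse` (for example, a smarter ordering or batching strategy) would benefit every analysis written against it, which is the kind of benefit the description attributes to a common data management engine.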