Cancer Program Tool Resources


ABSOLUTE

ABSOLUTE generates absolute copy number data from a mixed population of tumor and normal DNA. The process begins with the generation of segmented copy number data, which are input to the ABSOLUTE algorithm together with pre-computed models of recurrent cancer karyotypes and, optionally, allelic fraction values for somatic point mutations. ABSOLUTE then reports the absolute cellular copy number of local DNA segments and, for point mutations, the number of mutated alleles.
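
To illustrate why a tumor/normal mixture obscures absolute copy number, the sketch below applies the standard two-population mixing relation to a single segment. It is not the ABSOLUTE algorithm itself, which infers purity and ploidy using the karyotype models mentioned above; here both values are simply assumed to be known, and the function and parameter names are hypothetical.

```python
def expected_copy_ratio(q, purity, ploidy):
    """Expected relative copy ratio of a segment whose tumor copy number is q.

    purity: fraction of cells in the sample that are tumor cells (assumed known).
    ploidy: average copy number across the tumor genome (assumed known).
    Normal cells are assumed to carry 2 copies everywhere.
    """
    # Average amount of DNA per cell in the mixed sample.
    mean_dna = purity * ploidy + 2.0 * (1.0 - purity)
    return (purity * q + 2.0 * (1.0 - purity)) / mean_dna


def absolute_copy_number(ratio, purity, ploidy):
    """Invert the mixing relation to get the (unrounded) tumor copy number."""
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in (0, 1]")
    mean_dna = purity * ploidy + 2.0 * (1.0 - purity)
    return (ratio * mean_dna - 2.0 * (1.0 - purity)) / purity
```

For example, with 50% purity and a diploid tumor, a segment present in 3 copies per tumor cell is expected to show a relative ratio of 1.25 rather than 1.5, and `absolute_copy_number(1.25, 0.5, 2.0)` recovers the value 3.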

Achilles Project

The goal of the Achilles Project is to create a genome-wide catalog of tumor dependencies to identify vulnerabilities associated with genetic or epigenetic alterations.


BreakPointer

BreakPointer is a tool to pinpoint rearrangement breakpoints using paired-end next-generation sequencing reads.

Broad Internal GenePattern Server

GenePattern is a powerful genomic analysis platform that provides access to more than 180 tools for gene expression analysis, proteomics, SNP analysis, flow cytometry, RNA-seq analysis, and common data processing tasks. A web-based interface provides easy access to these tools and allows the creation of multi-step analysis pipelines that enable reproducible in silico research.
This is a hosted instance of GenePattern that is available only inside the Broad network for Broad Institute researchers and affiliates. 


ChainFinder

ChainFinder is an algorithm for identifying complex sets of DNA rearrangements and deletions in cancer genomes that may reflect coordinate chromosomal alterations.

Connectivity Map (CMap)

The Connectivity Map (or CMap) is a catalog of gene-expression data collected from human cells treated with chemical compounds and genetic reagents. Computational methods that reduce the number of necessary genomic measurements, together with streamlined methodologies, enable the current effort to significantly increase the size of the CMap database and, with it, our potential to connect human diseases with the genes that underlie them and the drugs that treat them.


ContEst

ContEst is a tool (and method) for estimating the amount of cross-sample contamination in next-generation sequencing data. Using a Bayesian framework, contamination levels are estimated from array-based genotypes and sequencing reads.
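
The general idea of a Bayesian contamination estimate can be sketched with a grid posterior. This is a simplified illustration, not ContEst's actual model: it assumes sites the array genotyped as homozygous reference, a flat prior over the contamination fraction, a fixed per-base error rate, and a contaminant whose alternate alleles follow the population allele frequency. All names and default values are hypothetical.

```python
import math


def contamination_posterior(sites, error_rate=0.001, grid_size=1001):
    """Posterior over contamination fraction c on an evenly spaced grid in [0, 1].

    sites: list of (alt_reads, total_reads, pop_alt_freq) tuples, one per site
           that the array genotyped as homozygous reference.
    Returns (grid, posterior), where posterior sums to 1.
    """
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    log_like = []
    for c in grid:
        total_ll = 0.0
        for alt, total, freq in sites:
            # Probability that a read shows the alternate allele: sequencing
            # error, plus reads from the contaminant carrying the allele.
            p = error_rate + c * freq * (1.0 - 2.0 * error_rate)
            total_ll += alt * math.log(p) + (total - alt) * math.log(1.0 - p)
        log_like.append(total_ll)
    # Flat prior: the posterior is the normalized likelihood.
    peak = max(log_like)
    weights = [math.exp(v - peak) for v in log_like]
    norm = sum(weights)
    return grid, [w / norm for w in weights]


def map_estimate(grid, posterior):
    """Grid value with the highest posterior probability."""
    return max(zip(posterior, grid))[1]
```

A call such as `map_estimate(*contamination_posterior(sites))` returns the most probable contamination fraction under these assumptions; the full posterior can also be used to report an interval.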


D-ToxoG

D-ToxoG is a tool for removing the OxoG artifact from a set of SNV calls.


dRanger

dRanger is a tool to identify somatic rearrangements as clusters of aberrant paired-end sequencing reads in a tumor sample where the normal sample has read pairs consistent with the reference. Candidate rearrangement breakpoints from dRanger are passed into BreakPointer, which applies a modified Smith-Waterman algorithm to all reads in the region to identify split-read support for the rearrangement.
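
For readers unfamiliar with Smith-Waterman, the sketch below shows the textbook local-alignment algorithm with a linear gap penalty. It is not BreakPointer's modified version, whose changes are not described here; the scoring values are arbitrary illustrative defaults.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Textbook Smith-Waterman local alignment with a linear gap penalty.

    Returns (best_score, aligned_a, aligned_b) for one best-scoring local
    alignment of sequences a and b.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the scoring matrix; scores are floored at 0 so an alignment
    # can start anywhere in either sequence.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero-score cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))
```

In a split-read setting, a read spanning a breakpoint would be expected to align locally to one side of the junction with part of its sequence and to the other side with the rest, which is why a local (rather than global) alignment is the natural building block.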


FireBrowse

FireBrowse is a simple and elegant way to explore cancer data, backed by a powerful computational infrastructure, application programming interface (API), graphical tools and online reports. It sits above the TCGA GDAC Firehose, one of the deepest and most integratively characterized open cancer datasets in the world, with over 80K sample aliquots from 11,000+ cancer patients, spanning 38 unique disease cohorts. FireBrowse makes it possible to find any of thousands of data archives generated by Firehose in just two clicks. Likewise, two clicks are all that's needed to find any of the ~1500 analysis reports created by Firehose in each analysis run. For programmers, a powerful RESTful API is provided, with bindings to the UNIX command line, Python and R. And for scientists, we provide graphical tools like viewGene to explore expression levels, and iCoMut to explore the comprehensive analysis profile of each TCGA disease study within a single, interactive figure.