Cancer Program Tool Resources


ABSOLUTE

ABSOLUTE generates absolute copy number data from a mixed population of tumor and normal DNA. The process begins with the generation of segmented copy number data, which is input to the ABSOLUTE algorithm together with pre-computed models of recurrent cancer karyotypes and, optionally, allelic fraction values for somatic point mutations. ABSOLUTE then reports the absolute cellular copy number of local DNA segments and, for point mutations, the number of mutated alleles.
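To illustrate why a tumor/normal mixture complicates copy number estimation, the sketch below uses a simple linear mixing model. The function names are hypothetical, and the model assumes that tumor purity and ploidy are already known and that normal cells carry two copies of every segment. ABSOLUTE itself has to infer purity and ploidy from the data, so this is not its implementation, only the relationship it has to untangle.

```python
def expected_copy_ratio(abs_copies, purity, ploidy):
    """Relative copy ratio expected for a segment with `abs_copies`
    copies per tumor cell, given tumor purity (0-1) and mean tumor ploidy.
    Assumption: normal cells contribute 2 copies everywhere."""
    # Average DNA content per cell in the mixed sample
    mean_content = purity * ploidy + 2.0 * (1.0 - purity)
    return (purity * abs_copies + 2.0 * (1.0 - purity)) / mean_content


def absolute_copy_number(copy_ratio, purity, ploidy):
    """Invert the mixing model: recover copies per tumor cell from an
    observed relative copy ratio (hypothetical helper, not an ABSOLUTE API)."""
    mean_content = purity * ploidy + 2.0 * (1.0 - purity)
    return (copy_ratio * mean_content - 2.0 * (1.0 - purity)) / purity


if __name__ == "__main__":
    # A 3-copy segment in a 60% pure, diploid-on-average tumor sample
    ratio = expected_copy_ratio(3, purity=0.6, ploidy=2.0)
    print(ratio)
    print(absolute_copy_number(ratio, purity=0.6, ploidy=2.0))
```

In a real sample the recovered value is rarely a clean integer, which is one reason the text above mentions pre-computed karyotype models as an additional input.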

Achilles Project

The goal of the Achilles Project is to create a genome-wide catalog of tumor dependencies to identify vulnerabilities associated with genetic or epigenetic alterations.


BreakPointer

BreakPointer is a tool to pinpoint rearrangement breakpoints using paired-end next-generation sequencing reads.

Broad Internal GenePattern Server

GenePattern is a powerful genomic analysis platform that provides access to more than 180 tools for gene expression analysis, proteomics, SNP analysis, flow cytometry, RNA-seq analysis, and common data processing tasks. A web-based interface provides easy access to these tools and allows the creation of multi-step analysis pipelines that enable reproducible in silico research.
This is a hosted instance of GenePattern that is available only inside the Broad network for Broad Institute researchers and affiliates. 


CellProfiler

CellProfiler is free open-source software designed to enable biologists without training in computer vision or programming to quantitatively measure phenotypes from thousands of images automatically. See our papers on analyzing cell images and non-cell images.


ChainFinder

ChainFinder is an algorithm for identifying complex sets of DNA rearrangements and deletions in cancer genomes that may reflect coordinated chromosomal alterations.

Connectivity Map (CMap)

The Connectivity Map (or CMap) is a catalog of gene-expression data collected from human cells treated with chemical compounds and genetic reagents. Computational methods that reduce the number of necessary genomic measurements, together with streamlined methodologies, enable the current effort to significantly increase the size of the CMap database and, with it, our potential to connect human diseases with the genes that underlie them and the drugs that treat them.


ContEst

ContEst is a tool (and method) for estimating the amount of cross-sample contamination in next-generation sequencing data. Using a Bayesian framework, contamination levels are estimated from array-based genotypes and sequencing reads.
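The Bayesian idea can be sketched with a toy model. The code below is not ContEst's actual model or interface. It assumes a simplified setting in which the array genotype is homozygous reference at every site, each non-reference read comes either from a contaminating sample (with probability tied to a population allele frequency) or from sequencing error, and the prior over the contamination fraction is uniform on a grid. The function name, the input layout, and the default error rate are all hypothetical.

```python
import math


def contamination_posterior(sites, error_rate=0.001, grid_size=101):
    """Grid posterior over the contamination fraction c.

    sites: list of (alt_reads, total_reads, pop_alt_freq) tuples for sites
    where the array genotype is homozygous reference (assumed input layout).
    Returns (grid, posterior) with a uniform prior over the grid.
    """
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    log_like = []
    for c in grid:
        total_ll = 0.0
        for alt, total, freq in sites:
            # Toy read model: an alt read is a contaminant allele or an error
            p_alt = c * freq + (1.0 - c) * error_rate
            p_alt = min(max(p_alt, 1e-12), 1.0 - 1e-12)  # keep logs finite
            total_ll += alt * math.log(p_alt)
            total_ll += (total - alt) * math.log(1.0 - p_alt)
        log_like.append(total_ll)
    # Normalize in log space to avoid underflow
    peak = max(log_like)
    weights = [math.exp(v - peak) for v in log_like]
    norm = sum(weights)
    return grid, [w / norm for w in weights]


if __name__ == "__main__":
    # 50 sites, each with 2 non-reference reads out of 100
    sites = [(2, 100, 0.5)] * 50
    grid, post = contamination_posterior(sites)
    best = max(range(len(grid)), key=lambda i: post[i])
    print("most probable contamination fraction:", grid[best])
```

A grid keeps the example dependency-free; it trades resolution for simplicity, and a finer grid narrows the estimate at the cost of more likelihood evaluations.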


D-ToxoG

D-ToxoG is a tool for removing the OxoG artifact from a set of SNV calls.


dRanger

dRanger is a tool to identify somatic rearrangements as clusters of aberrant paired-end sequencing reads in a tumor sample where the normal sample has read pairs consistent with the reference. Candidate rearrangement breakpoints from dRanger are passed to BreakPointer, which applies a modified Smith-Waterman algorithm to all reads in the region to identify split-read support for the rearrangement.
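For readers unfamiliar with Smith-Waterman, the sketch below shows the textbook local-alignment scoring recurrence with a linear gap penalty. It does not reproduce BreakPointer's modification, which the text above does not describe, and the scoring values are arbitrary assumptions chosen for illustration.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between sequences a and b
    (textbook Smith-Waterman, linear gap penalty, score only)."""
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0]  # first column is always 0 for local alignment
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


if __name__ == "__main__":
    read = "TTACGTT"
    reference_window = "ACG"
    print(smith_waterman_score(read, reference_window))
```

Because the score can restart at zero, a read that matches the reference on only one side of a breakpoint still produces a strong partial alignment, which is the property that makes local alignment a natural fit for finding split reads.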