GenePattern is a comprehensive computational genomics environment that (1) provides a repository of over 120 analytic and visualization tools; (2) allows easy creation of complex analytic workflows from them; (3) has a rapid, form-based method for integrating new tools, written in any language, into its repository with no additional software engineering; (4) captures the complete history of a user’s analysis, enabling reproducible in silico research; and (5) supports both programming and nonprogramming users. Since its first release, GenePattern has become a research community standard, supporting thousands of users (Table 1). GenePattern is freely available to both academia and industry and is used in 82 countries. It can be downloaded for local installation or, as of February 2008, used through a Web browser on a public server hosted at the Broad Institute.
Gene Set Enrichment Analysis (GSEA) is a computational method that determines whether an a priori defined set of genes shows statistically significant, concordant differences between two biological states (e.g., phenotypes). GSEA is available for any operating system (Windows, Mac, Linux) in a variety of implementations, including a Java desktop application, a JAR file, an R package, and a GenePattern module.
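The core idea behind the method can be illustrated with a minimal Python sketch of a running-sum enrichment statistic. This is a simplified, unweighted (Kolmogorov-Smirnov-like) version written for illustration only; it is not the GSEA software's implementation, and it omits the correlation weighting and permutation-based significance testing that the full method uses. The function name `enrichment_score` and the gene labels are hypothetical.

```python
from typing import Sequence, Set


def enrichment_score(ranked_genes: Sequence[str], gene_set: Set[str]) -> float:
    """Return the maximum deviation from zero of an unweighted running sum.

    ranked_genes: all genes, ordered by their association with one of the
                  two biological states (most associated first).
    gene_set:     the a priori defined set of genes being tested.
    """
    n = len(ranked_genes)
    hits = sum(1 for gene in ranked_genes if gene in gene_set)
    if hits == 0 or hits == n:
        raise ValueError("gene_set must contain some, but not all, ranked genes")

    hit_step = 1.0 / hits          # increment when a gene is in the set
    miss_step = 1.0 / (n - hits)   # decrement when it is not

    running = 0.0
    best = 0.0
    for gene in ranked_genes:
        running += hit_step if gene in gene_set else -miss_step
        if abs(running) > abs(best):
            best = running
    return best


# Illustrative, made-up gene labels: a set concentrated at the top of the
# ranking gives a positive score, one concentrated at the bottom a negative one.
ranking = ["g1", "g2", "g3", "g4", "g5", "g6"]
top_score = enrichment_score(ranking, {"g1", "g2"})
bottom_score = enrichment_score(ranking, {"g5", "g6"})
```

A score far from zero suggests that the set's members cluster at one end of the ranking. Judging whether that clustering is statistically significant requires a null distribution, which this sketch does not compute.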
GeneCruiser provides integrated access to the gene and microarray feature information freely available from public databases. It allows users to retrieve annotations from various databases (LocusLink, RefSeq, SwissProt) using Affymetrix probe identifiers, or convert genes from accession numbers or keywords into matching probe IDs on one or more microarray chips. GeneCruiser is a web service accessible through a web interface, and is also available as a GenePattern module.
The Connectivity Map (also known as cmap) is a collection of genome-wide transcriptional expression data from cultured human cells treated with bioactive small molecules, combined with simple pattern-matching algorithms. Together, these enable the discovery of functional connections between drugs, genes and diseases through the transitory feature of common gene expression changes. It is deployed through a web interface designed to let biologists, pharmacologists, chemists and clinical scientists use cmap without specialist expertise in the analysis of gene expression data.
The Integrative Genomics Viewer (IGV) is a high-performance visualization tool for interactive exploration of large, integrated datasets. It supports a wide variety of data types, including sequence alignments, microarrays, and genomic annotations.
The ICBP Data Portal is a web application where ICBP researchers can view a catalog of data created by ICBP members, search it by a comprehensive set of phenotypic attributes, and apply a number of common analyses and visualizations. The pilot portal focuses on data and analyses that support the characterization of the ICBP50 breast cancer cell lines. This focus is not intended to be the sole purpose of the portal; it was chosen because several ICBP centers use the ICBP50 in a wide variety of assay types.
GENE-E (Gene Experiment) is a Java desktop application developed to allow the rapid visual exploration of data sets derived from RNAi and chemical screens. GENE-E’s user interface is based on an interactive heat map that allows users to easily highlight and drill down to regions of interest. GENE-E was designed to display data in an integrated fashion. For example, users can simultaneously view dose response data for large numbers of cell lines and compounds or genetic perturbations, known mutations in the cell lines, category information for compounds (e.g., EGFR inhibitors), statistical information about the cell lines and compounds, and other metadata about the compounds, such as the source of the compound or RNAi construct.