RNA-seq Analysis

Overview

GenePattern offers a set of tools to support a wide variety of RNA-seq analyses, including short-read mapping, identification of splice junctions, transcript and isoform detection, quantitation, differential expression, quality control metrics, visualization, and file utilities. The tools released as GenePattern modules are widely used. We continue to release new and updated tools as they become available. To be informed when new capabilities are added, check this page or sign up for our Twitter feed.

How to Use the RNA-seq Tools

We recommend that you run these modules on a local GenePattern server, because their input files are typically large. You can upload your data and use the new file management features in GenePattern 3.6, but large files take a while to upload, depending on your connection speed, data size, and currently available bandwidth. Alternatively, if you have a GenomeSpace account with data already stored there, you can link it to your GenePattern account on the public GenePattern server and use the improved file management features in GenePattern 3.6.

COMPATIBILITY NOTE: A number of tools are built for Unix-based (Mac and Linux) systems and will not run on Windows machines. They are the Tuxedo suite tools (Bowtie, TopHat, Cufflinks, Cuffmerge, Cuffcompare, and Cuffdiff) and BWA.

You can install a local GenePattern server by doing the following:

  1. If you have not downloaded GenePattern and installed it on your local machine, follow the instructions on the Download GenePattern page.
  2. If you have already downloaded and installed a GenePattern server, you can install any of these modules from the GenePattern public repository, available from Modules & Pipelines > Install From Repository in the navigation bar of your GenePattern server.
  3. Enable the browsing of your GenePattern server's file path. This will allow you to send RNA-seq files to GenePattern modules without uploading them. See these instructions for more details.

Internal Broad Institute Server

Broad Institute members and collaborators can use the GPBroad server to send RNA-seq files directly to analysis modules. Community members can contact gp-help@broadinstitute.org to enable access to their RNA-seq files.

Reference Genomes

The TopHat, Bowtie, and BWA GenePattern modules provide pre-built reference genome indexes for a number of species. If you need an index for a species that is not hosted, email us at gp-help@broadinstitute.org. See this FAQ for more information on how to find other reference genome indexes.

Several of the modules accept reference genome annotation files (GTF files) and/or whole genome FASTA files.  A list of these is available on our FTP site:

To use one of these files in a GenePattern module, click the Specify URL radio button under the input box for the GTF file parameter, and paste in the URL for the annotation file you want to use.

RNA-seq Tools in GenePattern

Tuxedo Suite

GenePattern provides support for the Tuxedo suite of Bowtie, TopHat, and Cufflinks, as described in Trapnell et al. (2012) ("Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks").

Bowtie is a short-read aligner geared toward quickly aligning large sets of short DNA sequences (reads) to large genomes. For more information, please refer to the Bowtie documentation. The GenePattern Bowtie modules consist of the following tools:

TopHat is a fast splice junction mapper.  TopHat uses Bowtie to map RNA-seq reads to a reference genome, then analyzes the mapping results to identify splice junctions between exons. For more information about the algorithm, please refer to the TopHat documentation.

Cufflinks assembles transcripts, estimates their abundances, and tests for differential expression and regulation in RNA-seq samples. It accepts aligned RNA-seq reads and assembles the alignments into a parsimonious set of transcripts. Cufflinks then estimates the relative abundances of these transcripts based on how many reads support each one. For more information, please refer to the Cufflinks documentation. Cufflinks contains several accessory tools, including Cuffmerge, Cuffcompare, and Cuffdiff.
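As a rough illustration of read-based abundance, the sketch below computes FPKM (fragments per kilobase of transcript per million mapped fragments), the unit in which Cufflinks reports abundance. It applies only the plain FPKM definition; Cufflinks itself fits a statistical model that also handles reads shared between isoforms, so this is not the Cufflinks algorithm. The function name, transcript names, and counts are hypothetical.

```python
def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Plain FPKM definition: fragments per kilobase per million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("transcript length and library size must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (fragments per million)
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical data: transcript name -> (fragments assigned, length in bp)
counts = {"tx_A": (500, 2000), "tx_B": (500, 1000)}
total_fragments = 10_000_000  # hypothetical library size

values = {
    name: fpkm(count, length, total_fragments)
    for name, (count, length) in counts.items()
}

for name, value in values.items():
    print(f"{name}\t{value:.1f}")
```

Both hypothetical transcripts receive the same number of fragments, but the shorter one gets the higher FPKM because the measure normalizes by transcript length.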

BWA (Version in GenePattern: 0.5.9)

For more information, please refer to the BWA documentation. The GenePattern BWA modules consist of the following tools:

Scripture

Scripture is a method for transcriptome reconstruction that relies solely on RNA-seq reads and an assembled genome to build a transcriptome ab initio. Scripture has been implemented in GenePattern as a pipeline containing several of the functions wrapped as individual modules. Please note: the modules must be executed as part of the Scripture pipeline. For more information, please refer to the Scripture documentation. Available Scripture pipelines are:

RNA-SeQC

This module calculates useful metrics for determining the quality of RNA-seq data such as depth of coverage, rRNA contamination, continuity of coverage, and GC bias.  For more information, including a suggested workflow for preprocessing your data files, see the in-depth article about RNA-seq QC in GenePattern.

Integrative Genomics Viewer (IGV)

IGV is a visualization tool for interactive exploration of large, integrated genomic datasets. It supports a wide variety of data types including sequence alignments, microarrays, and genomic annotations. For more information, please refer to the IGV documentation.

Picard

The Picard tools are widely used utilities for manipulating SAM/BAM files, and we have wrapped a number of them for GenePattern. For more information on the SAM/BAM file format, see the SAMtools page. For more information about the Picard command-line tools, see the Picard site.

SAMtools

SAMtools is a widely used set of utilities for manipulating alignments in the SAM format, including sorting, merging, indexing, and generating alignments in a per-position format. We have started to wrap these tools for GenePattern and will continue to add SAMtools modules. For more information on the SAM/BAM file format or about the SAMtools utilities, see the SAMtools site.
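To make the sorting operation concrete, here is a minimal pure-Python sketch of a coordinate sort on SAM text. It is an illustration of the idea only, not how SAMtools is implemented and not a replacement for it. It assumes the standard SAM layout (header lines start with `@`, the reference name is the third tab-separated field, and the 1-based position is the fourth) and orders references as they appear in the `@SQ` header lines. The function name and the example records are hypothetical, and the sketch does not update any sort-order tag in the header.

```python
def sort_sam_by_coordinate(sam_text):
    """Return SAM text with alignment records ordered by (reference, position)."""
    header, records = [], []
    for line in sam_text.splitlines():
        if not line:
            continue
        (header if line.startswith("@") else records).append(line)

    # Reference order follows the @SQ lines of the header.
    ref_order = {}
    for line in header:
        if line.startswith("@SQ"):
            for field in line.split("\t")[1:]:
                if field.startswith("SN:"):
                    ref_order[field[3:]] = len(ref_order)

    def sort_key(record):
        fields = record.split("\t")
        rname, pos = fields[2], int(fields[3])
        # Unknown or unmapped ("*") references sort after all known ones.
        return (ref_order.get(rname, len(ref_order)), pos)

    return "\n".join(header + sorted(records, key=sort_key)) + "\n"


# Hypothetical example input: two references, four reads (one unmapped).
example_sam = "\n".join([
    "@SQ\tSN:chr2\tLN:1000",
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t0\tchr1\t50\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t0\tchr2\t300\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t0\tchr2\t10\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]) + "\n"

sorted_sam = sort_sam_by_coordinate(example_sam)
print(sorted_sam, end="")
```

For real data, the SAMtools modules are the appropriate choice, since they work on compressed BAM files and scale to inputs that do not fit in memory.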

Legacy Tool

ExprToGct: This module converts a file in EXPR format to GCT format. The EXPR file format is a tab-delimited format produced by Cufflinks version 1 (deprecated in Cufflinks version 2 and higher).
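The conversion can be sketched as follows. This is an illustrative approximation, not the ExprToGct module's actual implementation. It assumes the EXPR file has a header row containing `gene_id` and `FPKM` columns (an assumption about the Cufflinks version 1 output), and it writes a single-sample GCT version 1.2 file: a `#1.2` line, a line with the row and column counts, a header row, and one row per gene. The function name, the sample name, and the example data are hypothetical.

```python
import csv
import io


def expr_to_gct(expr_text, sample_name="sample"):
    """Convert tab-delimited EXPR text to single-sample GCT 1.2 text."""
    reader = csv.DictReader(io.StringIO(expr_text), delimiter="\t")
    # Assumed column names: "gene_id" and "FPKM".
    rows = [(record["gene_id"], record["FPKM"]) for record in reader]

    lines = [
        "#1.2",                                # GCT version line
        f"{len(rows)}\t1",                     # number of rows, number of samples
        f"Name\tDescription\t{sample_name}",   # column header
    ]
    # The gene id is reused as the description because EXPR has no description field.
    lines += [f"{gene}\t{gene}\t{value}" for gene, value in rows]
    return "\n".join(lines) + "\n"


# Hypothetical EXPR content with two genes.
example_expr = (
    "gene_id\tbundle_id\tchr\tleft\tright\tFPKM\n"
    "g1\t1\tchr1\t100\t900\t12.5\n"
    "g2\t2\tchr1\t1500\t2400\t0.75\n"
)

gct = expr_to_gct(example_expr, "sample_1")
print(gct, end="")
```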

Updated on June 26, 2013 13:50