GenePattern Frequently Asked Questions

1. Public Server/General

1.1. Visualizers on the GenePattern Public Server don't work for me. Why?
A.

GenePattern visualization modules require Java 1.5. Because the visualizers run on your local computer, you must have Java 1.5 installed. To determine your current Java version, type:

java -version
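
If you prefer to check the version from a script, the following is a minimal sketch. The function names are hypothetical, and it assumes the banner printed by java -version contains the text version "X.Y..." on its first line, which is the usual layout but is not guaranteed for every Java vendor.

```python
import re
import subprocess


def parse_java_version(banner):
    """Return (major, minor) parsed from `java -version` output, or None."""
    # Assumption: the banner contains: version "1.5.0_22" or similar.
    match = re.search(r'version "(\d+)(?:\.(\d+))?', banner)
    if match is None:
        return None
    return (int(match.group(1)), int(match.group(2) or 0))


def installed_java_version():
    """Run `java -version`; return None if Java is not on the PATH."""
    try:
        # `java -version` writes its banner to stderr, not stdout.
        result = subprocess.run(
            ["java", "-version"], capture_output=True, text=True
        )
    except FileNotFoundError:
        return None
    return parse_java_version(result.stderr)
```

A result of (1, 5) corresponds to Java 1.5.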

1.2. What is the latest version of GenePattern?
A.

For information about the latest version of GenePattern and its components, see the Release Notes.

1.3. Where can I find the hardware and software prerequisites for GenePattern?
A.

The Release Notes list hardware requirements, supported operating systems, and supported browsers.

1.4. Where can I find the GenePattern release notes?
A.

See the latest Release Notes.

1.5. How can I get help with GenePattern or provide feedback?
A.

In addition to this FAQ, the GenePattern team provides the following online resources:

  • Concepts provides a brief introduction to GenePattern and its primary objects: modules, pipelines, suites.
  • Quick Start provides a 10-minute tour of GenePattern.
  • The Tutorial provides an extended, 90-minute hands-on introduction.
  • The User Guide fully describes GenePattern and how to use it.
  • The Programmer's Guide provides guidelines for writing modules and instructions for accessing GenePattern from the Java, MATLAB, and R programming environments.
  • The Integration Guide provides guidance to system administrators interested in integrating GenePattern into the analysis tools at their site.
  • The Modules page lists the modules and pipelines available from the Broad Institute, with links to their documentation.
  • The File Formats Guide describes all file formats and provides instructions for creating input files.
  • The Release Notes describe new features in the current release.

To provide feedback or ask a question not addressed by the online resources, send email to gp-help@broadinstitute.org.

1.6. How do I cite GenePattern?
A.

To cite GenePattern, please use the following citation:
Reich M, Liefeld T, Gould J, Lerner J, Tamayo P, Mesirov JP (2006) GenePattern 2.0. Nature Genetics 38(5): 500-501. doi:10.1038/ng0506-500.

To cite a GenePattern analysis or visualization module, cite the GenePattern software and the original paper or other source for the module as specified in the module documentation. Documentation for each module is available on the Modules page and in GenePattern (click Help when prompted to enter the module's parameters).

1.7. If I am a member of the press, how can I get more information?
A.

If you are a member of the press and need additional information about GenePattern, please contact our Public Relations department.

2. Installation

2.1. How do I install (uninstall) GenePattern?
A.

To install GenePattern, go to GenePattern Download and follow the instructions for your operating system.

To uninstall GenePattern, use the utility provided as part of the GenePattern installation. If the GenePattern uninstall utility is unavailable, deleting the GenePattern installation folder removes all GenePattern files other than the desktop icons.

Mac users: If R2.5 is not already installed, GenePattern installs it in the /Library/Frameworks/R.framework/Versions/2.5 folder. Uninstalling GenePattern does not uninstall R. To uninstall R, use the utility provided by R.

2.2. How do I upgrade to the latest version of GenePattern without losing my modules/pipelines/suites?
A.

Install the new version of GenePattern into the same directory as your previous version. Do not uninstall first: uninstalling is unnecessary and deletes your existing modules, pipelines, and suites. When you overwrite the previous version:

  • Existing modules, pipelines and suites are preserved.
  • The following settings are read from your existing genepattern.properties file and displayed as default values for your new installation: settings for R, Java, Perl, LSID Authority, proxy settings, HSQL database URL and port, and file purge frequency and time.
  • The default value for the webserver (Tomcat) port used by the GenePattern server is always 8080. If your existing installation uses a different port number, you can specify that port number during the installation.
  • The require.password setting from your existing genepattern.properties file is preserved in your new installation.
  • Backup copies of the following configuration files are created: genepattern.properties[.backup], permissionMap.xml[.backup], and userGroups.xml[.backup]. To recreate your previous settings after installing GenePattern, compare the saved files with the newly installed files and modify the new files as necessary. Do not replace the newly installed configuration files with the saved copies.
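
One way to compare a saved configuration file with the newly installed one is sketched below. It is a simplified illustration, not a GenePattern tool: the function names are hypothetical, and the parser assumes plain key=value lines (no line continuations and no ":" separators), which may not cover every entry in genepattern.properties.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # or ! comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def changed_settings(backup_text, new_text):
    """Return {key: (old_value, new_value)} for settings that differ.

    A value of None means the key is missing on that side.
    """
    old = parse_properties(backup_text)
    new = parse_properties(new_text)
    return {
        key: (old.get(key), new.get(key))
        for key in sorted(set(old) | set(new))
        if old.get(key) != new.get(key)
    }
```

Reading genepattern.properties.backup and genepattern.properties into strings and passing them to changed_settings lists the settings you may want to carry over by hand.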

User groups: The userGroups.xml file for GenePattern 3.2 omits the group named Public. In GenePattern 3.2, all users are now in a predefined group named Public. To avoid confusion, do not recreate the group named Public.

R versions: Installing GenePattern 3.1 (or later) installs R2.5 and sets the full path to R2.5. See Using Different Versions of R for information on how to create and/or use GenePattern modules written for other versions of R.

2.3. Is the source code available?
A.

Yes. Source code for GenePattern and its modules is available under the GenePattern software license. For access to the source code, email the GenePattern team.

2.4. I already have R/Perl/Java on my machine. Will the versions of R/Perl/Java that GenePattern installs interfere with these?
A.

No. The R, Perl, and Java installations that come with GenePattern are installed within the GenePattern directory and do not affect any other versions that you may currently have.

2.5. Can I configure GenePattern to work with versions of R/Perl/Java other than those installed by GenePattern?
A.

You can configure GenePattern to work with other versions of R/Perl/Java; however, the versions of R, Perl, and Java bundled with GenePattern are the ones that have been fully tested. We cannot guarantee that other versions will work.

Java VM: If you install a GenePattern server without the Java VM, choosing instead to use a Java VM that you have already installed, ensure that the file tools.jar (provided by Sun separately from the JRE and JDK) is on your classpath. When you install a GenePattern server with an included VM, the GenePattern installation does this for you. If this file is not on your classpath, the MATLAB Component Runtime (MCR) Installer fails when you attempt to install a module that requires it.

R versions: GenePattern modules can be written for any version of R. For details on how to specify which version to use, see Using Different Versions of R.

2.6. Does GenePattern support the international settings on my computer?
A.

GenePattern supports the Basic Latin character set. Characters other than those in the Basic Latin character set may not be displayed correctly. Asian character sets are not currently supported.

All analysis and visualization modules support the decimal point (.) as the separator between the integral and fractional parts of a decimal number. Using a decimal comma (,) may cause unexpected behavior in some modules.
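
If your data files were exported with decimal commas, you can convert them before uploading. The following is a minimal sketch for one line of a tab-delimited file; the function names are hypothetical. It assumes that a field consisting of digits, one comma, and more digits is a decimal number, so a value such as 1,234 that was meant as a thousands-separated integer would be converted incorrectly.

```python
import re

# A number written with a decimal comma, for example "3,14" or "-0,5".
_DECIMAL_COMMA = re.compile(r"^([+-]?\d+),(\d+)$")


def normalize_decimal(field):
    """Replace a decimal comma with a decimal point; leave other text alone."""
    return _DECIMAL_COMMA.sub(r"\1.\2", field.strip())


def normalize_row(line):
    """Normalize every field of one tab-delimited line."""
    fields = line.rstrip("\n").split("\t")
    return "\t".join(normalize_decimal(field) for field in fields)
```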

2.7. I have no modules or pipelines installed. Why? How do I get them?
A.

The most likely reason is that you missed the module installation step, which opens in a browser window after the installer closes. To install the modules:

  1. Start GenePattern (http://localhost:8080/gp is usually the URL).
  2. Click Modules & Pipelines>Install from repository. GenePattern lists all of the new and/or updated modules for your system, with each one checked.
  3. Click Install Checked to install all available modules onto your GenePattern server, or select only those you are interested in.

2.8. I am behind a web proxy/firewall and my GenePattern server says it cannot connect to the module repository to load the modules. What do I do?
A.

If you did not indicate that you were behind a web proxy/firewall when you installed GenePattern, you must update the proxy settings for your server before you can install the modules:

  1. Start GenePattern (http://localhost:8080/gp is usually the URL).
  2. Click Administration>Server Settings to display the server settings.
  3. Click Proxy to display the proxy settings.
  4. On the Proxy Settings page, enter the hostname and port of your web proxy server. If you do not know them, contact your IT help desk. If you need to log into the proxy server, also enter your username and password. These are NOT saved to a file, so you must reenter them after a server restart the next time you want to connect.
  5. Click Save to update the proxy settings.
  6. Click Modules & Pipelines>Install from Repository to install the modules.

If you still cannot connect to the repository, email us at gp-help@broadinstitute.org.

2.9. I want to install GenePattern into our corporate/departmental/other Web server and not have GenePattern run in its own Web server. How do I install it?
A.

Use the WAR file installation. Instructions are available here.

2.10. When should I choose to install the GenePattern server on a different port than the default 8080?
A.

If you already have a server such as Tomcat running on this port, you need to install the GenePattern server on a different port to avoid conflicts.
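
To find out whether something is already listening on a port before you install, you could run a check like the one below. This is a sketch only: the function name is hypothetical, and it tests the local loopback address, so it will not notice a server bound only to another network interface.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing accepts connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(1.0)
        # connect_ex returns 0 when a listener accepted the connection.
        return probe.connect_ex((host, port)) != 0
```

For example, port_is_free(8080) returning False suggests choosing a different port for the GenePattern server.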

3. Configuration

3.1. How do I increase the memory allocated to the GenePattern server or client?
A.

See Increasing Memory Allocation.

3.2. How do I increase the memory allocated to a module?
A.

See Increasing Memory Allocation.

3.3. Can I run more than one instance of the GenePattern server on a machine?
A.

Yes. If you are running more than one installation of GenePattern on the same machine, the port numbers for the GenePattern server and the HSQL server must be unique to each installation. By default, the Tomcat server listens on two ports, 8080 (requests) and 8005 (shutdown), and the HSQL server listens on port 9001. All three ports must be changed for the second installation. For example, you can set the GenePattern server ports to 8080 and 8005 on one installation and 8081 and 8086 on the other, and set the HSQL port to 9001 on one and 9002 on the other. You can configure these port numbers when you install the server.
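
As an illustration, the uniqueness requirement can be checked with a few lines of code before you install the second server. The function name and the dictionary layout are hypothetical; the port numbers in the usage note are the ones from the example above.

```python
def port_conflicts(installations):
    """Return {port: [users]} for every port claimed more than once.

    installations maps an installation name to its ports, for example
    {"first": {"http": 8080, "shutdown": 8005, "hsql": 9001}}.
    """
    owners = {}
    for name, ports in installations.items():
        for role, port in ports.items():
            owners.setdefault(port, []).append("{}:{}".format(name, role))
    return {port: users for port, users in owners.items() if len(users) > 1}
```

With the first installation on 8080, 8005, and 9001 and the second on 8081, 8086, and 9002, the function returns an empty dictionary, meaning no conflicts.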

3.4. How do I configure GenePattern to use a grid engine?
A.

See Using a Queuing System for information on integrating GenePattern with queuing systems such as the Load Sharing Facility (LSF) and the Sun Grid Engine (SGE).

3.5. How do I configure the GenePattern server on a machine with multiple IP addresses? Can I keep the GenePattern URL from changing when the server hostname changes?
A.

Choose one hostname for the GenePattern server; for example, http://servername.domainname.edu:8080/gp/. Edit the genepattern.properties file and set the following properties:

  • GenePatternURL=http\://servername.domainname.edu:8080/gp/
  • GENEPATTERN_PORT=8080
  • gpServerHostAddress=servername.domainname.edu
  • fqHostName=servername.domainname.edu
  • fullyQualifiedHostName=servername.domainname.edu
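
If you maintain several servers, you might generate these lines rather than type them. The sketch below simply reproduces the five properties listed above for a given host name and port; the function name is hypothetical, and you still need to merge the output into genepattern.properties yourself.

```python
def host_properties(hostname, port=8080):
    """Return the genepattern.properties lines for one fixed host name."""
    # In the properties file the colon after "http" is escaped as "\:",
    # matching the GenePatternURL line shown above.
    url = "http\\://{}:{}/gp/".format(hostname, port)
    return [
        "GenePatternURL={}".format(url),
        "GENEPATTERN_PORT={}".format(port),
        "gpServerHostAddress={}".format(hostname),
        "fqHostName={}".format(hostname),
        "fullyQualifiedHostName={}".format(hostname),
    ]
```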

3.6. How do I modify the GenePattern session timeout interval?
A.

Session timeout is set in the Tomcat configuration file of the GenePattern server. To modify this setting for a local GenePattern server:

  1. Edit the Tomcat configuration file in the GenePattern installation directory: GenePatternServer/Tomcat/conf/web.xml.
  2. Modify the session-timeout property. Enter a value in minutes, where 0 disables session timeout. For example, to set the timeout to one day:

    <session-config>
       <session-timeout>1440</session-timeout>
    </session-config>

  3. Save the Tomcat configuration file.
  4. Restart the GenePattern server.
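
For scripted setups, the edit in step 2 could be automated along the following lines. This is a sketch under assumptions: the function name is hypothetical, and Python's ElementTree drops XML comments and may rewrite namespace prefixes when it writes the file back, so editing web.xml by hand is the safer choice if you want to keep the file's original layout.

```python
import xml.etree.ElementTree as ET


def set_session_timeout(web_xml_text, minutes):
    """Return web.xml content with every <session-timeout> set to minutes."""
    root = ET.fromstring(web_xml_text)
    changed = 0
    for element in root.iter():
        # Compare local names so the search works with or without a namespace.
        if element.tag.rsplit("}", 1)[-1] == "session-timeout":
            element.text = str(minutes)
            changed += 1
    if changed == 0:
        raise ValueError("no <session-timeout> element found")
    return ET.tostring(root, encoding="unicode")
```

Passing 1440 sets the timeout to one day, as in the example above.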

On the public GenePattern server, session timeout is set to four hours and cannot be modified by a user.

4. Data Formats

4.1. Does GenePattern support cDNA and other 2-channel microarray data?
A.

Yes. Most GenePattern analyses can run on 2-channel or ratio-based data as easily as on single channel or absolute value data. To run 2-channel data in GenePattern, do the following:

  • Convert your ratio-based data to a GenePattern GCT file. This tab-delimited text file format contains features (genes or probes), samples, and a computed ratio value for each feature in each sample.

  • GenePattern modules cannot analyze files with missing values. If your data has missing values, one way to address the issue is to use the ImputeMissingValues.KNN module to impute the missing values.

Your data is now in a GCT file that can be analyzed by most GenePattern modules. (If you want to use non-negative matrix factorization (NMF) and your data contains negative values, see the NMF note in the Modules & Pipelines section below.)
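
As a rough illustration of the conversion step, the sketch below writes ratio values in what we understand to be the GCT 1.2 layout: a version line, a line with the number of features and samples, a header row, and one tab-delimited row per feature. The function name is hypothetical, and you should check the File Formats Guide for the authoritative description before relying on it.

```python
def to_gct(sample_names, rows):
    """Build GCT text from rows of (name, description, [ratio values])."""
    lines = [
        "#1.2",
        "{}\t{}".format(len(rows), len(sample_names)),
        "\t".join(["Name", "Description"] + list(sample_names)),
    ]
    for name, description, values in rows:
        if len(values) != len(sample_names):
            raise ValueError("wrong number of values for " + name)
        # Missing values are rejected here; impute them first.
        if any(value is None for value in values):
            raise ValueError("missing value for " + name)
        lines.append("\t".join([name, description] + [str(v) for v in values]))
    return "\n".join(lines) + "\n"
```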

Ratio values for cDNA data can be computed using a variety of methods. How the ratios are computed determines whether it is possible to create a class (CLS) file for the cDNA ratio data. For example:

  • If ratios for all samples are computed against a common reference, as shown below, each sample can be assigned a distinct class and it is possible to create a class (CLS) file.

    normal sample (Cy3) / common reference (Cy5) = phenotype 1
    treated sample (Cy3) / common reference (Cy5) = phenotype 2


  • If ratios are computed by comparing conditions, as shown below, it may not be possible to create a CLS file.

    normal sample (Cy3) / treated sample (Cy5) = phenotype

If you cannot create a CLS file, you can analyze your data using modules that do not require class files (such as ConsensusClustering), but you will not be able to use modules that require the CLS file (such as ComparativeMarkerSelection).
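
For the common-reference case, a CLS file can be generated from a list of per-sample phenotype labels. The sketch below assumes the categorical CLS layout as we understand it: a first line with the number of samples, the number of classes, and the value 1; a second line starting with # that names the classes; and a third line with one zero-based class index per sample. The function name is hypothetical; see the File Formats Guide for the authoritative description.

```python
def to_cls(sample_classes):
    """Build CLS text from one class label per sample, in sample order."""
    names = []
    for label in sample_classes:
        if label not in names:
            names.append(label)  # classes are numbered in order of appearance
    indices = [str(names.index(label)) for label in sample_classes]
    header = "{} {} 1".format(len(sample_classes), len(names))
    return "\n".join([header, "# " + " ".join(names), " ".join(indices)]) + "\n"
```

The labels must be listed in the same order as the sample columns of the GCT file.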

4.2. Where can I find information about file formats used by GenePattern?
A.

Information on file formats supported by the modules currently in GenePattern is available in File Formats.

4.3. How can I convert between RES, GCT, and ODF formats?
A.

Run your file through PreprocessDataset. Select the desired output format for your file. If you only want to convert the file type without filtering, select "no filter" as the choice for the "filter flag" parameter.

4.4. How do I convert a file to GenePattern format?
A.

File Formats describes the file formats used in GenePattern and, where applicable, suggests methods for converting files to these formats.

4.5. How can I use CEL and MAGE-ML files in GenePattern?
A.

The ExpressionFileCreator module converts a set of individual CEL files into an expression data set that is usable by GenePattern modules. The MAGEMLImportViewer module imports data in MAGE-ML format into GenePattern.

5. Modules, Pipelines, and Suites

5.1. Can I keep my existing modules, pipelines, and suites when reinstalling or upgrading GenePattern to a new version?
A.

Yes. Simply install the new version of GenePattern into the same directory as your previous version. See FAQ 2.2 for more information.

5.2. I have installed a module/pipeline/suite, but I do not see it. What's wrong?
A.

This generally occurs for one of two reasons:

  • If two users install the same zip file, the second installation overwrites the first. The module contents (including the LSID) are identical, but the ownership and privacy settings are replaced. If the second user installs the module as private, it may be hidden from the user who originally installed it.
  • The same suite cannot be installed as a "private" suite for more than one user. If you install a private suite and do not see it, it may already be installed as a private suite by another user.

5.3. My pipeline requires an input file, but displays a file-not-found error when I enter a file name. What's wrong?
A.

Pipeline input files with spaces in their names may give file-not-found errors. If this happens on Windows, use the DOS "dir /x" command to get the 8.3 (short) version of the directory and file name, and use that instead of the long file name. If you are using a Unix-based platform, you may need to quote the file name parameters in the module's command line definition.
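
To see what a correctly quoted file name looks like on a Unix-based platform, you can use the quoting helper in Python's standard library. This only illustrates POSIX shell quoting; the function name is hypothetical, and how quotes must be written inside a GenePattern command line definition is not shown here.

```python
import shlex


def quoted_command(program_args, input_file):
    """Join a command and one input file into a shell-safe command line."""
    # shlex.quote leaves simple names alone and wraps names that contain
    # spaces or other special characters in single quotes.
    parts = [shlex.quote(arg) for arg in program_args]
    parts.append(shlex.quote(input_file))
    return " ".join(parts)
```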

5.4. What are the pipelines whose names start with Lu.Getz.Miska.Nature.June.2005?
A.

These pipelines are the actual analyses performed in the Nature paper, "MicroRNA Expression Profiles Classify Human Cancers" by Lu, Getz, Miska, et al. Running these pipelines exactly reproduces the paper's results. You can also adjust the parameters to see how the results would have changed if the authors had performed any step of their analyses differently. Additional information about the paper is available at http://www.broadinstitute.org/cancer/pub/miGCM.

5.5. How can I run non-negative matrix factorization (NMF) on data that contains negative values, such as log-ratio or unthresholded Affymetrix data?
A.

To run NMF on data that contains negative values, you must do the following (using the method of Kim, P. M. & Tidor, B. (2003) Genome Res. 13, 1706-1718):

  • Create one dataset with all negative numbers zeroed
  • Create another dataset with all positive numbers zeroed and the signs of all negative numbers removed
  • Merge the two (e.g., by concatenation). The resulting dataset is twice as large as the original but contains only positive values and zeros, and is therefore appropriate for NMF.

To do this in MATLAB, you can execute the following, where a is the original data:

    anew = [max(a,0); -min(a,0)];

We are currently developing a GenePattern module to perform this operation as well.
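
Outside MATLAB, the same transformation can be sketched in a few lines of plain Python. The function name is hypothetical, and the data is assumed to be a list of rows of numbers; the rows of the negative part are appended below the rows of the positive part, mirroring the vertical concatenation in the MATLAB expression.

```python
def split_signs(matrix):
    """Stack the positive part of a matrix on top of its negated negative part."""
    # Positive part: negative numbers are zeroed.
    positive = [[max(value, 0) for value in row] for row in matrix]
    # Negative part: positive numbers are zeroed and signs are removed.
    negative = [[max(-value, 0) for value in row] for row in matrix]
    return positive + negative
```

The result has twice as many rows as the input and contains no negative values.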

5.6. When I do a Hierarchical Clustering analysis, two files are produced, but the Hierarchical Cluster Viewer (JavaTreeView) looks like it needs three files. Do I need another one?
A.

No, you can use the two files that are created and leave the remaining input box blank. HierarchicalClustering creates a cdt file and one or two additional files: an atr file if you clustered by samples (columns), a gtr file if you clustered by genes (rows), or both atr and gtr files if you clustered by both samples and genes (columns and rows). The JavaTreeView module accepts the two or three files created by HierarchicalClustering.

5.7. How can I export a Heat Map image with gene annotations?
A.

The HeatMapViewer module currently does not include gene annotations with the saved image. Use the HeatMapImage module to include gene annotations.

5.8. Why do the scores from ComparativeMarkerSelection and ClassNeighbors differ?
A.

When computing the t-test or signal to noise ratio, ClassNeighbors thresholds the standard deviation to ensure that it is at least twenty percent of the mean. Additionally, if the standard deviation is zero, ClassNeighbors sets it to 0.1.
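
The adjustment can be sketched as follows. This is our reading of the description above, not the ClassNeighbors source code: it assumes the threshold uses the absolute value of the mean, that the zero check is applied after the twenty-percent threshold, and that the signal-to-noise ratio is the difference of the class means divided by the sum of the class standard deviations. The function names are hypothetical.

```python
def adjusted_std(mean, std, min_fraction=0.2, zero_floor=0.1):
    """Raise std to at least min_fraction of |mean|; replace zero with zero_floor."""
    adjusted = max(std, min_fraction * abs(mean))  # assumption: |mean| is used
    if adjusted == 0:
        adjusted = zero_floor
    return adjusted


def signal_to_noise(mean_a, std_a, mean_b, std_b):
    """Signal-to-noise ratio computed with the adjusted standard deviations."""
    noise = adjusted_std(mean_a, std_a) + adjusted_std(mean_b, std_b)
    return (mean_a - mean_b) / noise
```

Because the adjusted standard deviations are never smaller than the raw ones, scores computed this way are never larger in magnitude than scores computed without the adjustment, which is one way the two modules can disagree.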

5.9. I have used comparative marker selection to construct gene lists representing different experimental conditions. Is there a GenePattern module that can determine whether upstream non-coding motifs are overrepresented in those gene lists?
A.

Yes. You can use the GSEA module with the c3 (motif) gene sets. The GSEA module is documented on the Modules page.

5.10. How do I view the 3D visualization in the PCAViewer?
A.

You must install Java 3D (https://java3d.dev.java.net/binary-builds.html).

6. Module Creation and Integration

6.1. How can I retrieve external database information from GenePattern?
A.

The GenePattern server itself does not connect to any database, but modules can connect to databases and retrieve data from them. Existing examples include caArray (caArrayImportViewer) and Gene Expression Omnibus (GEOImporter). To connect to a database of your choice, write a simple command-line program that connects to the database and saves the retrieved data to a file, then install this program as a module in GenePattern (see Creating Modules).

6.2. My MATLAB figures are not appearing in the MATLAB visualizer I created. Why?
A.

When creating a MATLAB visualizer using MATLAB 7.0 compiled m-code (any release before 7.4), any figures that you create in MATLAB must have the Visible property set to on, or they will not be drawn to the screen.

6.3. Can my module use a different version of R than GenePattern?
A.

GenePattern modules can be written for any version of R. For details on how to specify which version to use, see Using Different Versions of R.

7. Programming Language Environments

7.1. Where can I find out more about how to launch GenePattern modules from other programming languages?
A.

The reference guide for accessing GenePattern modules from Java, MATLAB, and R is the GenePattern Programmer's Guide.

7.2. Can I use the GenePattern APIs to create a web service that programmatically accesses the GenePattern server?
A.

GenePattern is already based on a web services API, so you may not need to create a new web service for this purpose. The WSDL for the GenePattern server is available at

http://your_server:your_port/gp/services

To get an easy start on creating a web services client to GenePattern:

  1. Start GenePattern (http://localhost:8080/gp is usually the URL).
  2. Create a one-step pipeline (see Creating Pipelines).
  3. Select the pipeline that you just created. GenePattern displays the parameters (if any) for the pipeline.
  4. At the bottom of that page, select your favorite programming language and click View Code. GenePattern generates the code required to run the pipeline.
  5. Select Downloads>Programming Libraries to download the GenePattern programming library for your favorite programming language.
  6. Compile and run the generated code.

You can then modify the pipeline code to do what your application needs.

For more information about the programming libraries, see the GenePattern Programmer's Guide.

7.3. Why can't I call my pipeline/module from MATLAB?
A.

A pipeline or module with a period in its name cannot be called from MATLAB.

Other

If you haven't found what you are looking for, please send an email to gp-help@broadinstitute.org.