# GenePattern FAQ

### Other

If you haven't found what you are looking for, please send an email to gp-help(at)broadinstitute.org.

#### Where can I find the hardware and software prerequisites for GenePattern?

The Release Notes list hardware requirements, supported operating systems, and supported browsers.

#### How can I get help with GenePattern or provide feedback?

In addition to this FAQ, the GenePattern team provides the following online resources:

• Video tutorials
• Concepts provides a brief introduction to GenePattern and its primary objects: modules, pipelines, suites.
• Quick Start provides a 10-minute tour of GenePattern.
• The Tutorial provides an extended 40-minute hands-on introduction to expression analysis in GenePattern.
• The User Guide describes how to run analyses, create pipelines and generally work with GenePattern.
• The Administrators Guide describes how to configure a local or networked server. If you are using the GenePattern public server, you do not need this information.
• The Programmers Guide provides guidelines for writing modules and instructions for accessing GenePattern from the Java, MATLAB, and R programming environments.
• The Modules page lists the modules and pipelines available from the Broad Institute, with links to their documentation.
• The File Formats Guide describes all file formats and provides instructions for creating input files.
• The Release Notes describe new features in the current release.

To provide feedback or ask a question not addressed by the online resources, send email to gp-help(at)broadinstitute.org.

#### How do I cite GenePattern?

To cite GenePattern, please use the following citation:
Reich M, Liefeld T, Gould J, Lerner J, Tamayo P, Mesirov JP (2006) GenePattern 2.0. Nature Genetics 38(5):500-501. doi:10.1038/ng0506-500.

To cite a GenePattern analysis or visualization module, cite the GenePattern software and the original paper or other source for the module as specified in the module documentation. Documentation for each module is available on the Modules page and in GenePattern (click Help when prompted to enter the module's parameters).

#### What can I do if my job keeps running out of memory or fails with no error on the public server?

Jobs with large datasets, or with parameter settings that place a greater computational load on the system, can fail because they run out of memory. Usually you get an error message stating that the module ran out of memory, but sometimes the failure is "silent": the job fails, but there are no error files. In either case, GenePattern administrators can track down and resolve the problem. They can assign memory settings on a per-user basis, allowing your jobs to run with the amount of memory your analysis requires.

#### Are there other public GenePattern servers?

The public server maintained by the GenePattern development team can be found at http://genepattern.broadinstitute.org. There are also other GenePattern servers maintained by other organizations that have been made publicly available. These organizations include:

• SMD:Stanford Microarray Database
The SMD provides an extensive microarray database and has integrated with GenePattern to provide tools for the analysis of the data. More information about SMD can be found at http://smd.stanford.edu and the paper that describes the integration of GenePattern in their environment can be found here. Please send questions and comments regarding SMD to array(at)genome.stanford.edu.
• NuGO: NBX
NuGO has developed a Black Box environment that utilizes GenePattern as its preferred analysis tool and has deployed NuGO-modified versions of some GenePattern modules on the GenePattern servers installed on the Black Boxes. For more information about the NuGO NBX please visit http://www.nugo.org/NBX/. The paper that describes the modules NuGO has contributed to GenePattern can be found here. Any comments or questions regarding the NuGO NBX and the NuGO GenePattern modules should be directed to sian.astley(at)bbsrc.ac.uk.
• Garvan Institute
The Peter Wills Bioinformatics Centre at the Garvan Institute of Medical Research in Sydney, Australia, has set up a public GenePattern server here.

#### Is there a version of GenePattern that can run in the cloud?

Yes. The GenePattern team is doing final testing on a GenePattern Amazon Machine Image (AMI), which is expected to be available soon.

#### How do I install (uninstall) GenePattern?

To uninstall GenePattern, use the utility provided as part of the GenePattern installation. If the GenePattern uninstall utility is unavailable, deleting the GenePattern installation folder removes all GenePattern files other than the desktop icons.

Mac users: If R2.5 is not already installed, GenePattern installs it in the /Library/Frameworks/R.framework/Versions/2.5 folder. Uninstalling GenePattern does not uninstall R. To uninstall R, use the utility provided by R.

To upgrade, install the new version of GenePattern into the same directory as your previous version. Do not uninstall first: uninstalling is unnecessary and deletes your existing modules, pipelines and suites. When you overwrite the previous version:

• Existing modules, pipelines and suites are preserved.
• The following settings are read from your existing genepattern.properties file and displayed as default values for your new installation: settings for R, Java, Perl, LSID Authority, proxy settings, HSQL database URL and port, and file purge frequency and time.
• The default value for the webserver (Tomcat) port used by the GenePattern server is always 8080. If your existing installation uses a different port number, you can specify that port number during the installation.
• Backup copies of the following configuration files are created: genepattern.properties[.backup (before GenePattern 3.4) or .save (3.4 and up)], permissionMap.xml[.backup], and userGroups.xml[.backup]. To recreate your previous settings after installing GenePattern, compare the saved files with the newly installed files and modify the new files as necessary. Do not replace the newly installed configuration files with the saved copies.

User groups: The userGroups.xml file for GenePattern 3.2 omits the group named Public. In GenePattern 3.2, all users are now in a predefined group named Public. To avoid confusion, do not recreate the group named Public.

R versions: Installing GenePattern 3.1 (or later) installs R2.5 and sets the full path to R2.5. See Using Different Versions of R for information on how to create and/or use GenePattern modules written for other versions of R.

#### What is the recommended method of upgrading my GenePattern server to 3.3.3 or higher?

Large uploaded data files or output files will significantly slow down your GenePattern upgrade installation. If you have less than approximately 10 GB of data files either uploaded (via the Upload tab) or output by GenePattern jobs, you can just follow the GenePattern server installation instructions. However, if you have more than 10 GB of uploaded data files or output files, we suggest that you:

1. Move the Tomcat temp directory out of the GenePattern installation directory:

mv <GenePatternServer>/Tomcat/temp <GenePatternServer>_data/temp

2. Edit the following property in both StartGenePatternServer.lax and genepattern.properties:

java.io.tmpdir=<GenePatternServer>_data/temp

Replace <GenePatternServer> with an actual path.

3. Then follow the GenePattern server installation instructions.
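
To decide which of the two upgrade paths applies, it helps to measure how much data the server currently holds. The following is a minimal Python sketch, not part of GenePattern, that sums the file sizes under a directory and compares the total with the approximate 10 GB threshold mentioned above. The function names are illustrative, and which directories to measure depends on your installation.

```python
import os

TEN_GB = 10 * 1024 ** 3  # the approximate threshold mentioned above, in bytes


def directory_size_bytes(root):
    """Return the total size in bytes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symbolic links so that linked data is not counted twice.
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def needs_temp_relocation(root, threshold=TEN_GB):
    """Return True when the data under root exceeds the threshold."""
    return directory_size_bytes(root) > threshold
```

For example, calling `needs_temp_relocation` on the Tomcat temp directory of your installation returns True when moving the directory before the upgrade is worthwhile.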

#### Is the source code available?

Yes. Source code for GenePattern and its modules is available under the GenePattern software license. For access to the source code, email the GenePattern team.

#### I already have R/Perl/Java on my machine. Will the versions of R/Perl/Java that GenePattern installs interfere with these?

No. The R, Perl, and Java installations that come with GenePattern are installed within the GenePattern directory and do not affect any other versions that you may currently have.

#### Can I configure GenePattern to work with versions of R/Perl/Java other than those installed by GenePattern?

You can configure GenePattern to work with other versions of R/Perl/Java; however, the versions of R, Perl, and Java bundled with GenePattern are the ones that have been fully tested. We cannot guarantee that other versions will work.

Java VM: If you install a GenePattern server without the Java VM, choosing instead to use a Java VM that you have already installed, ensure that the file tools.jar (provided by Sun separately from the JRE and JDK) is on your classpath. When you install a GenePattern server with an included VM, the GenePattern installation does this for you. If this file is not on your classpath, the MatlabComponentRuntime (MCR) Installer fails when you attempt to install a module that requires it.

R versions: GenePattern modules can be written for any version of R. For details on how to specify which version to use, see Using Different Versions of R.

#### Does GenePattern support the international settings on my computer?

GenePattern supports the Basic Latin character set. Characters other than those in the Basic Latin character set may not be displayed correctly. Asian character sets are not currently supported.

All analysis and visualization modules support the decimal point (.) as the separator between the integral and fractional parts of a decimal number. Using a decimal comma (,) may cause unexpected behavior in some modules.

#### I am behind a web proxy/firewall and my GenePattern server says it cannot connect to the module repository to load the modules. What do I do?

If you did not indicate that you were behind a web proxy/firewall when you installed GenePattern, you must update the proxy settings for your server before you can install the modules:

1. Start GenePattern (http://localhost:8080/gp is usually the URL).
2. Click Administration>Server Settings to display the server settings.
3. Click Proxy to display the proxy settings.
4. On the Proxy Settings page, enter the hostname and port of your web proxy server. If you do not know them, contact your IT help desk to get the values. If you need to log into the proxy server, also enter your username and password. These are NOT saved to a file, so you must reenter them after a server restart the next time you want to connect.
5. Click Save to update the proxy settings.
6. Click Modules & Pipelines>Install from Repository to install the modules.

If you still cannot connect to the repository, email us at gp-help(at)broadinstitute.org.

#### I want to install GenePattern into our corporate/departmental/other Web server and not have GenePattern run in its own Web server. How do I install it?

You need to use the war file installation. Instructions are available here.

#### When should I choose to install the GenePattern server on a different port than the default 8080?

If you already have a server such as Tomcat running on this port, you need to install the GenePattern server on a different port to avoid conflicts.

#### How do I install GenePattern on a 64-bit Windows machine if I want to use a version before 3.2.2?

When GenePattern is installed on 64-bit Windows systems in the default C:\Program Files (x86) directory, modules fail because some code expects only "C:\Program Files" and truncates that location to "C:\Progra~1". There is a similar bug in ComparativeMarkerSelection. These errors are corrected in the 3.2.2 release of GenePattern. If you do not upgrade to release 3.2.2 or later, the workaround is to reinstall GenePattern in a directory that has no spaces in its name.

#### Why don't MATLAB modules work on my Windows 64-bit GenePattern server?

Installing MATLAB to "Program Files" or "Program Files (x86)" may cause the MATLAB installation to be incomplete. The installation will not fail, but modules will throw an exception, reporting a missing file and then fail. To fix this manually, re-run the MATLAB installer and choose a new installation directory that contains no spaces.

To do this you'll need to first uninstall MATLAB, else it will only let you "repair" the installation in the current directory. Steps to uninstall MATLAB are as follows:

1. Uninstall MATLAB (via Add/Remove Programs).
2. Rerun the .msi, found in GP/patches/...../*.msi, and choose a new install directory with no spaces, such as the default GenePattern install directory on Windows: C:\GenePatternServer.

If you've uninstalled GP and now need to uninstall the MATLAB that was installed via the now deleted GenePattern:

1. Reinstall GenePattern to a directory with no spaces
2. Install MATLAB module
3. Repair installation
4. Run the module to be sure MATLAB installed correctly
5. Use Add/Remove programs to uninstall MATLAB

You might need to reboot to get the system to let you uninstall the MCR files.

Or you can email us at gp-help(at)broadinstitute.org and we can send you the .msi which you can then point the uninstaller to, through Add/Remove Programs.

#### Why can't I connect to my GenePattern server on Windows 7 or Vista?

On Windows 7 and Vista, the StartGenePatternServer and StopGenePatternServer applications must be run as an administrator. To start or stop the GenePattern server, right-click on StartGenePatternServer.exe or StopGenePatternServer.exe and select Run as administrator.

To launch GenePattern in your browser, you can double-click the GenePatternHome.html icon located with the StartGenePatternServer and StopGenePatternServer icons.

#### How can I set up a GenePattern server for others to use remotely?

There are two useful sections of the GenePattern User Guide that explain how to do this:

#### Why doesn't clicking StartGenePatternServer launch GenePattern?

The StartGenePatternServer application only starts the server. To access the web client interface for your GenePattern server, click the GenePatternHome.html shortcut icon, or, if you did not install icons in your task bar or on your desktop, GenePatternHome.html can be found at the top level of your GenePattern install directory.

#### How can I get R to install correctly on my Mac?

Some Mac users have found that the R library is not installing correctly when they try to install GenePattern. Even after making sure that the folder into which GenePattern is installing R has write permissions, upon running a module, they receive the following error message:

java.io.IOException: Cannot run program "/Library/Frameworks/R.framework/Versions/2.5/Resources/bin/R": error=2,
No such file or directory while running R command [/Library/Frameworks/R.framework/Versions/2.5/Resources/bin/R,
--no--save, --quiet, --slave, --no-restore]

This may be a simple GenePattern server configuration problem. First, check that something is installed at that path. Open the Terminal.app and run the following commands:

ls /Library/Frameworks/R.framework/Versions/2.5

ls /Library/Frameworks/R.framework/Versions

If there is something installed at this path, then check that the path to R is correctly configured in your GenePattern server. Go to the Administration>Server Settings>Programming Languages GenePattern page and verify that:

R 2.5 Home: /Library/Frameworks/R.framework/Versions/2.5/Resources

If this is configured correctly, you may be able to correct the problem by manually downloading and installing R 2.5.

#### Can I run more than one instance of the GenePattern server on a machine?

Yes. If you are running more than one installation of GenePattern on the same machine, you must make sure that the port numbers for the GenePattern server and the HSQL server are unique to each installation. The Tomcat server listens on two ports, 8080 (requests) and 8005 (shutdown) by default, and the HSQL server listens on port 9001. All 3 ports need to be modified on the second copy of Tomcat. For example, you can set the GenePattern server port to 8080 and 8005 on one install and 8081 and 8086 on the other, and set the HSQL port to 9001 on one and 9002 on the other. You can configure these port numbers when you are installing the server.
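
A port plan for several installations can be checked for clashes before you install. The following Python sketch is an illustration only: the installation names and role labels are hypothetical, and the port numbers are the ones from the example above. It reports any port claimed by more than one installation.

```python
from collections import defaultdict


def find_port_conflicts(installations):
    """Map each port used by more than one installation to the names that claim it.

    installations: dict of installation name -> dict of role -> port number.
    """
    users = defaultdict(set)
    for name, ports in installations.items():
        for port in ports.values():
            users[port].add(name)
    return {port: sorted(names) for port, names in users.items() if len(names) > 1}


# Port plan from the example above; the installation names are hypothetical.
installations = {
    "gp_a": {"tomcat_request": 8080, "tomcat_shutdown": 8005, "hsql": 9001},
    "gp_b": {"tomcat_request": 8081, "tomcat_shutdown": 8086, "hsql": 9002},
}
print(find_port_conflicts(installations))  # prints {}
```

The check only compares the plan on paper. It does not test whether another program on the machine is already listening on one of the ports.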

#### How do I configure the GenePattern server on a machine with multiple IP addresses? Can I keep the GenePattern URL from changing when the server hostname changes?

Choose one hostname for the GenePattern server; for example, http://servername.domainname.edu:8080/gp/. Edit the genepattern.properties file and set the following properties:

• GenePatternURL=http\://servername.domainname.edu:8080/gp/
• GENEPATTERN_PORT=8080
• fqHostName=servername.domainname.edu
• fullyQualifiedHostName=servername.domainname.edu

#### How do I modify the GenePattern session timeout interval?

Session timeout is set in the Tomcat configuration file of the GenePattern server. To modify this setting for a local GenePattern server:

1. Edit the Tomcat configuration file in the GenePattern installation directory: GenePatternServer/Tomcat/conf/web.xml.
2. Modify the session-timeout property. Enter a value in minutes, where 0 disables session timeout. For example, to set the timeout to one day:

<session-config>
<session-timeout>1440</session-timeout>
</session-config>

3. Save the Tomcat configuration file.
4. Restart the GenePattern server.

On the public GenePattern server, session timeout is set to four hours and cannot be modified by a user.
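
To confirm the value currently configured on a local server, the session-timeout element can be read with a short script. This is a sketch using Python's standard XML parser. The sample document is a cut-down stand-in for Tomcat's web.xml, and the function name is illustrative.

```python
import xml.etree.ElementTree as ET


def session_timeout_minutes(web_xml_text):
    """Return the session-timeout value in minutes, or None if it is absent."""
    root = ET.fromstring(web_xml_text)
    for elem in root.iter():
        # Drop any namespace prefix such as "{http://...}session-timeout".
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "session-timeout" and elem.text:
            return int(elem.text.strip())
    return None


# A cut-down stand-in for Tomcat's web.xml, used only for illustration.
sample = """<web-app>
  <session-config>
    <session-timeout>1440</session-timeout>
  </session-config>
</web-app>"""

minutes = session_timeout_minutes(sample)
print(minutes, "minutes =", minutes / 60, "hours")  # prints "1440 minutes = 24.0 hours"
```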

#### How do I configure GenePattern to work with a queuing system (or grid engine)?

Queuing systems such as the Load Sharing Facility (LSF) and the Sun Grid Engine (SGE) allow computational resources to be used effectively. If you have such a queuing system installed at your site and you have installed a local GenePattern server, you can configure the GenePattern server to work with the queuing system. For instructions on how to do so, see Using a Queuing System.

#### How do I modify how often result files are deleted from my GenePattern server?

From your GenePattern server, go to Administration>Server Settings and click File Purge in the menu at the left. From here you can specify when analysis result files are deleted from the server:

• Use Purge Jobs After to specify the number of days the server keeps the analysis result files. To prevent the server from automatically deleting the files, set this value to -1.
• Use Purge Time to specify what time of day (24-hour format) the server deletes the files.

Click Save to save your changes. Click Restore to return to the values set at the installation.

Note: This setting can only be modified on local GenePattern servers for which you have administrative rights. You cannot change this setting on the Public Server.

#### I'm getting 'error "connection refused"': what is the problem?

A refused connection is most likely due to a proxy issue. If you are behind a proxy or firewall, verify that you have correctly configured GenePattern and, if necessary, ask your local system administrator to allow GenePattern access to your machine.

To configure a proxy connection in GenePattern please do the following:

1. In the GenePattern Web Client, click Administration>Server Settings to display the server settings.
2. Click Proxy to display the proxy settings.
3. On the Proxy Settings page, enter the hostname and port of your web proxy server. If you do not know them, contact your IT help desk to get the values. If you need to log into the proxy server, also enter your username and password. These are NOT saved to a file, so you must reenter them after a server restart the next time you want to connect.
4. Click Save to update the proxy settings.

#### Can I use a file path as input for a GenePattern module?

If you install your own GenePattern server, the default setting is not to allow input file paths. To change this, if you have administrator privileges on the server, add or edit the following in your genepattern.properties file:

allow.input.file.paths=true

Then restart your server. This will allow users to input an arbitrary network file path (such as file:///server/directory/file.gct) as the value for an input file parameter. When input file paths are allowed, you can use the server.browse.file.system.root property to set a root directory where the GenePattern server begins browsing for the specified network file path.

Note: On the Broad public server, we prevent users from entering an input file path (file://urls) as an input file for a module in order to better secure the machine running the public server.
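
The "add or edit" step for genepattern.properties can be scripted. The sketch below is a simplified illustration, not a GenePattern tool. It handles plain key=value lines only, ignores line continuations and the ":" separator that Java properties files also allow, and works on text, so reading and writing the file is left to you. The function name is hypothetical.

```python
def set_property(properties_text, key, value):
    """Return properties_text with key set to value, appending the line if key is absent."""
    new_line = f"{key}={value}"
    lines = properties_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        # Leave comment lines alone; match only "key=..." style assignments.
        if stripped.startswith(("#", "!")):
            continue
        name = stripped.split("=", 1)[0].strip()
        if "=" in stripped and name == key:
            lines[i] = new_line
            replaced = True
    if not replaced:
        lines.append(new_line)
    return "\n".join(lines) + "\n"


updated = set_property("allow.input.file.paths=false\n", "allow.input.file.paths", "true")
print(updated, end="")  # prints allow.input.file.paths=true
```

Back up the properties file before writing changes, and restart the server afterwards as described above.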

#### How can I work around a LaunchAnywhere error?

If you tried to install GenePattern on Ubuntu, you may have received an installation error: "An internal LaunchAnywhere application error has occurred and this application cannot proceed. (LAX)" with "java.lang.IllegalArgumentException: Malformed \uxxxx encoding." in the stack trace.

LaunchAnywhere can interfere with the prompt string formatter PS1. In order to work around this problem, you need to use the following command:

$ export PS1=">"
> sudo sh ./GPServer.bin

This is important not only for installing GenePattern on Ubuntu, but also for launching the GenePattern server. Run the export command before the GenePatternServer startup command, like so:

$ export PS1=">"
> ./StartGenePatternServer


#### Why aren't my memory intensive modules working on my 64-bit Windows machine?

32-bit Windows machines only allow you to allocate up to 1.2GB of RAM to processes. 64-bit Windows will allow for more (depending on how much you have installed), but to run memory-intensive Java modules, you must install 64-bit Java and update your GenePattern server's maximum memory allocation (Refer to this information on increasing memory allocation).

First, install 64-bit Java. Then, to configure GenePattern to use 64-bit Java:

1. Stop the GenePattern server.
2. Edit genepattern.properties (located in the resources subdirectory of the GenePattern server directory) so that the 64-bit Java installation location is now the Java parameter value: java=C\:/Program Files/Java/jre6/bin/java
3. Edit the configuration file GenePatternServer/StartGenePatternServer.lax.
4. Look for the entries noted below in this file and increase these values (for example, double the value) up to the maximum memory size of the machine you are using. (Note: Windows limits the total space available to a process to 2 GB. Some of that is used for overhead, so slightly less is really available to the JRE.)
• lax.nl.java.option.java.heap.size.initial
• lax.nl.java.option.java.heap.size.max
5. Start the GenePattern server.

#### What does an "unknown program" error mean?

If you tried to launch a visualizer with Internet Explorer and received the following error:

User Access Control, unknown program, file origins: downloaded from Internet.

this is a Java error. You can manually install the latest version of Java from java.com (Windows) or from Apple (Macintosh), or you can try another browser, such as Google Chrome.

#### Why does the Specify File Path or URL option not work in Internet Explorer?

This is a known issue: when users click the Browse Server File System button, the Internet Explorer web browser window (instead of a pop-up window) becomes the file system browser.

If you want to continue using Internet Explorer, you can copy and paste or manually enter the server file path rather than clicking the Browse Server File System button. We recommend using another browser for full functionality.

#### Does GenePattern support cDNA and other 2-channel microarray data?

Yes. Most GenePattern analyses can run on 2-channel or ratio-based data as easily as on single channel or absolute value data. To run 2-channel data in GenePattern, do the following:

• Convert your ratio-based data to a GenePattern GCT file. This tab-delimited text file format contains features (genes or probes), samples, and a computed ratio value for each feature in each sample.
• GenePattern modules cannot analyze files with missing values. If your data has missing values, one way to address the issue is to use the ImputeMissingValues.KNN module to impute the missing values.

Your data is now in a GCT file that can be analyzed by most GenePattern modules. (If you want to use non-negative matrix factorization (NMF) and your data contains negative values, see the NMF note in the Modules & Pipelines section below.)
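
The conversion step can be sketched in Python. The layout used here (a "#1.2" version line, a line with the row and column counts, a header line beginning with Name and Description, then one row per feature) is the GCT layout as commonly described; treat the File Formats guide as the authoritative definition. The function name and the sample values are illustrative only.

```python
def to_gct(sample_names, rows):
    """Build GCT (version 1.2) text.

    rows: list of (feature_name, description, values), one value per sample.
    """
    for name, _description, values in rows:
        if len(values) != len(sample_names):
            raise ValueError(f"{name}: expected {len(sample_names)} values, got {len(values)}")
        if any(v is None for v in values):
            raise ValueError(f"{name}: missing value; impute before converting")
    lines = [
        "#1.2",                                # format version line
        f"{len(rows)}\t{len(sample_names)}",   # number of features, number of samples
        "\t".join(["Name", "Description"] + list(sample_names)),
    ]
    for name, description, values in rows:
        lines.append("\t".join([name, description] + [str(v) for v in values]))
    return "\n".join(lines) + "\n"


# Made-up ratio values for two samples, for illustration only.
gct_text = to_gct(["sample1", "sample2"], [("geneA", "example gene", [0.5, -1.2])])
```

The sketch refuses rows with missing values rather than writing them, since the modules cannot analyze such files.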

Ratio values for cDNA data can be computed using a variety of methods. How the ratios are computed determines whether it is possible to create a class (CLS) file for the cDNA ratio data. For example:

• If ratios for all samples are computed against a common reference, as shown below, each sample can be assigned a distinct class and it is possible to create a class (CLS) file.

normal sample (Cy3) / common reference (Cy5) = phenotype 1
treated sample (Cy3) / common reference (Cy5) = phenotype 2

• If ratios are computed by comparing conditions, as shown below, it may not be possible to create a CLS file.

normal sample (Cy3) / treated sample (Cy5) = phenotype

If you cannot create a CLS file, you can analyze your data using modules that do not require class files (such as ConsensusClustering), but will not be able to use modules that require the CLS file (such as ComparativeMarkerSelection).

#### Where can I find information about file formats used by GenePattern?

Information on file formats supported by the modules currently in GenePattern is available in File Formats.

#### How can I convert between RES, GCT, and ODF formats?

Run your file through PreprocessDataset. Select the desired output format for your file. If you only want to convert the file type without filtering, select "no filter" as the choice for the "filter flag" parameter.

#### How do I convert a file to GenePattern format?

File Formats describes the file formats used in GenePattern and, where applicable, suggests methods for converting files to these formats.

#### How can I use CEL, MAGE-ML, and MAGE-TAB files in GenePattern?

The ExpressionFileCreator module converts a set of individual CEL files into an expression data set that is usable by GenePattern modules. The MAGEMLImportViewer module imports data in MAGE-ML format into GenePattern, and similarly, the MAGETABImportViewer module imports data in MAGE-TAB format into GenePattern.

#### I have installed a module/pipeline/suite, but I do not see it. What's wrong?

This generally occurs for one of two reasons:

• If the same zip file is installed twice, by two users, the second one overwrites the first one. While the bits are the same (including LSID), the ownership and privacy are subject to change and may end up hiding it from the module's original installer if the second installer installs it as private.
• The same suite cannot be installed as a "private" suite for more than one user. If you install a private suite and do not see it, it may already be installed as a private suite by another user.

#### My pipeline requires an input file, but displays a file-not-found error when I enter a file name. What's wrong?

Pipeline input files with spaces in their names may give file-not-found errors. If this happens, use DOS' "dir /x" command to get the 8.3 version of the directory and filename and use that instead of the long filename. If you are using a Unix-based platform, you may need to quote the filename parameters on the command line definition.
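
On Unix-based platforms, one way to see what a correctly quoted filename looks like is Python's standard shlex module. The file name below is hypothetical.

```python
import shlex

filename = "my expression data.gct"  # hypothetical file name containing spaces
quoted = shlex.quote(filename)
print(quoted)  # prints 'my expression data.gct' (including the single quotes)

# Splitting the quoted form the way a POSIX shell would yields one argument again.
assert shlex.split(quoted) == [filename]
```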

#### How can I run non-negative matrix factorization (NMF) on data that contains negative values, such as log-ratio or unthresholded Affymetrix data?

To run NMF on data that contains negative values, you must do the following (using the method of Kim, P. M. & Tidor, B. (2003) Genome Res. 13, 1706-1718):

• Create one dataset with all negative numbers zeroed
• Create another dataset with all positive numbers zeroed and the signs of all negative numbers removed
• Merge the two (e.g., by concatenation), resulting in a dataset twice as large as the original that contains only positive values and zeros, and is therefore appropriate for NMF.

To do this in MATLAB, you can execute the following:  anew=[max(a,0);-min(a,0)]; where a is the original data.
We are currently developing a GenePattern module to perform this operation as well.
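
For readers without MATLAB, the same transformation can be sketched in plain Python. This is an illustration of the three steps above, not a GenePattern module. The function name and the sample matrix are made up.

```python
def split_signs(matrix):
    """Stack the positive part of matrix on top of the sign-flipped negative part.

    matrix: list of rows, each a list of numbers.
    """
    # Dataset 1: negative numbers zeroed.
    positive_part = [[v if v > 0 else 0 for v in row] for row in matrix]
    # Dataset 2: positive numbers zeroed, signs of negative numbers removed.
    negative_part = [[-v if v < 0 else 0 for v in row] for row in matrix]
    # Concatenate by rows: the result has twice as many rows as the input.
    return positive_part + negative_part


a = [[1.5, -2.0], [-0.5, 3.0]]
print(split_signs(a))  # prints [[1.5, 0], [0, 3.0], [0, 2.0], [0.5, 0]]
```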

#### When I do a Hierarchical Clustering analysis, two files are produced, but the Hierarchical Cluster Viewer (JavaTreeView) looks like it needs three files. Do I need another one?

No, you can use the two files that are created and leave the remaining input box blank. HierarchicalClustering creates a cdt file and one or two additional files: an atr file if you clustered by samples (columns), a gtr file if you clustered by genes (rows), or both atr and gtr files if you clustered by both samples and genes (columns and rows). The JavaTreeView module accepts the two or three files created by HierarchicalClustering.

#### How can I export a Heat Map image with gene annotations?

The HeatMapViewer module currently does not include gene annotations with the saved image. Use the HeatMapImage module to include gene annotations.

#### Why do the scores from ComparativeMarkerSelection and ClassNeighbors differ?

When computing the t-test or signal to noise ratio, ClassNeighbors thresholds the standard deviation to ensure that it is at least twenty percent of the mean. Additionally, if the standard deviation is zero, ClassNeighbors sets it to 0.1.
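
The effect of this adjustment can be illustrated with a short Python sketch. Several details are assumptions made for illustration and are not taken from the ClassNeighbors source: the population standard deviation is used, the floor is applied to the absolute value of the mean, the zero check runs after the floor, and signal to noise is taken in its usual form, the difference of class means divided by the sum of class standard deviations.

```python
from statistics import mean, pstdev


def adjusted_std(values):
    """Standard deviation with the floor described above (details are assumptions)."""
    mu = mean(values)
    sigma = pstdev(values)
    # Floor the standard deviation at twenty percent of the (absolute) mean.
    sigma = max(sigma, 0.2 * abs(mu))
    # A standard deviation that is still zero is replaced by 0.1.
    if sigma == 0:
        sigma = 0.1
    return sigma


def signal_to_noise(class_a, class_b, std=adjusted_std):
    """Difference of class means divided by the sum of class standard deviations."""
    return (mean(class_a) - mean(class_b)) / (std(class_a) + std(class_b))
```

With constant expression in both classes, for example `[10, 10, 10]` and `[0, 0, 0]`, the unadjusted standard deviations are both zero and the unadjusted score is undefined, whereas the adjusted version returns a finite value. Differences of this kind are why the two modules can report different scores for the same gene.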

#### I have used ComparativeMarkerSelection to construct gene lists representing different experimental conditions. Is there a GenePattern module that can determine if there are upstream non-coding motifs overrepresented in those gene lists?

Yes. You can use the GSEA module with the c3 (motif) gene sets. The GSEA module is documented on the Modules page.

#### How do I view the 3D visualization in the PCAViewer?

You must install Java 3D (https://java3d.dev.java.net/binary-builds.html).

#### How do I resolve GISTIC errors?

Most errors reported by users running the GISTIC module are caused by a mismatch between the segmentation and markers files. If an error occurs, verify that all markers indicated in the segmentation file appear in the markers file and only those markers indicated by the segmentation file appear in the markers file.

The CBS and GLAD segmentation methods produce GISTIC-friendly marker positions. Partek's latest beta version also produces GISTIC-friendly marker positions. However, if you used an earlier version of the Partek algorithm to create the segmentation file, the algorithm did not report the exact physical position of the first and last markers of the segments. If you run GISTIC on a segmentation file generated using the earlier version of the algorithm, the physical positions of the marker file will not agree with the start or stop positions of the segmentation file. Note that Partek also uses the control probes in the generation of the CN/segmentation.

#### What does the GISTIC MATLAB error "Matrix dimensions must agree." mean?

??? Error using ==> plus
Matrix dimensions must agree.
Error in ==> make_D_from_seg at 158
Error in ==> run_gistic_from_seg at 58
Error in ==> gp_gistic_from_seg at 177
MATLAB:dimagree

If you are running GISTIC and get the error above in your stderr.txt file, you should verify that your segmentation file and markers file are exactly matched. Only the markers from the markers file should be indicated in the segmentation file and only those markers indicated by the segments should be in the markers file.

For example, the seg file should be

1-4
5-6

and markers file should be

1
2
3
4
5
6
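
The two-way check described above can be sketched in Python. This is a simplified illustration that works on values already read into memory; it does not parse real GISTIC files, whose column layouts are described in the GISTIC documentation. The function name and the tuple layouts are assumptions, and chromosome labels are assumed to be of one consistent type.

```python
def check_markers_match(segments, markers):
    """Return a list of problems; an empty list means the two inputs agree.

    segments: list of (chromosome, start, end) tuples.
    markers: list of (chromosome, position) tuples.
    """
    marker_set = set(markers)
    problems = []
    # Every segment boundary must be a marker listed in the markers file.
    for chrom, start, end in segments:
        for pos in (start, end):
            if (chrom, pos) not in marker_set:
                problems.append(f"segment boundary {chrom}:{pos} is not in the markers file")
    # Every marker must fall inside some segment.
    for chrom, pos in sorted(marker_set):
        covered = any(c == chrom and s <= pos <= e for c, s, e in segments)
        if not covered:
            problems.append(f"marker {chrom}:{pos} is not covered by any segment")
    return problems


# The example above: segments 1-4 and 5-6, markers 1 through 6, on one chromosome.
segments = [(1, 1, 4), (1, 5, 6)]
markers = [(1, p) for p in range(1, 7)]
print(check_markers_match(segments, markers))  # prints []
```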


#### How can I see the Color Scheme Legend in HeatMapViewer or HierarchicalClusteringViewer?

To see the Color Scheme Legend in either HeatMapViewer or HierarchicalClusteringViewer, select View>Color Scheme Legend. This legend also coordinates with HeatMapImage if you use the same parameters for HeatMapViewer and HeatMapImage.

Note that the issue of legends for row-normalized heat maps will be addressed in upcoming releases of HeatMapViewer and HierarchicalClusteringViewer.

#### ComparativeMarkerSelectionViewer is not launching, but HeatMapViewer is. What is wrong?

You may have a corrupted copy of the ComparativeMarkerSelectionViewer directory. To fix this, do the following:

1. Make sure your Java Preferences are set to display the console. To do so, open your Java Preferences (Mac) or Java Control Panel (Windows), go to the Advanced tab, and expand the Java console section. Select Show Console, if it is not already selected. You will need to restart your browser for this setting to take effect.
2. Attempt to launch ComparativeMarkerSelectionViewer again.
3. Look in the console for the directory where your viewer is being downloaded and executed. Look for a line like:
downloading URL ftp://ftp.broadinstitute.org/pub/genepattern/datasets/protocols/all_aml_test.preprocessed.gct
to directory /var/folders/+8/+8VwuO5ZH1S3hKW-uC7XEk0cTCY/-Tmp-/ComparativeMarkerSelectionViewer2436495742394955925.tmp
as all_aml_test.preprocessed.gct 
4. Remove the "to" directory. So, in this example we would delete /var/folders/+8/+8VwuO5ZH1S3hKW-uC7XEk0cTCY/-Tmp-/ComparativeMarkerSelectionViewer2436495742394955925.tmp
5. Try launching the viewer again. This should download a new copy of the executable and the viewer should display correctly.

(Note that bash shells do not display "-" files and directories, like "-Tmp-", so you may need to use a different shell to find the file path.)
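If you would rather not hunt for the directory by hand, the lookup in steps 3 and 4 can be sketched in Python. This is an illustration only: the helper names `find_download_dirs` and `remove_download_dir` are hypothetical, and the pattern assumes the console prints "to directory <path> as <file name>" in the form shown above. Save the console text to a string or file first, and review the extracted path before deleting anything.

```python
import re
import shutil
from pathlib import Path

# Matches "to directory <path> as <file>", allowing line breaks in between.
_TO_DIR = re.compile(r"to directory\s+(.+?)\s+as\s+\S+", re.DOTALL)


def find_download_dirs(console_text):
    """Return the unique 'to directory' paths in the console text, in order."""
    seen = []
    for match in _TO_DIR.finditer(console_text):
        path = match.group(1).strip()
        if path not in seen:
            seen.append(path)
    return seen


def remove_download_dir(path):
    """Delete the directory if it exists; return True if it was removed."""
    target = Path(path)
    if target.is_dir():
        shutil.rmtree(target)
        return True
    return False
```

Because the path is handled as a plain string, directory names that begin with "-" need no special treatment here.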

#### I get "??? Attempted to access rl(:,2); index out of bounds because size(rl)=[0,1]" in my stderr file when running GISTIC, what does this mean?

If your run of GISTIC fails with the error below in the stderr.txt file, check your segmentation file format. Please see the sections on the segmentation file format in the GISTIC documentation for more details and examples.

??? Attempted to access rl(:,2); index out of bounds because size(rl)=[0,1].
Error in ==> derunlength at 25
Error in ==> smooth_cbs at 148
Error in ==> run_gistic_from_seg at 125
Error in ==> gp_gistic_from_seg at 177
MATLAB:badsubscript

#### I get "??? Index exceeds matrix dimensions." in my stderr file when running GISTIC, what does this mean?

If your run of GISTIC fails with the error below in the stderr.txt file, check your markers file format. Please see the sections on the markers file format in the GISTIC documentation for more details and examples.

??? Index exceeds matrix dimensions.
Error in ==> check_if_has_header at 13
Error in ==> make_D_from_seg at 21
Error in ==> run_gistic_from_seg at 58
Error in ==> gp_gistic_from_seg at 179
MATLAB:badsubscript

#### Does GISTIC support SNP 6.0 data?

Yes, GISTIC supports the Affymetrix Human SNP 6.0 array.

#### Why can't I run the GenePattern visualizers?

1. Do you have Java installed on your computer? If you're not sure, try Java Tester to find out.
If not, you need to install the current version of Java, as the visualizers need it to run.

There is a known issue with Java 1.6.0_37 as well as some issues with Java 7 - please see our blog for more information.

1. What version of Java do you have installed? If you're not sure how to find this on your computer, try Java Tester to find out.
The visualizers require Java 1.5 or later. We have tested them with Java 1.5 and Java 1.6. If you have an earlier version, you will need to update it. (Note that there is a bug in the Java 1.6.9_16 build on Macintosh, so we recommend updating to _17 or later.)
2. Was there a pop-up window as you started the visualizer? The pop-up would have asked if you wanted to allow the applet to access your computer (on Macintosh) or if you wanted to run the application (Windows). Did you click Deny or Cancel?
If so, restart your browser, relaunch the viewer, and when the pop-up appears, click Allow (Macintosh) or Run (Windows).
3. Are you running your own server behind a firewall?
Make sure your proxy is configured correctly.
4. Set your Java Preferences to display the Java console.
    1. Open your Java Preferences (Mac) or Java Control Panel (Windows).
    2. Go to the Advanced tab and expand the Java console section.
    3. Select Show console (under the Java console section), if it is not already selected.
    4. Restart your browser so this setting will take effect.
5. Attempt to launch the visualizer again.
6. Look at the console output.
    • If you see a line like this:
    downloading URL ftp://ftp.broadinstitute.org/pub/genepattern/datasets/protocols/all_aml_test.preprocessed.gct
    to directory /var/folders/+8/+8VwuO5ZH1S3hKW-uC7XEk0cTCY/-Tmp-/ComparativeMarkerSelectionViewer2436495742394955925.tmp
    as all_aml_test.preprocessed.gct
    Then:
        1. Delete the "to" directory. In this example we would delete /var/folders/+8/+8VwuO5ZH1S3hKW-uC7XEk0cTCY/-Tmp-/ComparativeMarkerSelectionViewer2436495742394955925.tmp
        2. Relaunch the visualizer; this should download a new copy of the executable and the viewer should display correctly. (Note that bash shells do not display "-" files and directories, like "-Tmp-", so you may need to use a different shell to find the file path.)
    • If you see a line like this:
    Invalid or corrupt jarfile C:\Documents and Settings\Administrator\Local Settings\Temp\HierarchicalClusteringViewer\hcl-o.jar
    Then:
        1. Delete the folder from that location (in this example, we would delete the HierarchicalClusteringViewer folder).
        2. Clear your Java cache: in Java Preferences, go to the Network tab and click Delete Files... Clear the checkbox for Trace and Log Files, but leave the Applications and Applets checkbox selected. Click OK.
        3. Relaunch the visualizer.
    • If you see an error regarding -Xmx such as
    java.lang.ClassNotFoundException: Xmx2G
    Then:
        1. In GenePattern, click the My Settings link in the top right corner.
        2. Set the memory to 2G by entering
        -Xmx2G
        3. Click Save.
        4. Relaunch the visualizer.
7. If you are not able to get the Java console to come up, check to see if you have set your browser to reject third-party cookies:
    • In Firefox, go to Preferences and click the Privacy tab. Select the Accept third-party cookies checkbox.
    • In IE, select Tools>Internet Options and click the Advanced tab. Select the Accept third-party cookies checkbox.

Now go back to step 4 and try the console again.

#### Why is my module taking so long to run?

Some computationally intensive modules can take a day or more to run. Some examples are FLAMEMetacluster, NMFConsensusClustering, GISTIC, and GLAD. In addition, server load can affect queuing times on the Broad public server, which in turn affects how long a module takes to complete.

#### Why does Safari crash whenever I run a Java applet?

A recent update in Safari 5 (we observed the problem with 5.0.1 [5533.17.8]) has caused this problem for some users. The affected Java applets include the GenePattern visualizers and the GenePattern installer.

The solution is to:

1. Select Safari>Reset Safari...
2. Click Reset.
3. Open the Java Preferences app (Applications>Utilities>Java Preferences).
4. Click the General tab.
5. In the Java Applet Plugin section, click Restore Default.
6. Restart Safari.

#### What does a missing value error for ComparativeMarkerSelection mean?

If you receive the following errors while performing an analysis with ComparativeMarkerSelection:

Error in if (min(p) < 0 || max (p) > 1) {: missing value where TRUE/FALSE needed Execution halted

or

ERROR: The estimated pi0 <=0. Check that you have valid p-values or use another lambda method.

then a gene in your data has insufficient variation in its expression values. Use the PreprocessDataset module with a filter that is more stringent than you have previously used on your data set before running ComparativeMarkerSelection.
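To see which genes are likely to trigger these errors before re-running, you can look for rows with little or no variation. The sketch below is an illustration only and does not reproduce the PreprocessDataset filters: the function name `low_variation_rows` and the `min_stdev` threshold are hypothetical, and it assumes you have already loaded your expression values into a mapping from gene name to a list of numbers.

```python
import statistics


def low_variation_rows(rows, min_stdev=0.0):
    """Return the names of genes whose values barely vary across samples.

    rows:      mapping of gene name -> list of expression values
    min_stdev: genes with a population standard deviation at or below
               this value are flagged (0.0 flags only constant rows)
    """
    flagged = []
    for name, values in rows.items():
        # A row with fewer than two values cannot show any variation.
        if len(values) < 2 or statistics.pstdev(values) <= min_stdev:
            flagged.append(name)
    return flagged
```

Genes flagged here are candidates for removal by a more stringent variation filter in PreprocessDataset.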

#### Why does ExpressionFileCreator fail?

If ExpressionFileCreator fails on your local server, but works on the public server, you need a more recent version (version 8 or 9) of ExpressionFileCreator. Only version 7 (and earlier versions) is available in the public repository (via Pipelines & Modules>Install from repository) because versions 8 and 9, which support the updated CEL file formats, require R 2.8, and GenePattern installs with R 2.5.

You can either use ExpressionFileCreator on the GenePattern public server or install a more recent version of ExpressionFileCreator on your local server.

Instructions for installing ExpressionFileCreator on your local server are available at ftp://ftp.broadinstitute.org/pub/genepattern/public_module_installation/efc_install_instructions.txt.

#### What does "Could not obtain CDF environment" mean?

If ExpressionFileCreator gives you the following error when you try to convert Affymetrix CEL files to GCT format:

Error in getCdfInfo(object) :
Could not obtain CDF environment, problems encountered:
Specified environment does not contain MoGene-1_0-st-v1
Library - package mogene10stv1cdf not installed
Bioconductor - mogene10stv1cdf not available
Calls: parseCmdLine ... .local -> indexProbes -> indexProbes -> .local -> getCdfInfo
Execution halted


then the CDF for your array was not found in the Broad-hosted CDF library. You need to use a custom CDF to support the conversion. CDF files are available from the Affymetrix website. For instance, if you were analyzing the Mouse Gene 1.0 ST Array, you could search for that term on the Affymetrix page. On the result page, you can find your CDF file under the Library Files section.

Provide this CDF file as the input for the cdf file parameter in ExpressionFileCreator.

#### What does the GISTIC error, "Invalid file identifier" mean?

The usual cause of this error is spaces in any of the input file names.

#### Why does the "no such module" error occur for a module on the server?

If you run an imported pipeline on your own GenePattern server, and you get the error, "No such module [module name]", when you know you have that module on your server, then the pipeline requires a version of the module that is not on your server. If you return to the pipeline page and click Properties, you can view the modules that are required but not installed. If you install these module versions from the repository, the pipeline will run.

#### How can I properly view my GCT or RES file in IGV?

The default IGV display option for a GCT or RES file is the Heatmap. For the heatmap to make sense, the data must be row-centered, scaled and possibly have a threshold applied.

For complete information, see the in-depth article Using IGV Through GenePattern.

#### Why is nothing happening when I try to upload my large file?

There are limitations on file upload size. Files uploaded via the Browse button on the module input page must be under 1.2 GB. To use larger files, there are a few options:

• Download and install GenePattern on a local machine. Put your files on a server that is accessible to your GenePattern server – that is, on the same file system or via a network share – and use the file path as input for the GenePattern modules. (Note: you will have to enable file paths on your server.)
• Put your files on a web-accessible machine or FTP site and specify a URL or FTP address for the input file. Make sure that the machine you use is accessible to the GenePattern server.

#### What does "Error in subfiles: subscript out of bounds" mean?

This error can be produced if there are hidden files or directories in the ZIP archive. This usually occurs on a Mac when using the "Compress" option from the right-click pop-up menu. If this is the case, you may want to use the zip command from the terminal window to zip files instead. If you didn't Compress on a Mac, then you should check that there are no hidden files in the ZIP archive.
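One way to check an existing archive is to list its entries with Python's standard `zipfile` module. The following is a minimal sketch; the function name `suspicious_zip_entries` is hypothetical, and treating names that start with "." and the `__MACOSX` folder as "hidden" is an assumption about what the Mac "Compress" option typically adds.

```python
import zipfile


def suspicious_zip_entries(zip_path):
    """Return entry names that are directories, nested in one, or hidden."""
    flagged = []
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            parts = [p for p in name.split("/") if p]
            is_dir = name.endswith("/")          # explicit directory entry
            nested = len(parts) > 1              # file stored inside a directory
            hidden = any(p.startswith(".") or p == "__MACOSX" for p in parts)
            if is_dir or nested or hidden:
                flagged.append(name)
    return flagged
```

An empty result suggests the archive is flat and free of hidden entries; anything listed should be removed or the archive rebuilt.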

#### How can I pre-process my RNA-seq data for IGV?

The recommended format for RNA-seq data in IGV is the BAM file. The SortSam module takes a SAM or BAM file as input; it can sort and index the file, and can convert a SAM file to BAM.

#### Does GenePattern support SNP 6.0?

No. The GenePattern team is presently working to support SNP 6.0.

#### Why did the module I tried fail to run with my ZIP file as input?

If your ZIP file has a directory in it, GenePattern cannot resolve it. Unfortunately, if you generated your ZIP archive using the Finder on the Macintosh OS, the Mac builds a directory structure into your ZIP archive and GenePattern cannot resolve it. To zip on a Mac, use the zip command from a terminal window; for example, if you wanted to create a ZIP archive called "all_foo" that contains the files all_foo.cls and all_foo.gct, you could use the following command:

zip all_foo all_foo.cls all_foo.gct

#### Why did my GenePattern job fail?

The first place to look for the reason is the stderr.txt file, which should be available in the job summary or job status page. This file often contains plain text indicating what went wrong with a job, such as formatting or filtering errors. If you find that this file does not help you resolve the error, please contact us at gp-help(at)broadinstitute.org.

#### How can I use the RNA-seq modules available in GenePattern?

You can use the RNA-seq modules either on the Broad public server or by installing them on a GenePattern server installed on your machine or a network-accessible server.

#### If You Choose to Run RNA-seq Modules on Your Own GenePattern Server

If you have not installed GenePattern on your local machine, instructions for installing a local GenePattern server are provided on the Download GenePattern page.

If you have already installed a GenePattern server, select Modules & Pipelines>Install from repository. The page will present all available modules. You only need to select the checkboxes for the modules you want and click Install Checked.

Note: The main analysis RNA-seq modules (Bowtie, BWA, Cufflinks, TopHat, and Scripture) currently only run on Macintosh and Linux. If you do not have access to machines with these operating systems, you can use the modules on the Broad public server. The conversion/utility modules that are related to the RNA-seq modules are available for Macintosh, Linux, and Windows.

You may find it helpful to enable your GenePattern server to accept file paths in order to handle large input files that are already present on the system where your local server is installed. To do this, edit genepattern.properties (located in the resources directory under your GenePattern server directory) and make allow.input.file.paths=true. This allows users to input a network file path (such as file:///server/directory/file.gct) as the value for an input file parameter. When this value is set to true, you can define a root directory where the GenePattern server begins browsing for network files by setting server.browse.file.system.root to the root directory you want to specify.

Example: In genepattern.properties, setting server.browse.file.system.root=/Users/mydata/ngs will cause the browser window to open to /Users/mydata/ngs when a user chooses Specify File Path or URL.

#### Why is my GenePattern job stuck in the PENDING state?

There are a few reasons why this might occur. Jobs are often PENDING because GenePattern is a shared resource. When your job is in the PENDING state, it means that it is waiting in the queue behind other jobs for the GenePattern server to submit the job to the server farm. Jobs that use large files and access them via an external URL may hold up the line while those files are transferred to the GenePattern server, even keeping jobs that normally take a few seconds in PENDING.

The job will run when the queue clears up.

#### Why do I receive an error when running my preprocessed GCT file and CLS file in ComparativeMarkerSelection?

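If you tried to run your preprocessed GCT file and CLS file in ComparativeMarkerSelection, but it gives you the following error:

An error occurred while reading the file ClassFile.cls.

Cause: Header line needs three numbers!

then the first line of your CLS file is malformed. The header line of a CLS file must contain three numbers: the number of samples, the number of classes, and 1. See the File Formats Guide for details.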
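A quick way to test the header yourself is sketched below. It relies on the CLS format's first line holding three whitespace-separated numbers (number of samples, number of classes, and 1); the function name `check_cls_header` is hypothetical and the checks are a simplification of whatever the module itself performs.

```python
def check_cls_header(first_line):
    """Validate a CLS header line and return (num_samples, num_classes).

    Raises ValueError with a short explanation if the line is malformed.
    """
    fields = first_line.split()
    if len(fields) != 3:
        raise ValueError(
            f"CLS header needs three numbers, found {len(fields)} field(s)"
        )
    try:
        num_samples, num_classes, last = (int(f) for f in fields)
    except ValueError:
        raise ValueError("CLS header fields must all be integers") from None
    if last != 1:
        raise ValueError("third CLS header number should be 1")
    return num_samples, num_classes
```

For example, passing the first line of your file (read with `open(path).readline()`) either returns the two counts or tells you which part of the header to fix.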

#### Why did I get a warning stating that my index is older than my BAM file?

If you try to run an indexed BAM file through a module and receive a warning that your index file (BAI) is older than your BAM file, it means that the timestamps for these files are out of sync. If you receive this warning, you should index your BAM file by using the SortSam module.
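You can confirm the timestamp mismatch locally before re-indexing. This is a minimal sketch that compares file modification times; the function name `index_is_stale` is hypothetical, and comparing modification times is an assumption about how the warning is triggered.

```python
import os


def index_is_stale(bam_path, bai_path):
    """Return True if the index (BAI) was last modified before the BAM file."""
    return os.path.getmtime(bai_path) < os.path.getmtime(bam_path)
```

If this returns True, regenerate the index so that it is newer than the BAM file it describes.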

#### How do I run several files through a set of modules in parallel?

You can do this by creating a pipeline for the jobs you want to run in parallel.

Then you can submit your set of data files to the pipeline as a batch job. For more information, see Batch Processing.

There are additional features that make it easier to work with large input files and to run batches of jobs in parallel:

• This in-depth article discusses working with large input files: Using Large Files in GenePattern
• GenePattern also has a feature (disabled by default) that allows you to access input files on the server's file system. With this feature turned on, you don't need to directly upload your input files via the job input form. See Using File Paths for details.
• GenePattern has a programming interface (with versions for Java, R, and MATLAB) that allows you to submit your jobs in parallel. See the Programmers Guide for more details.

#### How do I zip my files for use in GenePattern?

On Windows, you need to select the files to be added to the ZIP archive (hold down the Control or Shift key while selecting to select a group). Then right-click on the group and select WinZip (or whichever zip application you have on your machine). Do not select a folder and zip it – that will create a directory inside the ZIP archive; if your ZIP archive has a directory in it, GenePattern cannot resolve it.

On Macintosh, if you generate your ZIP archive using the Finder, the Mac builds a directory structure into your ZIP archive and GenePattern cannot resolve it. To zip files on a Mac, use the zip command from a terminal window (launched from Applications/Utilities); for example, if you wanted to create a ZIP archive called "all_foo" that contains the files all_foo.cls and all_foo.gct, you could use the following command:

zip all_foo all_foo.cls all_foo.gct

If you follow these instructions and find that GenePattern does not accept your ZIP file, check for spaces in the names of the files or hidden files in the ZIP archive. If you cannot locate the issue with your ZIP file, please contact us. The GenePattern team plans to develop a ZIP module to help users with creating ZIP archives.
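As an alternative that works the same way on Windows, Macintosh, and Linux, a flat archive can be built with Python's standard `zipfile` module. This is a sketch under stated assumptions: the function name `make_flat_zip` is hypothetical, every file is stored under its base name only (so no directory ends up inside the archive), and files whose names start with "." are skipped as hidden.

```python
import os
import zipfile


def make_flat_zip(zip_path, file_paths):
    """Create a ZIP archive with every file stored at the top level."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in file_paths:
            name = os.path.basename(path)
            if name.startswith("."):
                continue  # skip hidden files such as .DS_Store
            # arcname drops the directory part, so the archive stays flat.
            archive.write(path, arcname=name)
    return zip_path
```

For example, `make_flat_zip("all_foo.zip", ["all_foo.cls", "all_foo.gct"])` produces an archive comparable to the zip command above. Two input files that share a base name would collide, so give them distinct names first.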

#### How can I easily run the same analysis on many different data files?

As of GenePattern 3.3.3, GenePattern supports batch jobs. To use this feature:

2. Click the arrow next to the uploads directory, name a subdirectory, and click Create.
3. Click the arrow next to your subdirectory and select Upload. This launches the GenePattern file uploader.
4. Click Add in the top of the uploader window and select all the files you want to run as a batch.
5. Click the upload arrow. This will upload all your files into the subdirectory you just made on the GenePattern server. Do not close the uploader window while the file upload is in progress.
6. Once the files are uploaded, click the blue arrow next to the directory containing the files, and select the module or pipeline you want to use for your analysis.
7. If there is more than one input file field you need to populate, you can select "send to as batch" for those parameters that accept batch inputs. Make sure that all the files for a given analysis, which need to be paired, have the same name; for instance, "file1.gct" would be processed with "file1.cls".
8. Run the module.

The module will be run once for each file selected. All the job results for the batch will be listed under a single batch ID.
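Because paired batch inputs are matched by name, it can help to check the pairing before uploading. The sketch below is an illustration only: the function name `pair_batch_files` is hypothetical, and it assumes that "the same name" means the same file name with the extension removed, as in the file1.gct and file1.cls example above.

```python
from collections import defaultdict
from pathlib import PurePath


def pair_batch_files(file_names, extensions=(".gct", ".cls")):
    """Group files by base name.

    Returns (complete, incomplete): complete maps each base name to a tuple
    of file names in the order of `extensions`; incomplete is a sorted list
    of base names that are missing at least one expected extension.
    """
    groups = defaultdict(dict)
    for name in file_names:
        path = PurePath(name)
        suffix = path.suffix.lower()
        if suffix in extensions:
            groups[path.stem][suffix] = name
    complete = {
        stem: tuple(found[ext] for ext in extensions)
        for stem, found in groups.items()
        if len(found) == len(extensions)
    }
    incomplete = sorted(
        stem for stem, found in groups.items() if len(found) != len(extensions)
    )
    return complete, incomplete
```

Any base name reported as incomplete would have no partner file in the batch and should be renamed or removed before you run the module.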

#### Why can't I use a directory as input for all modules?

As of GenePattern 3.3.3, GenePattern supports the use of directories as input for modules; however, not all modules support this feature.

A few quick ways to tell if a module does accept directories are:

• Click on the arrow next to a directory in the Uploads tab; the modules listed in that drop-down will accept directories as input.
• Check the caption under the input parameter; if the module accepts directories as input, it will indicate that here.
• Check the module documentation (available from the help link in the upper righthand corner of the module's page); the input parameters section will make it clear if a directory is accepted as input for the module.

#### Does ExpressionFileCreator support Exon arrays?

ExpressionFileCreator does not currently support Exon arrays. The GenePattern module development team is working on a module for this.

#### How do I format my GenePattern output for submission to GEO?

There are currently no modules in GenePattern for submitting data to GEO. The NCBI has webtools for this purpose, such as GEOarchive.

#### How do I get a heat map with a high enough resolution for publication?

To generate a new heat map image at a resolution near 300 dpi, you can:

1. Select the HeatMapImage module in GenePattern.
2. Change both column size and row size to 33 pixels.
3. For best results, change show grid to "no". (The grid does not scale as much as the column/row size does, and so may look suboptimal for print publication.)
4. Generate your heat map image. Open your heat map image file in an image manipulation application that can scale images (like Adobe Photoshop or GIMP) and increase the image resolution to 300 dpi. This shrinks the printed size of the image by a factor of about 4 (which is why you enlarged the image above) and leaves it at a resolution of 300 dpi, which is optimal for print publication.

If you already have a 72 dpi heat map image that you cannot recreate for some reason, you can use an image manipulation application that can scale images (like Adobe Photoshop or GIMP) to increase the resolution to 180 dpi. This will shrink the printed image to less than half its original size (72/180 = 40%), but 180 dpi is usually the minimum resolution necessary for print publication.
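The arithmetic behind these numbers is simple: the printed size of an image is its pixel count divided by the resolution, so raising the resolution without adding pixels shrinks the print. A small sketch, with hypothetical function names, assuming the starting image is at 72 dpi as described above:

```python
def printed_size_inches(pixels, dpi):
    """Printed length in inches of an image dimension at a given resolution."""
    return pixels / dpi


def scale_factor(source_dpi, target_dpi):
    """How many times smaller the print becomes when only the dpi is raised."""
    return target_dpi / source_dpi


# Going from 72 dpi to 300 dpi shrinks the print by a factor of about 4.2,
# and going from 72 dpi to 180 dpi shrinks it by a factor of 2.5.
to_300 = scale_factor(72, 300)
to_180 = scale_factor(72, 180)
```

For instance, a heat map 720 pixels wide prints 10 inches wide at 72 dpi but only 4 inches wide at 180 dpi, which is why the row and column sizes are enlarged before the resolution is increased.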

#### I am running a large number of RNA sequencing jobs, and I'd like to be able to look at the quality of the data. Is there a tool I could use for this?

Yes: the RNAseQC module in GenePattern calculates standard RNA-seq related metrics, including depth of coverage, ribosomal RNA contamination, continuity of coverage and GC bias. See the module documentation for the recommended data processing workflow for optimal use of this QC analysis.

#### Is there an easier way to create a CLS file than creating it by hand in a text editor?

Try the ClsFileCreator module in GenePattern. The ClsFileCreator is a wizard-based tool that can be used to create class label (CLS) files from array data in the GCT or RES file formats.

#### How can I install GISTIC_2.0 on my own GenePattern server?

To install GISTIC, follow the instructions below. Note that you must install on a 64-bit Linux machine.

First, export GISTIC from the public GenePattern server: select GISTIC from the list of modules, then click the export link to save the module in a ZIP file.

If you do not already have MATLAB installed, you will need to install it. An executable and instructions can be found on the GISTIC 2.0 publication page.

Once MATLAB is installed, you will need to add lines like the following to your <GenePatternServer>/resources/genepattern.properties or custom.properties file:

and

APPLERES_DIR=/xchip/sqa/Modules/GISTIC/GISTIC2.0_stdAlone/MATLAB_Component_Runtime/v714/X11/app-defaults

Then restart your GenePattern Server. (To avoid restarting, enter these via the Administration>Custom page.)

Then connect to your GenePattern Server, and import GISTIC from the zip file you just exported.

If you then wish to run it from the command line, please use one of the programming interfaces, as described in the Programmers Guide.

#### How can I retrieve external database information from GenePattern?

The GenePattern server itself does not connect to any database, but modules can be, and have been, written to connect to databases and retrieve data from them; examples include caArray (caArrayImportViewer) and Gene Expression Omnibus (GEOImporter). To connect to a database of your choice, write a simple command-line program that connects to the database and writes the retrieved data to a file, then install this program as a module in GenePattern (see Creating Modules).

#### My MATLAB figures are not appearing in the MATLAB visualizer I created. Why?

When creating a MATLAB visualizer using MATLAB 7.0 compiled M-code (any release before 7.4), any figures that you create in MATLAB must have the visible property set to on or they will not be drawn to the screen.

#### Can my module use a different version of R than GenePattern?

GenePattern modules can be written for any version of R. For details on how to specify which version to use, see Using Different Versions of R.

#### How can I share a GenePattern module?

You can submit your module to the GenePattern Archive (GParc). You will need to register to use all the features of GParc. Check the resources for module developers for best practices in testing and documentation before submitting your module.

For an introduction to GParc, see the Submitting your module to GParc video tutorial.

#### Why can't I specify a 64-bit platform for a module?

GenePattern does not have a valid CPU Type for 64-bit platforms. So if you try to specify a 64-bit CPU Type, the module will fail on 64-bit platforms, whether or not they are running compatibility mode. You will have to set the CPU Type to 'any' and add more information on the appropriate platforms in your documentation. If this does not stop the module from failing on appropriate platforms, contact us at gp-help(at)broadinstitute.org.

#### Where can I find out more about how to launch GenePattern modules from other programming languages?

The reference guide for accessing GenePattern modules from Java, MATLAB, and R is the Programmers Guide.

#### Can I use the GenePattern APIs to create a web service that programmatically accesses the GenePattern server?

GenePattern is based on a web services API already so you may not need to create a new web service for this purpose. The WSDL for the GenePattern server is available at

http://your_server:your_port/gp/services

To get an easy start on creating a web services client to GenePattern:

1. Start GenePattern (http://localhost:8080/gp is usually the URL).
2. Create a one-step pipeline (see Creating Pipelines).
3. Select the pipeline that you just created. GenePattern displays the parameters (if any) for the pipeline.
4. At the bottom of that page, select your favorite programming language and click View Code. GenePattern generates the code required to run the pipeline.
6. Compile and run the generated code.

You can then modify the pipeline code to do what your application needs.

#### Why can't I call my pipeline/module from MATLAB?

A pipeline or module with a period in its name cannot be called from MATLAB.

#### What is a CSV file?

CSV stands for "comma-separated values". While CSV files will open in Excel or similar spreadsheet applications, it is important to remember that the values in these files are comma-delimited, not space- or tab-delimited.
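If you process such files in a script, it helps to state the delimiter explicitly rather than splitting on whitespace. A minimal sketch using Python's standard `csv` module follows; the function name `read_delimited` is hypothetical.

```python
import csv
import io


def read_delimited(text, delimiter=","):
    """Parse delimited text into a list of rows (each a list of strings)."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))
```

Calling `read_delimited(text)` parses comma-separated values, including quoted fields that themselves contain commas, while `read_delimited(text, delimiter="\t")` handles tab-delimited text.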

#### Can I process raw Illumina BeadChip data in GenePattern?

Several modules are available in updated form on the Broad GenePattern server for processing raw Illumina scan data into GCT files that are usable by GenePattern: IlluminaScanExtractor, IlluminaNormalizer, and IlluminaConcatenator. At this time, these modules support only the 6k Transcriptionally Informative Gene (TIG) panel (GEO accession: GPL5474), not other DASL gene panels. The IlluminaDASLPipeline is a workflow that chains together these 3 modules so that it is easy to process zipped Illumina scan data files produced by a DNA-mediated Annealing, Selection, extension and Ligation (DASL) assay.

IlluminaExpressionFileCreator extracts the mean value for each probe from a set of Illumina expression IDAT files and puts them into GCT format.

#### What versions of genomic databases is GeneCruiser currently using?

UniGene and SwissProt are at the current versions listed on their websites and are updated regularly. We are working on restoring regular updates for Entrez Gene. If you are interested in knowing the version of another of the databases accessed by GeneCruiser, please contact us.

#### Can you send me the source code for GISTIC?

We do not currently distribute the source code for GISTIC. The executable is available and can be found on the GISTIC page. You can also export the GISTIC module from the Broad's public GenePattern server. Note that the GISTIC module and executable are currently compiled only for 64-bit Linux.

The GISTIC developers are working on a version that will allow us to distribute the source code, but it is still currently in development.

#### Does GISTIC support Agilent data?

Yes, GISTIC will support Agilent data. However, you must convert your aCGH data into SEG (segmented) format. GenePattern does not currently provide a module for converting Agilent data to SEG format.

#### Why am I getting the "none of the gene sets passed the size thresholds" error in GSEA?

There are several points you need to check in your gene sets. Check that your gene identifiers are all uppercase if you are not using the collapse to gene symbols option. For the full list of points to check, please see the error 1001 FAQ for GSEA.

#### I keep getting file errors when I run a module. What are common reasons for file errors that I can check?

There are several things you can check in your files that commonly cause file errors:

• Do your files/directories have spaces in their names?
    • Remove them or replace them with _ (underscore) or periods.
• Are there characters such as parentheses or pound signs (#) in your file names?
    • Remove them or replace them with _ (underscore) or periods.
• What type of file is the module expecting? Is your file the correct type?
• Does the file have the correct extension for the file type the module is expecting?
    • Sometimes Excel or similar programs can add a ".txt" or other extension to the file name. Remove it (rename the file on your desktop and delete the .txt extension) and make sure the file name ends with the correct extension.
• Is your data delimited in the way the module expects it to be?
    • Check the module documentation to see if it expects tab-, comma-, or space-delimited data (or something else), then make sure your file is formatted appropriately.
• Did you edit your file in Excel or similar program?
    • Such applications can sometimes add extra spaces or tabs. Open your file in a text-editing application and look for these extra invisible characters that can cause errors.
• Do the contents of your file match the expected file format?
    • Check that your file contains all the expected columns and header information in the expected order for the given format. See File Formats.
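The file name checks in this list are easy to automate. The following is a minimal sketch with hypothetical helper names: `clean_file_name` replaces whitespace, parentheses, and pound signs with underscores, and `has_double_extension` spots names such as data.gct.txt where an extra extension hides the expected one. The set of problem characters is limited to those mentioned above and is not an exhaustive list.

```python
import re

# Whitespace, parentheses, and pound signs, as named in the checklist above.
_PROBLEM_CHARS = re.compile(r"[\s()#]")


def clean_file_name(name):
    """Replace each problem character in a file name with an underscore."""
    return _PROBLEM_CHARS.sub("_", name)


def has_double_extension(name, expected_ext):
    """True when the expected extension is present but not at the end."""
    lower = name.lower()
    expected = expected_ext.lower()
    return not lower.endswith(expected) and (expected + ".") in lower
```

For example, `has_double_extension("data.gct.txt", ".gct")` flags a file that a spreadsheet application saved with an added ".txt".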

#### Where are the dock icons for my GenePattern server on my Mountain Lion (OS X 10.8) machine?

When you install GenePattern 3.4.0 or earlier and select the option to install icons in the dock during install, the icons will not appear in the dock. They only appear there when the server is running. You can, however, manually place them there.

#### Why am I seeing a "Process existed with status code: 138" error when I try to run ConsensusClustering?

The ConsensusClustering module does not work with Java 1.6.0_33 on Macintosh.  As a workaround, you can run ConsensusClustering on the GenePattern public server, or on a server that is on a Windows machine or a Macintosh with a Java version other than 1.6.0_33.

#### Why can't I install a licensed module on my GenePattern server?

Licensed modules can only be installed on servers running GenePattern 3.5 or higher. Upgrade your GenePattern server and try again.

#### I got this error when running GISTIC_2.0: "All input data were removed after NaN processing". What does it mean?

GISTIC expects that the segments for a sample should cover almost all of its genome, even the regions where the copy number is normal. Any gaps in coverage for any sample are removed from the GISTIC analysis.

#### How do I find reference genomes to use in TopHat, Bowtie, or BWA?

The TopHat, Bowtie, and BWA GenePattern modules provide easy access to the reference genome index bundles for a number of species. If we aren't yet hosting the index for the species you need, you can email us at gp-help@broadinstitute.org and we will add your species to the available indexes, or you can find reference genome bundles for other species on the Illumina iGenomes website. Note that the GenePattern modules cannot use the iGenomes bundles directly as packaged there; you will need to unpack the bundle and repackage the pertinent files (for example, the Bowtie2 index files) as a ZIP archive. Remember that there are some special considerations for creating ZIP archives for use in GenePattern.

#### How do I find reference genome annotation files or whole genome files to use with the GenePattern RNA-seq tools?

Several of the modules also accept reference genome annotation files (GTF files) and/or whole genome FASTA files. A list of these is available from our FTP site in the following locations:

The modules can usually accept an FTP URL directly wherever a file input is allowed, so there is no need for you to download the reference file; instead, just copy and paste the file's FTP URL into the file input parameter.