




GenePattern provides access to a broad array of computational methods used to analyze genomic data. Its extendable architecture makes it easy for computational biologists to add analysis and visualization modules, which ensures that GenePattern users have access to new computational methods on a regular basis.
If you are new to GenePattern, begin with the basics:
This User Guide contains the following sections:
Describes how to start and exit from GenePattern. It provides an overview of the user interface, including the navigation bar and pop-up menus. |
|
Describes how to run GenePattern analyses and how to check job status. |
|
Describes how to display, save, and delete analysis results. |
|
Describes how to install, create, edit, and delete modules. |
|
Describes how to install, create, edit, and delete pipelines. |
|
Describes how to install, create, edit, and delete suites. |
|
Provides detailed instructions for installing and deleting modules, pipelines, and suites. The previous sections summarize these actions and provide links to the detailed instructions provided in this section. |
|
Provides information for the GenePattern server administrator. |
This section describes how to start and exit from GenePattern. It provides an overview of the user interface, including the navigation bar and pop-up menus.
Note: If you are using a local server, or if you are the administrator of a networked server, you must first start the server as described in Starting a GenePattern Server.
To start GenePattern:
Server |
URL |
Broad-hosted server |
|
Local server |
|
Networked server |
The URL for the networked server, for example: http://mycompany.com:8080/gp/ |
GenePattern prompts you to login:

Whether a GenePattern server requires passwords depends on how it is configured. The Broad-hosted server requires passwords. By default, a local server does not.
Note: If your web browser cannot connect to the server, it displays a message such as “Unable to connect” or “Cannot display the webpage.”
You must install the GenePattern server before you can start it. To install a GenePattern server, follow the instructions provided on the Download GenePattern page. The installation is the same whether you are installing a local GenePattern server for personal use or a networked server for use by an institution. The difference is in how you configure the server. If you are installing a local GenePattern server, use the default server settings. If you are installing a networked GenePattern server, consider reconfiguring it as discussed in Managing the GenePattern Server.
To start the GenePattern server, double-click the Start GenePattern Server icon (shown below). By default, installing GenePattern places this icon on your desktop.
![]()
Windows: While the server is starting, the cursor displays as an hourglass. The server is ready when the cursor returns to normal. On Windows 7, you must run this application as an administrator: to start the GenePattern server, right-click StartGenePatternServer.exe and select Run as administrator.
Mac OS X: While the server is starting, the server icon bounces in the Dock. The server is ready when the icon stops bouncing.
Linux: The server starts silently.
When the server has started, open the web interface to your GenePattern server by clicking the GenePatternHome.html shortcut icon. By default, installing GenePattern places this icon on your desktop. If you did not install icons in your task bar or on your desktop, GenePatternHome.html can be found at the top level of your GenePattern install directory (for example, C:\GenepatternServer\GenePatternHome.html or /Users/JDoe/Applications/GenePatternServer/GenePatternHome.html).
When first started, GenePattern displays the home page. To return to this page at any time, click the GenePattern icon in the title bar or the Modules & Pipelines item in the navigation bar.

Note: The URL of the web browser points to the GenePattern server that you are using. The modules, pipelines and suites displayed in the browser are those installed on the server. When you run a module or pipeline in GenePattern, it runs the analysis on the server and stores the analysis result files on the server.
The navigation bar provides access to GenePattern pages and operations. Click a link in the table to go to the section of this guide that describes that operation.
Modules & Pipelines |
Display the GenePattern home page. |
Create a pipeline. |
|
Create a module. |
|
Install a module or pipeline from the Broad repository. |
|
Install a module or pipeline from a zip file. |
|
Display installed modules or pipelines; delete modules or pipelines. |
|
Suites |
Display the Manage Suites page. |
Create a suite. |
|
Install a suite from the Broad repository. |
|
Install a suite from a zip file. |
|
Display installed suites; delete suites. |
|
Job Results |
Display the Results Summary page. |
Display jobs run on the server; delete jobs. |
|
Resources |
Display an overview of the resources. |
Mailing List |
Display the form you use to join a low-traffic GenePattern mailing list. |
Report Bugs |
Display the form you use to contact the GenePattern team to report bugs, provide feedback, or ask questions. |
User Forum |
Display information about the GenePattern user forum. |
Contact Us |
Display a form, which you can use to send questions and comments to the GenePattern team. |
Downloads |
Display an overview of the available downloads. |
Programming Libraries |
Download and install GenePattern libraries for use with Java, MATLAB, or R. |
Public Datasets |
Download sample datasets for use with GenePattern. |
Administration |
Display the Server Settings page. |
Modify settings that affect the GenePattern server. |
|
Help |
Display the GenePattern home page. |
Tutorial |
Display the Tutorial, which provides a hands-on tour of GenePattern. |
Concepts |
Display the Concepts Guide, which provides a brief introduction to GenePattern. Other GenePattern documentation assumes familiarity with these concepts. |
User Guide |
Display this guide, which describes how to use GenePattern. |
Programmer Guide |
Display the Programmer Guide, which provides guidelines for writing modules and instructions for accessing GenePattern from the Java, MATLAB, and R programming environments. |
Module Documentation |
Display a list of the modules and pipelines installed on your server, with brief descriptions and links to the module/pipeline documentation. |
File Formats |
Display the File Formats Guide, which describes all file formats and provides instructions for creating input files. |
Release Notes |
Display the Release Notes, which describes new features and known issues in this release. |
FAQ |
Display the GenePattern list of Frequently Asked Questions. |
About |
Display the release date and build number of the GenePattern server. |
When GenePattern displays an analysis job and its results, click the icon next to the job name to display a menu of commands for working with that job. For more information, see Working with Analysis Results.
Download |
Download a zip file containing all analysis result files for this job. |
Terminate |
Stop the job. This menu item appears only while the job is running. |
Reload |
Display the analysis and its parameters in the center pane, with the parameters set to the values used for this analysis job. |
Delete |
Delete the analysis job and its analysis result files from the GenePattern server. |
Info |
Display the parameter values and the analysis result files for this job. |
View Java Code |
Display the command line that you would use to run this job in the Java, MATLAB, or R programming environments. These commands are useful for programmers who want to access GenePattern from one of these programming environments or from their own applications. |
When GenePattern displays an analysis job and its results, click the icon next to the file name to display a menu of commands for working with that file. For more information, see Working with Analysis Results.
Delete |
Delete the file from the GenePattern server. |
Save |
Download the file from the GenePattern server. |
Create Pipeline |
Create a GenePattern pipeline that includes the modules and parameters necessary to reproduce this result file. |
List of modules |
List modules that commonly use this type of file as an input parameter. Select an analysis to display its parameters in the center pane, with this result file specified as the first input parameter. |
When GenePattern displays a suite, click the icon next to the suite name to display a menu of commands for working with that suite. For more information, see Working with Suites.
Edit |
Available only for suites that you have created. Display the Edit GenePattern Suite page, which you can use to modify your suite. |
Delete |
Delete the suite from the GenePattern server. |
Export excluding dependents |
Create a zip file that contains the definition of the suite, but not the modules or pipelines in the suite. The zip file can be used to install the suite on another GenePattern server (see Exporting and Installing Suites Using Zip Files). Installing the suite from this zip file will not install any modules or pipelines in the suite; they must already be installed on the GenePattern server or be installed separately. |
Export including dependents |
Create a zip file that contains the definition of the suite, as well as the modules and/or pipelines in the suite. The zip file can be used to install the suite on another GenePattern server (see Exporting and Installing Suites Using Zip Files). Installing the suite from this zip file will also install the modules and pipelines in the suite (unless they are already installed on the GenePattern server). |
Use My Settings to change your GenePattern account information:

To exit from GenePattern, click Sign Out in the top right corner of the title bar.
To shut down a GenePattern server, double-click the Stop GenePattern Server icon (shown below) or close the console window.
![]()
Windows: When you shut down the server, the GenePattern console window closes.
Mac OS X: When you shut down the server, the GenePattern server icon disappears from the Dock.
Linux: The GenePattern server exits silently.
The GenePattern web site provides an overview of GenePattern and its analysis modules, as well as links to the GenePattern software and documentation. The documentation is your primary source for help with GenePattern:
An analysis module runs a single analysis. A pipeline runs a series of analysis modules. If you are unfamiliar with GenePattern modules and pipelines, see the Concepts Guide.
To run a module or pipeline:
When you select a module or pipeline, GenePattern displays its parameters:

Most modules require one or more input files. There are several ways to choose an input file:
|
|
Specify other parameter values using the drop-down lists and entry fields:
|
|
| Hide/show the brief descriptions below each parameter. | |
| Version of the module. If multiple versions of the module are installed on the server, GenePattern displays the latest version by default. Click the |
|
|
|
| Display the code (Java, MATLAB, or R) used to run the module with the parameters that you have entered. This can be useful for programmers writing batch procedures or new modules. |
To rerun an analysis:
When you run a module or pipeline, GenePattern runs the analysis job on the GenePattern server. Analysis results are stored on the GenePattern server for a period of time (by default, one week) and then deleted. If you are unfamiliar with how GenePattern runs modules and pipelines, see the Concepts Guide.
The following table summarizes ways to work with analysis results:
Display analysis results |
Click a job ID number to display the Job Status page, which lists the input parameters and analysis results for that analysis job. (Recent jobs are listed on the GenePattern home page. To display all jobs, click Job Results>Results Summary.) |
By default, analysis results are private. To share results with other GenePattern users, click the Edit Sharing icon on the Job Status page. |
|
Save analysis results |
To save results persistently (beyond the period of time they are stored on the server), download the analysis result files to a more permanent location:
|
Delete analysis results |
If you no longer need your analysis results, you can delete the files from the server:
|
When you run a module or pipeline, the files generated by the module/pipeline are stored on the GenePattern server. The module author determines the content and format of the generated files; however, by convention, each module generates the following files:
When you run a module or pipeline, GenePattern sends the analysis job to the server and displays the Job Status page. This page displays complete information for an analysis job, including its status, input files, parameter values, and (when the job completes) result files. After starting an analysis, you can continue working. You do not have to leave the Job Status page displayed.
GenePattern offers several ways to redisplay a Job Status page:
The following figure shows the Job Status page for a pipeline job. The Job Status page for a module job is similar.

| Click the |
|
| Icons indicate whether this is an |
|
| Click the |
|
| Show/hide execution log files. | |
| Click the An |
|
| Click the |
|
| Icons indicate whether the job is |
|
| For a pipeline, each section of the colored line beneath the job name represents a step in the pipeline. As each step completes, its section of the line changes from green to blue. |
An Email Reminder check box is visible while the job is running. For long-running jobs, select the check box to have GenePattern send you an email when the job completes. You can continue working in GenePattern or exit from GenePattern. When you receive the email indicating that the job is finished, display the Job Status page to review the analysis results.
The GenePattern home page lists your most recent jobs. The Job Results Summary page lists all of your analysis jobs.
To display the Job Results Summary page, click Job Results>Results Summary.

To sort the job results, click a column header. You can sort jobs by status, job ID, module name, submit date, or completion date. Within jobs, you can sort files by file size or file output date.
![]() |
Filter the display:
|
![]() |
Show/hide the execution log files. |
|
Icons indicate whether the job is |
|
Click the job ID to display the Job Status page. |
![]() |
Delete jobs and/or files: (1) select the check boxes of the jobs and/or files to delete and (2) click the delete link in the column header to delete them. Selecting a job selects all of its files. Selecting the check box in the column header selects/clears all check boxes. |
|
Name of the module that was run and the name of each result file. Click the arrow next to the Module Name header to hide/show all result files. Click the arrow next to a module name to hide/show its result files. |
|
Size of each file and the total job size (combined file size). |
|
Time the job was submitted. |
|
Time the job was completed and time each file was last saved. |
|
GenePattern user name of the account that ran the job. |
|
Your access to the job. You have read, write access to jobs that you have run. You have either read or read, write access to shared jobs. Write access gives you permission to delete a job or any of its result files. |
|
Share status: |
When you run an analysis job, by default, it is private: only you (and GenePattern administrators) can view or delete the job. Sharing job results gives other GenePattern users access to the job, including its input files, parameter values, and result files.
To share job results or modify the share status of a job:

Sharing input files: In GenePattern, you can specify the output file from one analysis as the input file for a subsequent analysis. For example, you might use the output file from PreprocessDataset as the input file for ComparativeMarkerSelection. In this case, if you share the ComparativeMarkerSelection job, the other user can view the result files but cannot view the input file (which is from the PreprocessDataset job) or rerun the job. To share the ComparativeMarkerSelection job and its input file, either (1) share both the ComparativeMarkerSelection and PreprocessDataset jobs or (2) save the output file from PreprocessDataset, rerun ComparativeMarkerSelection using the saved file, and share the resulting ComparativeMarkerSelection job.
Creating groups: To create a group or add members to a group, contact the GenePattern administrator. If you are an administrator, see Creating Groups and Administrators for more information.
Analysis and visualization modules are at the heart of GenePattern. Analysis modules provide computational methods and tools for gene expression analysis, proteomics data analysis, SNP analysis, and data preprocessing and conversion. Visualization modules display your data and analysis results graphically. If you are unfamiliar with GenePattern modules and pipelines, see the Concepts Guide.
The following table summarizes the different ways you can work with GenePattern modules.
Run a module |
Select a module, enter its parameters and click Run. For more information, see Running Modules and Pipelines. |
A module’s definition includes the author, the command line used to invoke the module, and the programs used to execute the module. To display a module’s definition, click Modules & Pipelines and select the module. When GenePattern displays the module parameters, click Properties. |
|
Send module to other users |
Zip files provide a convenient way to send modules to other GenePattern users:
For more information, see Exporting and Installing Modules & Pipelines Using Zip Files. |
Install modules from the repository |
The Broad Institute maintains a repository of modules, pipelines, and suites. To install modules from the Broad repository, click Modules & Pipelines>Install from Repository. For more information, see Installing Modules & Pipelines from the Repository. |
Create modules |
An analysis module invokes a program that executes the desired function. To create a module, you must write the program that implements the analysis and then create the GenePattern module that invokes that program. For more information, see Creating Modules. |
Edit modules |
You can edit a module that you have created or copy a public module and edit your copy of the public module. For more information, see Editing Modules. |
Delete modules |
To delete a module from your GenePattern server, click Modules & Pipelines>Manage. For more information, see Managing Modules & Pipelines. |
To display the definition of a module:

The module definition is a read-only version of the page used to create and edit the module. The field descriptions below are presented in two parts: a brief description of the field, which is generally sufficient if you are viewing the module definition, and additional details, which are necessary if you are creating or editing the module definition.
The name of the module is used to identify the module in the user interface of the GenePattern clients. The name should be short but descriptive, without spaces or punctuation, and may include both upper- and lower-case characters.
Example: ConsensusClustering
You cannot create or edit LSIDs. The GenePattern server automatically assigns an LSID to each version of a module. If you are unfamiliar with GenePattern versioning, see the Concepts Guide.
GenePattern displays the description, sometimes in abridged form, in forms and drop-down lists in the clients and in generated code when creating scripts from pipelines. The description should be a sentence or short paragraph that documents succinctly what your module does and why someone would want to use it.
Example from ConsensusClustering: Resampling-based clustering method
GenePattern displays the author name as part of the module definition. This is a comment-only field. If you make this module public, the author field allows other users to credit the author when citing the module and to contact the author with questions, suggestions, or enhancement ideas.
Example from ConsensusClustering: Stefano Monti, Broad Institute
When you create a module, by default, it is marked private. When you are ready for others to use your module, change the privacy to public.
When you create a module, by default, its quality level is development. Although these terms have no strict definitions, they are useful for setting user expectations. If you make this module public, set the quality level appropriately.
When you create/update a module, the command line is critical and must be platform-independent. You define the command line as a combination of fixed and dynamic text. GenePattern resolves the dynamic text to build the command line that executes the module. For more information, see Defining the Module Command Line.
When you create/update a module, you can choose an existing category name or create a new category name. If your module fits into an existing category, such as Preprocess & Utilities, select that category from the drop-down list; otherwise, click the New button to add a new category. GenePattern creates the drop-down list of categories dynamically based on the categories of the modules installed on your GenePattern server. If you delete the last module in a given category, that category is removed from the drop-down list.
When you create/update a module, if your code is compiled for a specific platform (Intel, Alpha, PowerPC, etc.), select that platform from the drop-down list. GenePattern enforces CPU requirements when it runs the module.
When you create/update a module, if your code requires a specific operating system (Windows, Linux, MacOS, etc.), select that operating system from the drop-down list. GenePattern enforces operating system requirements when it runs the module.
GenePattern does not enforce programming language requirements. However, including the language version information in the module definition gives prospective users a hint concerning system requirements.
When you update a module, briefly describe the changes that you have made. When GenePattern clients display a drop-down list of versions, the comments for each version are visible in the drop-down list.
When you create/update a module, use this field to select all file formats used by the module. To select multiple file formats from the list, use ctrl-click or shift-click. If the module uses a file format not included in the list, click the New button to add that file format.
When you create/update a module, you must specify all files used in your module, including scripts, libraries, property files, DLLs, executable programs, documentation, and so on. For more information, see Adding Module Support Files.
When you create/update a module, you must define each parameter in the module command line. For more information, see Defining the Module Parameters.

Defining the command line that launches your module is a critical piece of defining your GenePattern module. The text of the command line must be variable to address different parameter values, CPU platforms, and operating systems. To ensure that your module runs under different conditions, the command line that you enter will be a combination of fixed and variable text. You specify the variable text in the form of substitution variables enclosed in angle brackets. GenePattern replaces the substitution variables with their assigned values before invoking the command.
Use substitution variables to reference:
The following table lists the substitution variables for the most commonly used server configuration properties:
<java> | path to Java; usually the one running the GenePattern server |
<perl> | path to Perl, installed with GenePattern server on Windows, otherwise the one already installed on your system |
<R> | path to a program that runs R and takes as input a script of R commands. R is installed with the GenePattern server on Windows and MacOS |
<java_flags> | Java virtual machine configuration parameters (such as VM memory settings) from the Server Settings page (Java Flag settings) |
<libdir> | directory where the module's support files are stored |
<job_id> | job number |
<name> | name of the module being run |
<filename_basename> | for each input file parameter, the filename without the file extension or directory. Note: In the property name, filename is the name of the input file parameter. For example, if you have an input file parameter named input.filename, the substitution property name is <input.filename_basename>. The next two properties are similar. |
<filename_extension> | for each input file parameter, the extension without the filename or directory |
<filename_file> | for each input file parameter, the input filename without the directory |
<path.separator> | Java classpath delimiters (: or ;), useful for specifying a classpath for Java-based modules |
<file.separator> | / or \ for directory delimiter |
<line.separator> | newline, carriage return, or both for line endings |
<user.dir> | current directory where the job is executing |
<user.home> | user's home directory |
<parameter_name> | value of the named parameter; for example, if you have a parameter named arg1, use the substitution property <arg1> to include the value of that parameter on the command line |
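As a rough illustration of the three file-derived properties in the table above, the following Python sketch derives them from an input path. It is not GenePattern's implementation: the function name `file_properties` and the example path are hypothetical, and the handling of the leading dot and of file names containing several dots is an assumption.

```python
from pathlib import PurePosixPath


def file_properties(param_name, path):
    """Return the three file-derived substitution properties for one input file parameter.

    Assumptions (not confirmed by this guide): the extension is reported
    without its leading dot, and only the last dot-separated suffix counts
    as the extension.
    """
    p = PurePosixPath(path)
    return {
        f"{param_name}_file": p.name,                     # filename without the directory
        f"{param_name}_basename": p.stem,                 # filename without directory or extension
        f"{param_name}_extension": p.suffix.lstrip("."),  # extension only
    }


# Hypothetical input file for a parameter named input.filename.
props = file_properties("input.filename", "/home/jdoe/data/sample.gct")
for key, value in props.items():
    print(f"<{key}> -> {value}")
```

For the hypothetical path used here, `<input.filename_file>` would be `sample.gct`, `<input.filename_basename>` would be `sample`, and `<input.filename_extension>` would be `gct`.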
The following example uses substitution variables for two configuration properties, <java> and <libdir>, and one module parameter, <arg1>:
<java> -cp <libdir>mymodule.jar com.foo.MyModule <arg1>
To execute the module, GenePattern locates the Java runtime and asks to execute the MyModule class using code from the module support file mymodule.jar. The value of the arg1 parameter is passed as an argument to MyModule.
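The substitution step can be pictured with a minimal Python sketch. It only models the idea of replacing each angle-bracketed variable with its assigned value; it is not GenePattern's code. The function name `resolve` and the paths in `values` are hypothetical, and the sketch assumes that the value of `<libdir>` ends with a directory separator, as the `<libdir>mymodule.jar` usage suggests.

```python
import re

# Matches a substitution variable such as <java>, <libdir>, or <input.filename_basename>.
VARIABLE = re.compile(r"<([A-Za-z_][A-Za-z0-9_.]*)>")


def resolve(template, values):
    """Replace each <name> in template with values[name].

    Raises KeyError for a variable that has no assigned value, so a
    misspelled variable is reported rather than silently left in place.
    """
    return VARIABLE.sub(lambda match: str(values[match.group(1)]), template)


# Hypothetical values; a real server supplies its own.
values = {
    "java": "/usr/bin/java",
    "libdir": "/opt/gp/modules/MyModule/",
    "arg1": "input.gct",
}
command = resolve("<java> -cp <libdir>mymodule.jar com.foo.MyModule <arg1>", values)
print(command)
```

Raising an error on an unknown variable is a design choice of this sketch; the guide does not say how GenePattern treats a variable it cannot resolve.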
Note: if your module is designed to accept a standard input stream and/or write to a standard output stream, you can use redirection syntax when describing the command line. To redirect a file to the input stream, enter the text \< followed by the input file parameter. To redirect the standard output or standard error streams to a named file, enter the text \> or \\>& followed by the name of the output file. In the following example, the LogTransform module reads its input from the standard input stream and writes its output to the standard output stream:
<perl> <libdir>log_transform.pl \< <input.filename> \> <output.file>
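In effect, redirection connects the program's standard input to the input file and its standard output to the output file. The Python sketch below shows that effect with the standard `subprocess` module. It is an illustration only, not how GenePattern launches jobs: `run_redirected` is a hypothetical helper, and a small inline Python command stands in for the module program.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_redirected(argv, input_path, output_path):
    """Run argv with standard input read from input_path and standard output written to output_path."""
    with open(input_path, "rb") as stdin_file, open(output_path, "wb") as stdout_file:
        subprocess.run(argv, stdin=stdin_file, stdout=stdout_file, check=True)


# Stand-in for a module program: copies standard input to standard output in upper case.
child = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read().upper())"]

with tempfile.TemporaryDirectory() as workdir:
    source = Path(workdir) / "input.txt"
    target = Path(workdir) / "output.txt"
    source.write_text("gene_a\t1.5\n")
    run_redirected(child, source, target)
    result = target.read_text()

print(result, end="")
```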
When you create/update a module, you must define each module parameter. GenePattern uses the definition that you supply to prompt users for input when they run your module. Use the Parameters section of the Create/Update module form to enter each parameter. Every parameter entered in this section must appear in the command line, unless you mark the parameter as optional. The following example shows the Parameters section for the Create/Update module form for ExtractComparativeMarkerResults:

Following are descriptions of each field. The first line of the Parameters section provides an example parameter.
Choices | Associated command-line component |
Hierarchical clustering | hierarchical |
Self-organizing map | SOM |
Non-negative Matrix Factorization | NMF |
nothing specified |
|
pi |
3.14159265 |
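The rule stated at the start of this section, that every parameter must appear in the command line unless it is marked optional, can be checked with a short Python sketch. This is a hypothetical helper for module authors, not a GenePattern feature; the parameter names `threshold` and `label` are invented for the example.

```python
import re


def missing_parameters(command_line, parameters):
    """Return the names of required parameters that never appear as <name> in command_line.

    parameters maps each parameter name to True if the parameter is optional.
    """
    used = set(re.findall(r"<([^<>\s]+)>", command_line))
    return [name for name, optional in parameters.items()
            if not optional and name not in used]


command_line = "<perl> <libdir>log_transform.pl -F <input.filename> -o <output.file>"

# Both required parameters are referenced, so nothing is missing.
complete = missing_parameters(command_line, {"input.filename": False, "output.file": False})

# "threshold" is required but absent; "label" is optional, so its absence is allowed.
incomplete = missing_parameters(
    command_line, {"input.filename": False, "threshold": False, "label": True}
)
print(complete, incomplete)
```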
When you create/update a module, you must specify all files used in the module, including scripts, libraries, property files, DLLs, executable programs, documentation, and so on. All files are copied to the GenePattern server and may be referenced in the command line field using the syntax: <libdir>filename. You can specify as many files as needed, provided you have the space available on your GenePattern server.
To specify a file, use the Support files field:
To remove files that you have already added, use the Current files field:
Public modules should always include documentation that provides instructions for using the module, describes each input parameter and each output file (both its format and content) in detail, and either explains the algorithm or references the paper, journal, or book that explains it.
The documentation that you provide with your module is automatically available to GenePattern users. As a GenePattern user, when you select a module, GenePattern displays a form that includes the module parameters and a Help button. When you click the Help button, GenePattern examines the list of support files for the module and displays the first file that has a standard documentation file extension. If no documentation file was provided, GenePattern displays a message indicating that no information is available. (By default, the standard documentation file extensions are html, htm, xhtml, pdf, rtf, and txt. You can modify this list of extensions by editing the files.doc property in the GenePattern /resources/genepattern.properties file.)
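The lookup described above can be sketched in Python as follows. This is an approximation under stated assumptions, not the server's code: `first_doc_file` is a hypothetical name, the support files are assumed to be examined in list order, and matching extensions without regard to case is also an assumption.

```python
# Default documentation extensions listed in this guide.
DOC_EXTENSIONS = ("html", "htm", "xhtml", "pdf", "rtf", "txt")


def first_doc_file(support_files, extensions=DOC_EXTENSIONS):
    """Return the first support file that has a documentation extension, or None."""
    allowed = {ext.lower() for ext in extensions}
    for name in support_files:
        _, dot, ext = name.rpartition(".")
        # Files without a dot have no extension and are skipped.
        if dot and ext.lower() in allowed:
            return name
    return None


# Hypothetical support files for a module.
found = first_doc_file(["log_transform.pl", "log_transform.pdf", "notes.txt"])
not_found = first_doc_file(["log_transform.pl", "helper.jar"])
print(found, not_found)
```

When the function returns `None`, that corresponds to the case in which GenePattern reports that no information is available.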
Creating a GenePattern module is a two-step process:
To create a module that invokes the program that you have written (or otherwise obtained):

Malicious code: By adding a module, a user can execute arbitrary code on the GenePattern server. Because arbitrary code may include malicious code, take precautions to protect your server: for example, employ virus scanner software and restrict access to appropriately privileged (non-root) users. For more information about securing your server, see Securing the Server.
Following is a brief tutorial that creates a module named log_transform. The program invoked by this module is a Perl script, gp_tutorial_files/log_transform/log_transform.pl, which log-transforms all positive values in a data set and sets all negative or zero values to zero. This Perl script is part of the GenePattern tutorial data set, which is downloaded during a full installation of GenePattern and is also available on the GenePattern web site.
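For readers who want to see the transformation itself, here is a minimal Python sketch of the behavior described above. It is not a port of log_transform.pl: the function operates on a plain list of numbers rather than a data set file, and the logarithm base is an assumption, because this guide does not state which base the script uses.

```python
import math


def log_transform(values, base=math.e):
    """Log-transform positive values; map zero and negative values to 0.0.

    The base is an assumption: this guide does not say which logarithm
    log_transform.pl applies.
    """
    return [math.log(v, base) if v > 0 else 0.0 for v in values]


transformed = log_transform([8.0, 1.0, 0.0, -3.5], base=2)
print(transformed)
```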
To create the log_transform module:
<perl> <libdir>log_transform.pl -F <input.filename> -o <output.file>
Typically, you enter the command line as a combination of fixed text and variables defined by GenePattern. This allows the command line to be independent of the operating environment and allows different values to be specified at different invocations of the command. This command line uses the following variables:
In the first row of the Parameters field, enter the following information for input.filename:
In the second row, enter the following information for output.file:


To edit a module:
A GenePattern pipeline defines a sequential series of modules to be run. Modules run from a pipeline work exactly the same as those run directly from GenePattern. If you are unfamiliar with GenePattern pipelines, see the Concepts Guide.
The following table summarizes the different ways you can work with GenePattern pipelines.
Run a pipeline |
Select a pipeline, enter its parameters and click Run. For more information, see Running Modules and Pipelines. |
A pipeline’s definition lists the pipeline’s author, the modules to be run and their parameters. To display the pipeline definition, click Modules & Pipelines and select the pipeline. When GenePattern displays the pipeline parameters, click Properties. |
|
Send pipelines to other users |
Zip files provide a convenient way to send pipelines to other GenePattern users.
For more information, see Exporting and Installing Modules & Pipelines Using Zip Files. |
Install pipelines from the repository |
The Broad Institute maintains a repository of modules, pipelines, and suites. To install pipelines from the Broad repository, click Modules & Pipelines>Install from Repository. For more information, see Installing Modules & Pipelines from the Repository. |
Create pipelines |
You can create an empty pipeline and add modules to it, or you can start with an analysis result file and have GenePattern create a pipeline that recreates that analysis result file. For more information, see Creating Pipelines. |
Edit pipelines |
You can edit a pipeline that you have created or clone a public pipeline and edit your copy of the public pipeline. For more information, see Editing Pipelines. |
Delete pipelines |
To delete a pipeline, click Modules & Pipelines>Manage. For more information, see Managing Modules & Pipelines. |
To display the definition of a pipeline:
On this page, you can:
You can create a pipeline in several ways: from an analysis result file, from an existing pipeline, or from scratch (beginning with an empty pipeline).
To create a pipeline from an analysis result file:
To create a new copy of an existing pipeline:
To create a pipeline from scratch:
To edit a pipeline:
GenePattern displays the pipeline designer form when you create or edit a pipeline. Unless you are creating a pipeline from scratch, the pipeline designer form is already populated:

When you create a pipeline from scratch, the form is initially empty:

The remaining topics in this section describe how to use the pipeline designer form to edit the pipeline definition:
The fields at the top of the pipeline definition form describe the pipeline:

The documentation that you provide with your pipeline is automatically available to GenePattern users. As a GenePattern user, when you select a pipeline, GenePattern displays a form that includes the pipeline parameters and a Help button. When you click the Help button, GenePattern examines the list of support files for the module and displays the first file that has a standard documentation file extension. If no documentation file was provided, GenePattern displays a message indicating that no information is available. (By default, the standard documentation file extensions are html, htm, xhtml, pdf, rtf, and txt. You can modify this list of extensions by editing the files.doc property in the GenePattern /resources/genepattern.properties file.)
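The Help lookup described above can be sketched in a few lines of Python. This is an illustration only, not GenePattern's implementation: the function name `first_doc_file` is hypothetical, the extension list is the default quoted above, and case-insensitive matching is an assumption.

```python
from typing import Optional, Sequence

# Default documentation extensions listed in this guide (the files.doc property).
DEFAULT_DOC_EXTENSIONS = ("html", "htm", "xhtml", "pdf", "rtf", "txt")


def first_doc_file(
    support_files: Sequence[str],
    doc_extensions: Sequence[str] = DEFAULT_DOC_EXTENSIONS,
) -> Optional[str]:
    """Return the first support file with a documentation extension, or None."""
    # Assumption: extensions are compared case-insensitively.
    allowed = {ext.lower().lstrip(".") for ext in doc_extensions}
    for name in support_files:
        # Treat the text after the last dot as the file extension.
        _, dot, ext = name.rpartition(".")
        if dot and ext.lower() in allowed:
            return name
    # No documentation file was provided.
    return None
```

For example, given the support files `run.pl`, `README.txt`, and `guide.pdf` in that order, the sketch returns `README.txt` because it is the first file with a listed extension.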
To add a module:



The pipeline definition form has no mechanism for adding a module before an existing module; therefore, to add a module at the beginning of the pipeline: add the new_first module below the old_first module, delete the old_first module, and (if necessary) recreate the old_first module below the new_first module.
For most parameters, you enter a value, select a value from a drop-down list, or use the default value supplied by GenePattern.
For input file parameters, you can select a file or use an output file from a previous module:
The drop-down list of output files lists ordinal numbers (1st, 2nd, 3rd, 4th), which allow you to select the output file based on the order in which it was generated, and may also list data types (for example, gct or cls), which allow you to select the first output file of the selected type. Note that an output file of type dataset indicates an odf file of type dataset, not a gct or res file.
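The two selection modes can be illustrated with a short Python sketch. The function `select_output` is hypothetical, and matching a type by file extension is a simplifying assumption (it does not model the odf dataset case noted above).

```python
from typing import List, Optional


def select_output(
    outputs: List[str],
    ordinal: Optional[int] = None,
    file_type: Optional[str] = None,
) -> Optional[str]:
    """Pick a previous module's output by position (1-based) or by type."""
    if ordinal is not None:
        # "1st", "2nd", ...: position in the order the files were generated.
        return outputs[ordinal - 1] if 1 <= ordinal <= len(outputs) else None
    if file_type is not None:
        # A type such as "gct" or "cls": the first output with that extension.
        suffix = "." + file_type.lower()
        for name in outputs:
            if name.lower().endswith(suffix):
                return name
    return None
```

With outputs generated in the order `a.cls`, `b.gct`, `c.gct`, selecting the 2nd file and selecting type gct both yield `b.gct`.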
Prompt for value: Rather than specifying a parameter, you can have GenePattern prompt the user for a value when the pipeline is run:

To remove a module, click the delete button for that module definition:

The pipeline definition form has no mechanism for reordering modules. To move a module, you must delete it and recreate it in its new position.
When you add a module to a pipeline, you specify the parameter values for that module. Optionally, you can have GenePattern prompt the user for one or more parameter values when the pipeline is run. As described in Adding Modules, you select the Prompt when run box to the left of a parameter and, optionally, click set prompt when run display settings to modify the description of the parameter. GenePattern prompts the user for these parameter values each time the pipeline is run.
Occasionally, a pipeline requires that the same input file be specified for multiple parameters. For example, consider the following scenario:
Ideally, you want to prompt the user for an expression dataset file and use that file for both the input filename (ComparativeMarkerSelection) and dataset filename (ExtractComparativeMarkerSelection) parameters. To do that, add the ConvertLineEndings module to the start of your pipeline:
Use suites to group modules and pipelines into packages that have related functionality; for example, you might create a suite that contains the modules that you most commonly use to analyze new data files. Suites help you to organize and work with modules and pipelines. If you are unfamiliar with GenePattern suites, see the Concepts Guide.
The following table summarizes the different ways you can work with GenePattern suites.
To display the suite definition:
|
|
Send suites to other users |
Zip files provide a convenient way to send suites to other GenePattern users.
For more information, see Exporting and Installing Suites Using Zip Files. |
Install suites from the repository |
The Broad Institute maintains a repository of modules, pipelines, and suites. To install suites from the Broad repository, click Suites>Install from Repository. For more information, see Installing Suites from the Repository. |
Create suites |
To create a suite, click Suites>New. For more information, see Creating Suites. |
Edit suites |
You can edit a suite that you have created or copy a public suite and edit your copy of the public suite. For more information, see Editing Suites. |
Delete suites |
To delete a suite from your GenePattern server, click Suites>Manage. For more information, see Managing Suites. |
To display the definition of a suite:

From this page, you can:
To create a suite:

To view or edit a suite:
An analysis module runs a single analysis. A pipeline runs a series of analysis modules. Suites group modules and pipelines into packages that have related functionality, which helps you to organize and work with modules and pipelines. If you are unfamiliar with GenePattern modules, pipelines and suites, see the Concepts Guide.
The Broad Institute maintains a repository of modules and pipelines that are freely available to the public. To install these modules and pipelines on your GenePattern server:

Use the top section of the form to find the modules to install. To update the list of modules/pipelines, select the modules/pipelines to search for and click Search:
For each module and pipeline, GenePattern displays similar information:
Zip files provide a convenient means of sending your modules and pipelines to other GenePattern users. You can export a module or pipeline to a zip file. The zip file can then be used to install the module or pipeline on another GenePattern server.
To export a module or pipeline to a zip file:
To install a module or pipeline from a zip file:

Click Modules & Pipelines>Manage to display the Manage Modules & Pipelines page. From this page, you can:

The Broad Institute maintains a repository of suites that are freely available to the public. To install these suites on your GenePattern server:

Use the top section of the form to find the suites to install. To update the list of suites, select the suites to search for and click Search:
For each suite, GenePattern displays similar information:
Zip files provide a convenient means of sending your suites to other GenePattern users. You can export a suite to a zip file. The zip file can then be used to install the suite on another GenePattern server.
To export a suite to a zip file:
To install a suite from a zip file:

Click Suites>Manage to display the Manage Suites page. From this page, you can:

You can use the GenePattern server hosted at the Broad Institute, install a local GenePattern server for your own use, or install a networked GenePattern server to be used by several people. The Concepts Guide explains the benefits of each approach.
If you are using the Broad-hosted GenePattern server at http://genepattern.broadinstitute.org/gp/, you do not need to manage the server; the GenePattern team does it for you. If you are installing a local GenePattern server, you will most likely use the default server settings; however, you are the GenePattern server administrator for your server and have full access to configuration options described in this section. If you are installing a networked GenePattern server for use by several users, read this section carefully. You are the GenePattern server administrator and will want to configure the server appropriately for your site.
GenePattern can be run standalone on a small machine or separated into its client and server components to take advantage of a more powerful compute server. When you install a GenePattern server, you set basic server configuration options. If you are installing a local GenePattern server for your own use, you generally do not need to modify the server configuration. If you are the server administrator for a networked GenePattern server, you generally want to modify several of the GenePattern configuration options described in this section:
The GenePattern configuration file GenePatternServer/resources/userGroups.xml defines groups and group membership. The Users and Groups server settings page lists all registered users and the groups to which they belong.
The GenePattern installation defines one group, administrators, which includes all GenePattern users:
<!-- map of users to groups -->
<userGroups>
    <group name="administrators">
        <user name="*"/>
    </group>
</userGroups>
Members of the administrators group have full access to the GenePattern server and all jobs run on the server. Therefore, when all users are administrators, GenePattern has no concept of “private” data. Initially, all users are administrators.
To maximize data privacy, minimize the number of users in the administrators group. For example, add exactly one person to the administrators group and only that one administrator can view all jobs run on the server. Other users can view their own jobs and jobs that have been explicitly shared.
To create groups or change group membership, edit the userGroups.xml file. The XML syntax is simple but must be followed carefully. The rules are as follows:
The following edited userGroups.xml file adds exactly two users to the administrators group and creates a new group, mjones_lab:
<!-- map of users to groups -->
<userGroups>
    <group name="administrators">
        <user name="jsmith"/>
        <user name="mjones"/>
    </group>
    <group name="mjones_lab">
        <user name="mjones"/>
        <user name="jdoe"/>
        <user name="sfederan"/>
    </group>
</userGroups>
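To check an edited file before restarting the server, a small script can parse it and report each user's groups. The following Python sketch is not part of GenePattern; it assumes only the element and attribute names shown in the example above and treats the name "*" as matching every user, as in the default file. The function name `groups_for_user` is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Set

# Same content as the edited userGroups.xml example above.
USER_GROUPS_XML = """
<userGroups>
  <group name="administrators">
    <user name="jsmith"/>
    <user name="mjones"/>
  </group>
  <group name="mjones_lab">
    <user name="mjones"/>
    <user name="jdoe"/>
    <user name="sfederan"/>
  </group>
</userGroups>
"""


def groups_for_user(xml_text: str, username: str) -> Set[str]:
    """Return the names of all groups that list the user (or the "*" wildcard)."""
    root = ET.fromstring(xml_text)  # raises ParseError if the XML is malformed
    groups = set()
    for group in root.findall("group"):
        members = {user.get("name") for user in group.findall("user")}
        if username in members or "*" in members:
            groups.add(group.get("name"))
    return groups
```

Calling `groups_for_user(USER_GROUPS_XML, "mjones")` reports both administrators and mjones_lab, while an unlisted user belongs to no group. A syntax error in the file surfaces as a parse error rather than as a silent misconfiguration.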
Renaming a group does not update shared analysis results. Members of a group can share analysis results. If you rename a group, for example from old_name to new_name, the users in the old_name group are now in the new_name group; however, the analysis results that they shared are still shared with the old_name group. Each user who shared job results with the old_name group should edit the share options for the job and share the job results with the new_name group.
To modify the configuration of your GenePattern server, use the Server Settings page:
The following table summarizes the server settings. For more detail, click a link in the table.
Specify which clients have access to the server. |
|
Specify software source directories and other low-level configuration options. |
|
Specify commands and qualifiers to be prepended to the command line used to invoke a module or pipeline. |
|
Create new server configuration options. |
|
Specify configuration options for the GenePattern database. |
|
Specify how long files remain on the server before being deleted. |
|
Display the log file for the GenePattern server. |
|
Specify the root directories for the programming languages used by GenePattern and the Java flags to be added to Java command lines executed by the server. |
|
If your organization has a web proxy between the GenePattern server and the internet, specify the proxy information required to access the internet. |
|
Specify the URL used to access the module repository and the suite repository. |
|
Shut down the GenePattern server. |
|
Broadcast a message to all users logged into the GenePattern server. |
|
Display account information for all users, including the groups to which they belong. |
|
Display the log file for the web server used by the GenePattern server. |
Use the Access page to define which GenePattern clients have access to the GenePattern server. The localhost (127.0.0.1) computer cannot be denied access to the locally installed GenePattern server. This prevents you from inadvertently denying yourself access to the server.
Using the Access page to control which computers have access to the GenePattern server is the simplest way to secure your server. You can also control access to your server based on user authentication and user permissions, as described in Securing the Server. The Access page filters are applied before any user-specific authentication or permissions are checked. If your computer cannot access the server, you cannot access the server regardless of your username/password or permissions.

Click Save to save your changes. Click Restore to return to the value set at installation.
The Advanced page contains directory specifications for the GenePattern source files and other low-level configuration options. You rarely need to modify these options.

Click Save to save your changes. Click Restore to return to the values set at installation.
Use the Command Line Prefix page to define commands and qualifiers to be prepended to the command line used to invoke a module or pipeline. For example, use this page to prepend commands and qualifiers that execute modules and pipelines on a cluster farm, as described in Using a Queuing System.
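Conceptually, a prefix is text placed in front of the module's command line. The Python sketch below illustrates the idea with a hypothetical function, `apply_prefix`; the rule that a module-specific prefix takes precedence over the default prefix is an assumption made for the illustration, not documented server behavior.

```python
from typing import Dict, Optional


def apply_prefix(
    command_line: str,
    module_name: str,
    default_prefix: Optional[str] = None,
    module_prefixes: Optional[Dict[str, str]] = None,
) -> str:
    """Prepend the module-specific prefix if one exists, else the default."""
    # Assumption: a prefix mapped to this module overrides the default prefix.
    prefix = (module_prefixes or {}).get(module_name, default_prefix)
    return f"{prefix} {command_line}" if prefix else command_line
```

For example, with a hypothetical default prefix `wrap.sh`, the command line `perl x.pl` becomes `wrap.sh perl x.pl`; with no prefix defined, the command line is unchanged.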

To prepend text to all (or most) command lines executed by the GenePattern server:
To prepend text only to command lines that invoke specific modules or pipelines:
Use the Custom page to define your own configuration options.
When you create a module, the custom configuration options are available as substitution variables in the module command line. For example, if you define a custom property "foo", you can use <foo> in the command line to pass the value of the custom configuration option to your module. In the Broad repository, for example, the LandmarkMatching and PeakMatch modules use the custom configuration option pepperPrefix.
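The substitution itself can be pictured as a simple placeholder replacement. The following Python sketch is an illustration under stated assumptions rather than the server's actual mechanism: the function `expand_command_line` is hypothetical, placeholder names are assumed to consist of letters, digits, underscores, and periods, and unknown placeholders are left untouched.

```python
import re
from typing import Mapping

# Matches placeholders such as <perl>, <libdir>, <foo>, or <input.filename>.
PLACEHOLDER = re.compile(r"<([A-Za-z0-9_.]+)>")


def expand_command_line(template: str, values: Mapping[str, str]) -> str:
    """Replace each <name> placeholder with its value; leave unknown names as-is."""
    def substitute(match):
        name = match.group(1)
        # Fall back to the original placeholder text when no value is defined.
        return values.get(name, match.group(0))
    return PLACEHOLDER.sub(substitute, template)
```

Applied to a command line such as `tool <foo>` with the custom option foo set to `bar`, the sketch produces `tool bar`. The same replacement applies to built-in variables such as `<perl>` and `<libdir>`.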

Use the Database Parameters page to set configuration options for the GenePattern database. The following figure shows the HSQL options. You rarely need to change these options.

Click Save to save your changes. Click Restore to return to the value set at installation.
Use the File Purge page to specify when analysis result files are deleted from the server:

Click Save to save your changes. Click Restore to return to the values set at installation.
Use the GenePattern Log page to view warnings and messages generated by the GenePattern server. (Use the Web Server Log page to view messages generated by the web server that GenePattern uses.)

The Programming Languages page contains two sections. After making changes, click Save to save them or Restore to return to the value set at installation.
Use Programming Language Configurations to specify the root directories for the programming languages used by GenePattern:

When you install GenePattern, you install the programming languages used by GenePattern. If you have alternate programming language installations that you prefer to use, use this page to point to those installations. If you would like to use more recent versions of R, see Using Different Versions of R.
Use Programming Language Options to increase the memory allocated to modules written in Java and R:

You can also increase the amount of memory allocated to the GenePattern server or client. For more information, see Increasing Memory Allocation.
If your server is behind a firewall, use the Proxy page to set the HTTP and FTP Proxy information. Without the proxy information, the server cannot download modules, pipelines, or suites from the repository maintained by the Broad Institute. If you do not know the proxy information, contact your systems administrator.

Click Save to save your changes. Click Restore to return to the values set at installation.
Use the Repositories page to identify the location of the repository to be accessed by the GenePattern server when you install modules and pipelines or suites from the repository. By default, it points to the module repository maintained by the Broad Institute. If you would like to implement and maintain a module repository at your site, contact the GenePattern help desk (gp-help (at) broadinstitute.org).

Click Save to save your changes. Click Restore to return to the values set at installation. Click Remove to delete the selected URL from the list.
You can shut down the GenePattern server by clicking the link on this page. For easier ways of shutting down the server, see Exiting from GenePattern.

Use the System Message page to broadcast a message to all users logged into the GenePattern server.

Use the Users and Groups page to view user account information, including the groups to which a user belongs. This page shows only registered users. An administrator can add users to a group (see Creating Groups and Administrators) before they register, but the users are not listed on this page until they have created a GenePattern account by clicking the Registration link on the GenePattern login page.

Use the Web Server Log page to view messages generated by the web server that GenePattern uses. (Use the GenePattern Log page to view warnings and messages generated by the GenePattern server.)

As of Release 3.2.1, the GenePattern server can be configured to run under Java 5 or Java 6.
When installed on Mac OS X 10.6 (Snow Leopard), the GenePattern server is automatically configured for Java 6.
When installed on Mac OS X 10.5 (Leopard), the GenePattern server is configured for Java 5 by default.
To configure the GenePattern server for Java 6:
When installed on Windows, the GenePattern server is configured for Java 5 by default.
To configure the GenePattern server for Java 6:
# LAX.NL.CURRENT.VM
# -----------------
# the VM to use for the next launch
lax.nl.current.vm=\jre\bin\java.exe
When installed on Linux, the GenePattern server is configured for Java 5 by default.
To configure the GenePattern server for Java 6:
# LAX.NL.CURRENT.VM
# -----------------
# the VM to use for the next launch
lax.nl.current.vm=/tools/pkgs/jdk_1.6.0_12/bin/java
Queuing systems such as the Load Sharing Facility (LSF) and the Sun Grid Engine (SGE) allow computational resources to be used effectively. If you have installed a queuing system, you can configure the GenePattern server to use it. On a heavily used server, using a queuing system to execute analysis jobs generally improves performance overall, especially for compute-intensive and long-running jobs; however, short jobs might take slightly longer because they must be dispatched to the queuing system.
To configure the GenePattern server to execute jobs using LSF or SGE:
GenePatternURL=http://myserver.company.com:8080/gp/
When you run a pipeline, the GenePattern server uses this URL to construct the links to the output files.
By default, the GenePatternURL property is not set. When you run a pipeline, the GenePattern server derives the URL at run time based on the current IP address of the host server. This is ideal for a user running on a laptop, where the IP address may change at startup. However, if you are using a queuing system, the derived URL is incorrect: it is based on the IP address of the queuing system server rather than the GenePattern server.
R2.5=<java>
-DR_suppress\=<R.suppress.messages.file> -DR_HOME\=<R2.5_HOME>
-Dr_flags\=\"<r_flags>\" -cp <run_r_path> RunR
Modify other similar properties (if any) that were added to support additional versions of R.
For example, if you are using LSF, modify the Command Line Prefix options as follows:
Another alternative is to create a script that sets the environment variables and then executes the job using LSF or SGE. The command prefix would then execute the script. For example:
In GenePattern, each module definition includes a command line that runs the analysis program. For an R module, the module developer specifies which version of R to use by including the appropriate substitution variable in the command line. For example, the <R> variable translates to the full path of the R2.0.1 programming environment and the <R2.5> variable translates to the full path of the R2.5 programming environment. Similar variables can be created for other versions of R.
Installing GenePattern (version 3.1 or later) installs R2.5 and sets the R2.5_HOME server configuration parameter, which defines the <R2.5> variable. If you upgraded from GenePattern 3.0, your GenePattern installation includes R2.0.1 and defines the <R> variable.
To add more recent versions of R to your GenePattern installation:
To add R2.0.1 to your GenePattern installation:
GenePattern allocates memory to the server, to the "client" (the computer you are using to access GenePattern), and to individual modules. When a module fails with an out-of-memory error, you can try increasing the amount of memory allocated to the server, the client, or the module.
To increase the amount of memory allocated to a module written in Java or R, click Administration>Server Settings. The Programming Languages page (Programming Language Options) provides several options for increasing the memory allocated to Java and R modules.
To increase the amount of memory allocated to the server and/or the client, follow the instructions for your platform:
Secure the GenePattern server to control who has access to which operations. Since GenePattern is primarily a web application (including SOAP interfaces) running on a web server, general approaches for securing web servers are applicable to the GenePattern server. In addition, GenePattern provides several security features that can easily be used by non-technical users to control access to the server.
This section describes several ways to secure the GenePattern server:
Use the Access page to define which GenePattern clients have access to the GenePattern server. This is the simplest way to secure your GenePattern server.
Access filtering prevents users from connecting to the GenePattern server unless they come from a known computer. If your computer cannot access the server, you cannot access the server regardless of your username/password or permissions. The localhost (127.0.0.1) computer cannot be denied access to the locally installed GenePattern server. This prevents you from inadvertently denying yourself access to the server.
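The filtering rule can be sketched as follows. This Python example is illustrative only: the function `client_allowed` is hypothetical, and expressing allowed clients as IP addresses or CIDR networks is an assumption about the list format, not a description of the Access page.

```python
import ipaddress
from typing import Iterable


def client_allowed(client_ip: str, allowed_networks: Iterable[str]) -> bool:
    """Return True if the client may connect; localhost is always allowed."""
    address = ipaddress.ip_address(client_ip)
    if address.is_loopback:
        # This guide states that localhost (127.0.0.1) cannot be denied access.
        return True
    # Assumption: allowed clients are given as addresses or CIDR networks.
    return any(
        address in ipaddress.ip_network(network, strict=False)
        for network in allowed_networks
    )
```

Because this check runs before any user authentication, a client for which it returns False never reaches the login step, whatever username and password are supplied.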
To use access filtering (as described in Modifying Server Settings):

By default, the GenePattern server requires only a user name to authenticate a GenePattern user. You can easily add password protection by modifying the GenePattern server properties.
To add password protection, modify the GenePattern server properties:
When you add password protection to the server:
Assigning passwords to existing user accounts prevents anyone from inadvertently or intentionally logging into and taking control of another user’s account. After adding password protection to the server, set passwords for existing users as follows:
By default, users create their own accounts by clicking the Registration link on the GenePattern login page. To configure GenePattern to allow only administrators to create new accounts:
To create an account:
User permissions determine valid actions for the user. Permissions are based on two configuration files in the GenePatternServer/resources directory (the links show the default files):
A user who belongs to multiple groups is given the most permissive permissions granted to those groups. For example, an administrator who belongs to other groups retains administrator permissions.
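The "most permissive" rule amounts to taking the union of the permissions granted to each of the user's groups. A minimal Python sketch, assuming an in-memory mapping from group name to permission names (a hypothetical stand-in for the contents of permissionMap.xml, whose file format is not reproduced here):

```python
from typing import Dict, Iterable, Set


def effective_permissions(
    user_groups: Iterable[str],
    permission_map: Dict[str, Set[str]],
) -> Set[str]:
    """Union of the permissions granted to every group the user belongs to."""
    granted: Set[str] = set()
    for group in user_groups:
        # Groups without an entry in the map contribute no permissions.
        granted |= permission_map.get(group, set())
    return granted
```

For example, a user in both administrators (granted adminServer and adminJobs in this hypothetical map) and mjones_lab (granted createModule) ends up with all three permissions.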
To assign or modify user permissions, edit the permissionMap.xml file. The XML syntax is simple but must be followed carefully. The rules are as follows:
By default:
Note: No explicit permission is required to run public modules/pipelines, or private modules/pipelines that you have created. No explicit permission is required to edit or delete your own modules, pipelines, suites, or jobs. |
|
createModule |
Permits creation of a module. Creation refers to any action that adds a module to the server, including create, install from repository, install from zip, and clone. |
createPrivatePipeline |
Permits creation of a private pipeline (a pipeline visible only to its creator). Creation refers to any action that adds a private pipeline to the server, including create, install from repository, install from zip, and clone. Note: To install the modules in a pipeline, you must have createModule permission. |
createPrivateSuite |
Permits creation of a private suite (a suite visible only to its creator). Creation refers to any action that adds a private suite to the server, including create, install from repository, install from zip, and clone. Note: To install the modules in a suite, you must have createModule permission. |
createPublicPipeline |
Permits creation of a public pipeline. Creation refers to any action that adds a public pipeline to the server, including create, install from repository, install from zip, and clone. Note: To install the modules in a pipeline, you must have createModule permission. |
createPublicSuite |
Permits creation of a public suite. Creation refers to any action that adds a public suite to the server, including create, install from repository, install from zip, and clone. Note: To install the modules in a suite, you must have createModule permission. |
adminJobs |
Permits viewing and deleting jobs and associated files owned by other users. Users with this permission can delete any job on the server. Typically, only members of the Administrators group are given this permission. |
adminModules |
Permits viewing and deleting private modules owned by other users. Permits deleting public modules. Note: No explicit permission is required to view public modules. |
adminPipelines |
Permits viewing and deleting private pipelines owned by other users. Permits deleting public pipelines. Note: No explicit permission is required to view public pipelines. |
adminSuites |
Permits viewing and deleting private suites owned by other users. Permits deleting public suites. Note: No explicit permission is required to view public suites. |
adminServer |
Permits access to Administration>Server Settings and all actions on the Server Settings page, including modifying server settings and shutting down the server. Users with this permission are considered to be GenePattern administrators. On the Users and Groups page, a checkmark in the admin? column indicates that a user has this permission. Typically, only members of the Administrators group are given this permission. |
You can configure the GenePattern server to provide password protection, restrict creation of user accounts, and assign permissions based on groups. Additional or alternative authentication and authorization mechanisms can be added to the server by an administrator with programming experience. The remainder of this section is written for such a programmer. Note: The links in this section display the source code for the default GenePattern installation, which should be used as the starting point for any modifications.
The authentication filter, AuthenticationFilter.java, controls whether a user can log into the server (typically based on username and password). The easiest way to modify GenePattern authentication is by implementing the IAuthenticationPlugin.java interface:
See ftp://ftp.broadinstitute.org/pub/genepattern/src/gp-custom-auth.zip for an example project that prepares a custom authentication jar file for deployment to your local GenePattern server.
If the IAuthenticationPlugin interface methods do not provide enough flexibility, you can modify the authentication filter.
The authorization filter, AuthorizationFilter.java, controls which GenePattern operations (web pages) the user can access. As described in User Permissions, permissions are based on two configuration files: userGroups.xml, which defines user groups, and permissionMap.xml, which defines which groups have access to which permissions.
Organizations that have user groups defined in an external system can use those groups rather than the userGroups.xml file. To have the authorization filter use external user groups, implement the IGroupMembershipPlugin.java interface:
To assign permissions to a group authorized through the IGroupMembershipPlugin interface, include the group in the permissionMap.xml file. If the IGroupMembershipPlugin interface methods do not provide enough flexibility, you can modify the authorization filter.
The authentication and authorization filters are servlet filters installed in front of the GenePattern web application in the GenePatternServer/Tomcat/webapps/gp/WEB-INF/web.xml file. To implement an alternative authentication (or authorization) filter:
Note: The default authentication filter (AuthenticationFilter.java) allows requests through unauthenticated when they include a parameter called jsp_precompile and come from the localhost. If your filter does not allow these requests through unauthenticated, you will see a series of errors when you start the GenePattern server as it attempts to precompile the JSP pages. These errors are not fatal, but they slow down server response for users the first time that pages are accessed following a server restart.
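The exemption described in this note reduces to a single condition. The Python sketch below shows the decision logic only; it is not the code in AuthenticationFilter.java (a Java servlet filter), the function name is hypothetical, and including the IPv6 loopback address "::1" is an assumption.

```python
from typing import Mapping

# Assumption: both the IPv4 and IPv6 loopback addresses count as localhost.
LOCALHOST_ADDRESSES = {"127.0.0.1", "::1"}


def may_skip_authentication(remote_addr: str, params: Mapping[str, str]) -> bool:
    """Allow JSP precompile requests from localhost through unauthenticated."""
    return "jsp_precompile" in params and remote_addr in LOCALHOST_ADDRESSES
```

A custom filter would apply an equivalent check before its own authentication logic, so that server startup can precompile the JSP pages without errors.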
This section describes how to modify the GenePattern web application to run on a web server that is configured to use the HTTPS protocol. Under HTTPS, regular HTTP requests are routed through a secure sockets layer (SSL), which makes them much harder for hackers to access. If you have installed your GenePattern server on a web server other than the default Tomcat instance that it is distributed with, configure your web server according to its instructions and then follow Step 2 below.
Note: When running under SSL, programming language clients and the GenePattern Desktop Client may not be able to connect to your GenePattern server.
Follow the instructions available at http://tomcat.apache.org/tomcat-5.5-doc/ssl-howto.html to configure the Tomcat instance for using SSL. In doing so, you will modify the Tomcat configuration file, which is located in the GenePatternServer/Tomcat/conf directory.
Once Tomcat (or another web server) has been configured for SSL, modify the GenePattern configuration file, GenePatternServer/resources/genepattern.properties, to ensure that its properties are in sync with the web server:
Save the genepattern.properties file and restart your server. Any bookmarked links to your GenePattern server must be updated to the new protocol and port.
The GenePattern database has been implemented in both HSQL and Oracle. The GenePattern installation builds the HSQL database and sets the GenePattern server properties to reference that database. To use the Oracle implementation instead, build the Oracle database and modify the GenePattern properties to reference that database, as described in the following procedure:
| Version | Release date | Comments |
| 3.2.1 | November 2009 | Add Setting the Java Version. |
| 3.2 | June 2009 | GenePattern 3.2 Release |
| 3.1.1 | July 2008 | Updated Using a Queuing System |
| 3.1 | December 2007 | GenePattern 3.1 Release |
| 3.0 | April 2007 | GenePattern 3.0 Release |









