Working with Pipelines

A GenePattern pipeline defines a sequential series of modules to be run. Modules run from a pipeline behave exactly as they do when run directly from GenePattern. If you are unfamiliar with GenePattern pipelines, see Concepts.

Basic Operations

The following table summarizes the different ways you can work with GenePattern pipelines.

Run a pipeline

Select a pipeline, enter its parameters, and click Run. For more information, see Running Modules and Pipelines.

Display pipeline properties

Pipeline properties include the pipeline’s author, the modules to be run and their parameters. If a pipeline has an end-user license agreement associated with it, you can view the license via the pipeline properties page.

To display pipeline properties, click Modules & Pipelines, select the pipeline, and click Properties. For more information, see Displaying Pipeline Properties.

Send pipelines to other users

Zip files provide a convenient way to send pipelines to other GenePattern users.

  • To export a pipeline to a zip file, click Modules & Pipelines and select the pipeline to export. When the pipeline parameters (if any) appear in the center pane, click Export.
  • To install a pipeline from a zip file, click Modules & Pipelines>Install from zip.

For more information, see Exporting and Installing Modules & Pipelines Using Zip Files.

Install pipelines from the repository

The Broad Institute maintains a repository of modules, pipelines, and suites. To install pipelines from the Broad repository, click Modules & Pipelines>Install from Repository. For more information, see Installing Modules & Pipelines from the Repository.

Create pipelines

You can create an empty pipeline and add modules to it, create a pipeline by cloning an existing pipeline, or start with an analysis result file and have GenePattern create a pipeline that recreates that analysis result file. For more information, see Creating Pipelines.

Edit pipelines

You can edit a pipeline that you have created or clone a public pipeline and edit your copy of the public pipeline. For more information, see Editing Pipelines.

Delete pipelines

To delete a pipeline, click Modules & Pipelines>Manage. For more information, see Managing Modules & Pipelines.


Displaying Pipeline Properties

To view the definition of a pipeline, display its properties:

  1. Click Modules & Pipelines to display the GenePattern home page.
  2. Select the pipeline to display.
  3. When GenePattern displays the pipeline parameters (if any), click Properties. GenePattern displays the pipeline properties:

On this page, you can:


Creating Pipelines

You can create a pipeline in several ways: from an analysis result file, from an existing pipeline, or from scratch (beginning with an empty pipeline).

To create a pipeline from an analysis result file:

  1. Click the menu icon next to the analysis result file and select Create Pipeline. GenePattern creates a pipeline that will reproduce the analysis result file and opens the pipeline designer for that pipeline.

    GenePattern adds modules to the pipeline based on the following logic: add the module that created the result file; check the module’s input file parameters; if the input file for the module was the output file of a previous module, add the previous module; check that module’s input file parameters; continue to walk back through the chain of modules, adding modules to the pipeline, until reaching the initial input file.

  2. Edit the pipeline as desired. For more information, see Pipeline Designer.
  3. Click Save to create the pipeline.
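
The walk-back logic described in step 1 can be sketched in a few lines of Python. This is only an illustration of the idea, not GenePattern's implementation: the Job record, the modules_for_result function, and every file name other than all_aml_train.gct are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """One completed module run (hypothetical record, not a GenePattern type)."""
    module: str
    input_files: list[str] = field(default_factory=list)
    output_files: list[str] = field(default_factory=list)


def modules_for_result(result_file: str, jobs: list[Job]) -> list[str]:
    """Return the module names, in run order, needed to recreate result_file."""
    # Map each output file to the job that produced it.
    producer = {out: job for job in jobs for out in job.output_files}

    ordered: list[str] = []
    seen: set[int] = set()

    def walk(file_name: str) -> None:
        job = producer.get(file_name)
        # A file with no producing job is an initial input file, so the walk
        # stops there. A job that was already visited needs no second pass.
        if job is None or id(job) in seen:
            return
        seen.add(id(job))
        # Visit upstream modules first so they appear earlier in the pipeline.
        for input_file in job.input_files:
            walk(input_file)
        ordered.append(job.module)

    walk(result_file)
    return ordered


# Illustrative job history; the output file names are made up for the sketch.
jobs = [
    Job("PreprocessDataset",
        ["all_aml_train.gct"],
        ["all_aml_train.preprocessed.gct"]),
    Job("ComparativeMarkerSelection",
        ["all_aml_train.preprocessed.gct", "all_aml_train.cls"],
        ["all_aml_train.comp.marker.odf"]),
]

pipeline_modules = modules_for_result("all_aml_train.comp.marker.odf", jobs)
print(pipeline_modules)
```

In this sketch, starting from the ComparativeMarkerSelection result file pulls in PreprocessDataset as well, because its output was used as an input. Starting from a file that no module produced yields an empty pipeline.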

To create a new copy of an existing pipeline:

  1. Click Modules & Pipelines and select the pipeline. GenePattern displays the pipeline parameters (if any).
  2. Click Properties. GenePattern displays the pipeline definition page.
  3. Click Clone to create a copy of the pipeline. GenePattern prompts you to name the new pipeline.
  4. Enter a name for the pipeline and click OK. GenePattern displays the pipeline properties for the new pipeline.
  5. Click Edit to edit the pipeline. GenePattern displays the pipeline designer.
  6. Edit the pipeline as desired. For more information, see Pipeline Designer.
  7. Click Save to create the pipeline.

To create a pipeline from scratch:

  1. Click Modules & Pipelines>New Pipeline. GenePattern displays the pipeline designer.
  2. Edit the pipeline as desired. For more information, see Pipeline Designer.
  3. Click Save to create the pipeline.


Editing Pipelines

To edit a pipeline:

  1. Click Modules & Pipelines to display the GenePattern home page.
  2. Select the pipeline that you want to edit. GenePattern displays the pipeline parameters (if any).
  3. Open the pipeline designer in one of two ways:
    • Click Edit, if it is available. GenePattern displays the pipeline designer.
      This option is visible only if you created this pipeline on this GenePattern server.
    • Otherwise, create a copy of the pipeline to edit:
      1. Click Properties. GenePattern displays the pipeline properties.
      2. Click Clone. GenePattern prompts you to name the new pipeline.
      3. Enter a name for the pipeline and click OK. GenePattern displays the pipeline properties for the new pipeline.
      4. Click Edit. GenePattern displays the pipeline designer.
  4. Edit the pipeline as desired. For more information, see Pipeline Designer.
  5. Click Save to save the pipeline.


Pipeline Designer

See the video tutorial: Exploring the New GenePattern Pipeline Designer.

When you create or edit a pipeline, GenePattern displays the pipeline designer:

From left to right:

The pipeline diagram toolbar provides the following options:

Displays the basic pipeline properties in the Editing Pipeline panel, as shown here. For more information, see Editing Basic Pipeline Properties.

Saves your changes without closing the designer.

Saves your changes, closes the designer and runs the pipeline.

Loads the last saved version of the pipeline, overwriting any unsaved changes.

Displays this section of the GenePattern documentation.

The remaining topics in this section describe how to use the pipeline designer:

Editing Basic Pipeline Properties

To edit basic pipeline properties:

  1. Click the Edit Properties icon to display the basic pipeline properties in the editing panel.
  2. Edit the pipeline properties.
  3. Click Save or Save & Run to save your changes.

The Editing Pipeline panel displays the following properties:


Adding Modules/Files

To add a module to the pipeline:

  1. Select the module from the list on the left. The designer adds the module to the diagram.
    Tip: You can select a module in one of two ways:
    • Locate the module in the list and click on it.
    • Start typing the module name in the search box. GenePattern displays module names that include the typed characters. Click the desired module and then click Add.
  2. Drag the module to the desired location.
    Tip: The diagram is read from left to right and from top to bottom, as you would read a book written in English.
  3. Click the module to view and edit its properties in the editing panel.
  4. Click Save or Save & Run to save your changes.

Note: If you add a module with an end-user license to your pipeline, users who have not accepted that module's license terms will be presented with a license-acceptance window before the pipeline will run.

To specify a file as input to a module in the pipeline, you must first add the file to the pipeline diagram. To add a file to the diagram:

  1. Click Attach File. GenePattern prompts you for the file.
  2. Browse for a file. GenePattern adds the file to the diagram.
  3. Drag the file to the desired location. You can now use this file as an input file for one or more modules.
  4. Click Save or Save & Run to save your changes.

The pipeline diagram uses color to distinguish between files, modules, and pipelines. Connections between objects show the flow of data through the pipeline. The following diagram shows a file, a module and a pipeline. The file (all_aml_train.gct) is used as an input file parameter (input.file) for the module. To delete an object and all of its connections, click its delete icon.


Editing Module Properties

To edit a module's properties:

  1. Click the module in the pipeline diagram. GenePattern displays its properties in the editing panel.
  2. Edit the properties, as shown below.
  3. Click Save or Save & Run to save your changes.

By default, a pipeline runs the most recent version of a module. The drop-down list shows all versions of the module that are installed on the GenePattern server. To have the pipeline run a different version of the module, select it from the list.

The Life Science Identifier (LSID) for this module. You cannot create or edit LSIDs. The GenePattern server automatically assigns an LSID to each version of a module.

Click the Documentation button to display the module documentation.

Warnings shown here must be addressed before you can save the pipeline. GenePattern highlights all parameters affected by the warnings.

All module parameters are listed here. Input file parameters are critical and are generally listed first. They control the flow of data through the pipeline. For more information, see Setting Input File Parameters.

Click the check box next to a parameter to mark it prompt-when-run. When the pipeline runs, GenePattern prompts the user for all prompt-when-run parameters in the pipeline. By default, GenePattern prompts the user for a parameter by displaying its name and description. Optionally, click Set Prompt When Run Display Settings to supply alternate text for the prompt.

For most parameters, you enter a value, select a value from a drop-down list, or use the default value supplied by GenePattern.


Setting Input File Parameters

In the pipeline diagram, the connections between modules show the flow of data through the pipeline. You modify the flow of data by modifying the input file parameters. The connections in the diagram are a graphical representation of the input file parameter settings. When you click on the module, the editing panel provides a textual representation of the same input file parameter settings.

You can supply the file for an input file parameter in one of three ways:

The following pipeline diagram uses the ComparativeMarkerSelection module to illustrate the different ways of supplying input file parameters:

For the input.file parameter, use the output file generated by the PreprocessDataset module.

  • Diagram: Displays a connection from the res output of PreprocessDataset to the input.file parameter of ComparativeMarkerSelection.
  • Editing panel: Displays the current value of the input.file parameter as: Receiving res from PreprocessDataset.

For the cls.file parameter, prompt the user for input.

  • Diagram: Displays the prompt-when-run icon to indicate that the cls.file parameter has been marked prompt-when-run.
  • Editing panel: Displays the prompt-when-run check box for the cls.file parameter as selected.

For the confounding.variable.cls.file, specify the file all_aml_train_confound.cls.

  • Diagram: Displays a connection from the uploaded file, all_aml_train_confound.cls, to the confounding.variable.cls.file parameter of ComparativeMarkerSelection.
  • Editing panel: Displays the current value of the confounding.variable.cls.file as: Receiving all_aml_train_confound.cls from Input File.

The method you use to set the input file parameter depends on how you plan to supply the file:

Click Save or Save & Run to save your changes.

Reusing a user-supplied file: Occasionally, a pipeline requires that the same input file be specified for multiple parameters. For example, consider a pipeline with two modules:

  1. ComparativeMarkerSelection, with parameters input.file (an expression dataset file) and cls.file (a class file).
  2. ExtractComparativeMarkerResults, with parameters comparative.marker.selection.filename (ComparativeMarkerSelection result file) and dataset.filename (the expression dataset file used for ComparativeMarkerSelection).

You want to use the same input file for both the ComparativeMarkerSelection.input.file parameter and the ExtractComparativeMarkerResults.dataset.filename parameter. If the input file that you want to use is either the output file generated from another module (perhaps an expression dataset generated by the PreprocessDataset module) or an uploaded file, this is not a problem. You can connect the file that you want to use to both the ComparativeMarkerSelection.input.file parameter and the ExtractComparativeMarkerResults.dataset.filename parameter.

However, what happens if you want to prompt the user for the expression dataset file? If you mark the ComparativeMarkerSelection.input.file parameter as a prompt-when-run parameter, you still need an input file for the ExtractComparativeMarkerResults.dataset.filename parameter. If you mark both parameters as prompt-when-run, you have to rely on your user to submit the same expression dataset file for both parameters. The workaround is to add the ConvertLineEndings module to your pipeline:

  1. Add ConvertLineEndings to your pipeline.
  2. Mark the ConvertLineEndings.input.filename parameter as prompt-when-run. ConvertLineEndings generates an output file almost identical to the input file; it simply converts the line endings in the file to those used by Perl on the host operating system.
  3. Use the output file generated by ConvertLineEndings as the input file for both the ComparativeMarkerSelection.input.file parameter and the ExtractComparativeMarkerResults.dataset.filename parameter.
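
The effect of this workaround can be pictured with a small Python model of the pipeline's connections. This is a sketch only: the dictionary layout, the PROMPT marker, the "output" names, and the helper functions are assumptions made for the example and are not part of GenePattern. Marking cls.file as prompt-when-run is also an assumption.

```python
# Hypothetical marker for a parameter that is marked prompt-when-run.
PROMPT = "prompt-when-run"

# Each entry maps a (module, parameter) pair to the source that feeds it:
# either PROMPT or the (module, output) pair it is connected to.
connections = {
    ("ConvertLineEndings", "input.filename"): PROMPT,
    ("ComparativeMarkerSelection", "input.file"):
        ("ConvertLineEndings", "output"),
    ("ComparativeMarkerSelection", "cls.file"): PROMPT,
    ("ExtractComparativeMarkerResults", "comparative.marker.selection.filename"):
        ("ComparativeMarkerSelection", "output"),
    ("ExtractComparativeMarkerResults", "dataset.filename"):
        ("ConvertLineEndings", "output"),
}


def prompted_parameters(conns):
    """List the parameters the user is asked for when the pipeline runs."""
    return sorted(
        f"{module}.{param}"
        for (module, param), source in conns.items()
        if source == PROMPT
    )


def share_source(conns, first, second):
    """True if both parameters are fed by the same upstream output file."""
    return conns[first] == conns[second] and conns[first] != PROMPT


prompted = prompted_parameters(connections)
same_dataset = share_source(
    connections,
    ("ComparativeMarkerSelection", "input.file"),
    ("ExtractComparativeMarkerResults", "dataset.filename"),
)
print(prompted)
print(same_dataset)
```

In this model the user is prompted for the expression dataset only once, through ConvertLineEndings.input.filename. Both downstream parameters then receive the same file by construction, so the pipeline no longer relies on the user submitting the same file twice.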


Reordering Modules

GenePattern orders modules in a pipeline based on their position in the diagram. The diagram is read from left to right and top to bottom, as you would read a book written in English.

You can reorder the modules in a pipeline by repositioning the modules in the diagram. Similarly, you can insert a module into a pipeline simply by adding it to the diagram and dragging it to the appropriate location. However, in either case, you are changing the flow of the data and, therefore, must delete and recreate any affected connections.

To reposition a module in the pipeline:

  1. Remove its connections to other modules.
  2. Drag the module to its new position.
  3. Recreate connections as needed to reflect the flow of data through the modified pipeline.
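
The reading-order rule at the start of this section can be sketched in Python as follows. The source does not describe how GenePattern computes the order, so this is only one plausible reading of "left to right and top to bottom". The Placed record, the pixel coordinates, and the row_height value used to group modules into rows are all assumptions.

```python
from typing import NamedTuple


class Placed(NamedTuple):
    """A module and its position in the diagram (hypothetical pixel units)."""
    name: str
    left: int
    top: int


def reading_order(modules, row_height=100):
    """Order modules row by row, and left to right within each row.

    Modules whose top coordinates fall in the same row_height band are
    treated as being on the same row (an assumption of this sketch).
    """
    ranked = sorted(modules, key=lambda m: (m.top // row_height, m.left))
    return [m.name for m in ranked]


diagram = [
    Placed("ComparativeMarkerSelection", left=400, top=20),
    Placed("ExtractComparativeMarkerResults", left=50, top=220),
    Placed("PreprocessDataset", left=50, top=35),
]

order = reading_order(diagram)
print(order)
```

Under these assumptions, dragging a module to a different row or column changes its rank in the sort, which is why repositioning a module changes where it runs in the pipeline.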



Updated on December 06, 2012 14:43