Programmer's Guide

Writing Modules: Coding for GenePattern
Creating a GenePattern module is a two-step process:
- Use the guidelines provided below to write a program that executes the desired function.
- Create a GenePattern module that invokes the program that you have written. For more information, see Creating Modules in GenePattern.
When writing a program that will be run as a GenePattern analysis module, keep in mind the following:
- Use the programming language of your choice. You can use a compiled language, such as C, to create an executable, or a scripting language, such as Perl, to create a script that is run by an interpreter.
- Write messages to standard error and standard output. GenePattern modules are run on the server. The user provides arguments and retrieves results, but does not interact with the module during execution. If the program needs to report anything, write normal output to standard output (stdout) and error messages to standard error (stderr); avoid writing error messages to standard output. GenePattern captures stdout and stderr in log files, which the user can retrieve.
- Write output files to the current working directory. When a module completes, GenePattern displays the output files that are in the current working directory. Files written to other locations are not displayed as module output files (otherwise known as analysis result files).
- Read module data files from <libdir>. If your module needs to read data files that are part of the module (rather than user input), it needs to know the directory where the module lives on the server; that is, <libdir>.
- Read and write standard GenePattern file formats. When reading and writing data files, you generally want to use the standard GenePattern file formats. This makes it easier for users to analyze their data using a combination of GenePattern modules. If you choose to use your own file formats, be aware that other GenePattern modules will not be able to read those files.
  For Java, MATLAB, and R, GenePattern provides libraries that include methods for reading and writing GenePattern files (such as res, gct, and odf files). These libraries are designed for accessing GenePattern from the Java, MATLAB, and R environments, but are also useful when writing modules to be invoked by GenePattern. For instructions on downloading the libraries, see Using GenePattern from Java, Using GenePattern from MATLAB, or Using GenePattern from R.
- Use parameter flags. When designing the program and its command line, use parameter flags (for example, -f input_file) rather than relying on parameter positions. Parameter flags allow users to build command lines with variable numbers of arguments, which makes it easy to omit optional parameters.
- Process all parameters as strings. All command line parameters are passed to your code as strings, even if a parameter is apparently numeric. If your code expects a numeric argument, explicitly convert the string argument to a number; for example, as.integer(arg) in R.
- Avoid absolute pathnames. When writing code to be used with GenePattern, avoid absolute pathnames. For example, in Perl, specify the interpreter on the command line rather than embedding its path in the script; that is, use the command line "perl myscript.pl" rather than including "#!/usr/bin/perl" as the first line of the myscript.pl file.
- Avoid Windows forbidden filenames. Machines running Windows cannot accept files with the following names, regardless of the file extension or letter case: con, prn, aux, nul, com1 through com9, and lpt1 through lpt9. For cross-platform compatibility, avoid files with these names.
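Several of the guidelines above can be combined in one small program. The following Python sketch is only an illustration, not part of GenePattern: the -f flag comes from the example above, while the -o, -n, and -l flags, the header.txt data file, and the function names are hypothetical. It assumes the module's command line passes the value of <libdir> through -l.

```python
import argparse
import sys
from pathlib import Path

# Device names that Windows reserves, whatever the extension or letter case.
WINDOWS_RESERVED = (
    {"con", "prn", "aux", "nul"}
    | {f"com{i}" for i in range(1, 10)}
    | {f"lpt{i}" for i in range(1, 10)}
)


def is_windows_reserved(filename):
    """Return True if the part of the name before the first dot is reserved."""
    return Path(filename).name.split(".")[0].lower() in WINDOWS_RESERVED


def parse_args(argv):
    # Flags instead of positions, so optional parameters can simply be omitted.
    parser = argparse.ArgumentParser(description="Hypothetical example module")
    parser.add_argument("-f", dest="input_file", required=True)
    parser.add_argument("-o", dest="output_name", default="result.txt")
    parser.add_argument("-n", dest="max_lines", default="10")  # kept as a string
    parser.add_argument("-l", dest="libdir", default=".")  # assumed to carry <libdir>
    return parser.parse_args(argv)


def run(argv):
    args = parse_args(argv)

    # Every parameter arrives as a string; convert numeric ones explicitly.
    try:
        max_lines = int(args.max_lines)
    except ValueError:
        print(f"error: -n expects an integer, got {args.max_lines!r}", file=sys.stderr)
        return 1

    if is_windows_reserved(args.output_name):
        print(f"error: {args.output_name!r} is a reserved name on Windows", file=sys.stderr)
        return 1

    # A data file shipped with the module is looked up under the libdir value.
    header_path = Path(args.libdir) / "header.txt"  # hypothetical data file
    header = header_path.read_text() if header_path.exists() else ""

    with open(args.input_file) as handle:
        lines = handle.readlines()[:max_lines]

    # Result files go in the current working directory; .name drops any directory part.
    output_path = Path.cwd() / Path(args.output_name).name
    output_path.write_text(header + "".join(lines))

    print(f"wrote {len(lines)} lines to {output_path.name}")  # normal output goes to stdout
    return 0
```

To use the sketch as a script, end the file with sys.exit(run(sys.argv[1:])) and name the interpreter on the command line, for example "python mymodule.py -f input.txt -n 5 -l <libdir>", where mymodule.py is a placeholder name.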
Visualization modules are similar to analysis modules. The only difference between analysis and visualization modules is that analysis modules run on the server machine and visualization modules run on the client machine. Each module is launched in a separate process. An applet is used to launch the visualization module.