# Tagged with #install

5 documentation articles | 0 announcements | 8 forum discussions

Created 2016-02-26 22:54:46 | Updated 2016-04-13 11:07:25 | Tags: workshop install

#### Objective

Install all software packages required to attend a GATK workshop.

#### Prerequisites

To follow these instructions, you will need to have a basic understanding of the meaning of the following words and command-line operations. If you are unfamiliar with any of the following, you should consult a more experienced colleague or your system administrator if you have one. There are also many good online tutorials you can use to learn the necessary notions.

• Basic Unix environment commands
• Binary / Executable
• Command-line shell, terminal or console
• Software library

The current version of GATK requires Java Runtime Environment version 1.7. All Linux/Unix and MacOS X systems should have a JRE pre-installed, but the version may vary. To test your Java version, run the following command in the shell:

java -version 

This should return a message along the lines of "java version 1.7.0_25" as well as some details on the Runtime Environment (JRE) and Virtual Machine (VM). If you have a version other than 1.7.x, be aware that you may run into trouble with some of the more advanced features of the Picard and GATK tools. The simplest solution is to install an additional JRE and specify which one you want to use on the command line. To find out how to do so, you should seek help from your system administrator.
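As a rough illustration of the version check described above, the following sketch pulls the quoted version string out of the `java -version` output and tests whether it starts with 1.7. The function names `parse_java_version` and `is_java_17` are made up for this example, and the sketch assumes the version appears in double quotes on the first line of the output.

```shell
# Extract the quoted version string from the first line of `java -version` output.
# Example input line: java version "1.7.0_25"
parse_java_version() {
    printf '%s\n' "$1" | sed -n '1s/.*version "\([^"]*\)".*/\1/p'
}

# Succeed only if the version string starts with "1.7."
is_java_17() {
    case "$1" in
        1.7.*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Guarded so the script still runs on machines where java is missing.
if command -v java >/dev/null 2>&1; then
    # java -version writes to stderr, hence the 2>&1 redirection.
    found=$(parse_java_version "$(java -version 2>&1)")
    if is_java_17 "$found"; then
        echo "Java $found looks suitable"
    else
        echo "Java $found found; these instructions expect 1.7.x"
    fi
else
    echo "java not found on PATH"
fi
```

The parsing is kept separate from the call to `java` so that the check can be tried on a pasted version string without touching the installed JRE.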

#### Software packages

1. Picard
2. Genome Analysis Toolkit (GATK)
3. IGV
4. RStudio IDE and R libraries ggplot2 and gsalib

### 1. Picard

Read the overview of the Picard software on the Picard project homepage, then download version 1.141 of the Picard software package.

• Installation

Unpack the zip file using:

unzip picard-tools-1.141.zip

This will produce a directory called picard-tools-1.141 containing the Picard jar files. Picard tools are distributed as a pre-compiled Java executable (jar file) so there is no need to compile them.

Note that it is not possible to add jar files to your path to make the tools available on the command line; you have to specify the full path to the jar file in your java command, which would look like this:

java -jar ~/my_tools/jars/picard.jar <Toolname> [options]

This syntax will be explained in a little more detail further below.

However, you can set up a shortcut called an "environment variable" in your shell profile configuration to make this easier. The idea is that you create a variable that tells your system where to find a given jar, like this:

export PICARD="$HOME/my_tools/jars/picard.jar"

So then when you want to run a Picard tool, you just need to call the jar by its shortcut, like this:

java -jar $PICARD <Toolname> [options]

The exact way to set this up depends on what shell you're using and how your environment is configured. We like this overview and tutorial which explains how it all works; but if you are new to the command line environment and you find this too much to deal with, we recommend asking for help from your institution's IT support group.

This completes the installation process.

• Testing

Open a shell and run:

java -jar picard.jar -h

This should print out some version and usage information about the AddOrReplaceReadGroups.jar tool. At this point you will have noticed an important difference between BWA and Picard tools. To use BWA, we called on the BWA program and specified which of its internal tools we wanted to apply. To use Picard, we called on Java itself as the main program, then specified which jar file to use, knowing that one jar file = one tool. This applies to all Picard tools; to use them you will always build your command lines like this:

java -jar picard.jar <ToolName> [options]

This means you first make the call to Java itself as the main program, then specify the picard.jar file, then specify which tool you want, and finally you pass whatever other arguments (input files, parameters etc.) are needed for the analysis.

Note that the command-line syntax of Picard tools has recently changed from java -jar <ToolName>.jar to java -jar picard.jar <ToolName>. We are using the newer syntax in this document, but some of our other documents may not have been updated yet. If you encounter any documents using the old syntax, let us know and we'll update them accordingly. If you are already using an older version of Picard, either adapt the commands or better, upgrade your version!

Next we will see that GATK tools are called in essentially the same way, although the way the options are specified is a little different. The reasons for how tools in a given software package are organized and invoked are largely due to the preferences of the software developers. They generally do not reflect strict technical requirements, although they can have an effect on speed and efficiency.

### 2. Genome Analysis Toolkit (GATK)

Hopefully if you're reading this, you're already acquainted with the purpose of the GATK, so go ahead and download the latest version of the software package.

In order to access the downloads, you need to register for a free account on the GATK support forum. You will also need to read and accept the license agreement before downloading the GATK software package. Note that if you intend to use the GATK for commercial purposes, you will need to purchase a license. See the licensing page for an overview of the commercial licensing conditions.

• Installation

Unpack the tar file using:

tar xjf GenomeAnalysisTK-3.5-0.tar.bz2

This will produce a directory called GenomeAnalysisTK-3.5-0 containing the GATK jar file, which is called GenomeAnalysisTK.jar, as well as a directory of example files called resources. GATK tools are distributed as a single pre-compiled Java executable so there is no need to compile them. Just like we discussed for Picard, it's not possible to add the GATK to your path, but you can set up a shortcut to the jar file using environment variables as described above.

This completes the installation process.

• Testing

Open a shell and run:

java -jar GenomeAnalysisTK.jar -h

This should print out some version and usage information, as well as a list of the tools included in the GATK. As the Usage line states, to use GATK you will always build your command lines like this:

java -jar GenomeAnalysisTK.jar -T <ToolName> [arguments]

This means that just like for Picard, you first make the call to Java itself as the main program, then specify the GenomeAnalysisTK.jar file, then specify which tool you want, and finally you pass whatever other arguments (input files, parameters etc.) are needed for the analysis.

### 3. IGV

The Integrated Genomics Viewer is a genome browser that allows you to view BAM, VCF and other genomic file information in context. It has a graphical user interface that is very easy to use, and can be downloaded for free (though registration is required) from this website. We encourage you to read through IGV's very helpful user guide, which includes many detailed tutorials that will help you use the program most effectively.

### 4. RStudio IDE and R libraries ggplot2 and gsalib

Download the latest version of RStudio IDE. The webpage should automatically detect what platform you are running on and recommend the version most suitable for your system.

• Installation

Follow the installation instructions provided. Binaries are provided for all major platforms; typically they just need to be placed in your Applications (or Programs) directory. Open RStudio and type the following command in the console window:

install.packages("ggplot2")

This will download and install the ggplot2 library as well as any other library packages that ggplot2 depends on for its operation. Note that some users have reported having to install two additional packages themselves, called reshape and gplots, which you can do as follows:

install.packages("reshape")
install.packages("gplots")

Finally, do the same thing to install the gsalib library:

install.packages("gsalib")

This will download and install the gsalib library.


Created 2015-12-23 18:49:42 | Updated 2016-03-02 15:50:59 | Tags: install

Below is a list of everything you need in order to run workflows written in WDL (using the Cromwell execution engine, because that's what we use), with installation instructions where necessary. Because we use GATK in most of the tutorials and example WDL scripts on this website, we include a link to GATK installation instructions as well, but this is optional if you don’t plan to run the GATK WDLs.

### WDL

WDL, pronounced “widdle”, stands for Workflow Description Language.

#### wdltool

The wdltool toolkit is a utility package that provides accessory functionality for writing and running WDL scripts, including syntax validation and input template generation. You can download the latest release of the pre-compiled executable here.

#### Text editor

You will need a text editor of some sort to write your WDL scripts. It is important to note that there is a difference between a word processor (like Microsoft Word) and a text editor (like Notepad); please use the latter option. If you have no preferred text editor, we would recommend installing SublimeText, as we find that it displays code visually better than other text editors we've tried.

As an added convenience when developing WDL scripted workflows, syntax highlighting has been developed for SublimeText, TextMate, vim, and IntelliJ. You can follow the links for installation instructions for your editor of choice.

### Cromwell

Cromwell is an execution engine capable of running scripts written in WDL, describing data processing and analysis workflows involving command line tools (such as pipelines implementing the GATK Best Practices for Variant Discovery). If you are familiar with GATK, you may have heard of or even used an execution engine called Queue that was designed to run GATK workflows written as Qscripts. Together, Cromwell and WDL constitute a user-friendly alternative to Queue and Qscripts.

The installation of Cromwell itself is quite simple. The latest release can be downloaded here in the form of a pre-compiled jar. For ease of use, you can also add an environment variable to your terminal profile pointing at the Cromwell jar file.

#### Java 8

Cromwell requires Java version 8, which you can find here.

#### Docker (optional)

Cromwell is capable of utilizing Docker images to assist in specifying environments when running workflows. If you’ve never worked with Docker before, this page may answer many of your questions. Docker is optional if you are simply working on your local machine (i.e. your computer rather than a remote server). If you are using a remote server, more often than not Docker is required. In our tutorials, we always tell you which optional installations will be required. To use Docker, please install it according to your operating system, following the instructions given on the installation page.

### Programs to be pipelined

Our tutorials feature tools from the GATK (GenomeAnalysisToolkit) and Picard to demonstrate how to write WDL scripts that perform real data processing and analysis tasks; in order to follow them you’ll need to install GATK, Picard, and their dependencies. To that effect, you can find a complete walkthrough for installing these on the GATK website.

The linked document provides instructions for installing several additional software packages that are useful for GATK-specific tutorials, but the only one that you really need to install for running WDL tutorials, besides GATK and Picard, is Java 1.7. Installing the R library gsalib (available on CRAN) is optional but highly recommended. When following along with a tutorial on this website, we will always tell you which optional installations will be required.

Note that GATK and Cromwell currently require different versions of Java, so see this article for help dealing with that temporary problem.

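One possible way to live with two Java versions side by side is to call each `java` binary by its full path instead of relying on whichever one comes first on the PATH. The sketch below is only an illustration: the variables `JAVA7_HOME` and `JAVA8_HOME`, their default locations, and the helper `java_for` are all assumptions, not something the GATK or Cromwell documentation prescribes.

```shell
# Hypothetical install locations; adjust to wherever your JREs actually live.
JAVA7_HOME="${JAVA7_HOME:-/opt/java/jre1.7}"
JAVA8_HOME="${JAVA8_HOME:-/opt/java/jre1.8}"

# Print the java binary to use for a given tool name.
java_for() {
    case "$1" in
        gatk|picard) printf '%s\n' "$JAVA7_HOME/bin/java" ;;
        cromwell)    printf '%s\n' "$JAVA8_HOME/bin/java" ;;
        *)           echo "unknown tool: $1" >&2; return 1 ;;
    esac
}

# Dry run: print which binary each tool would be launched with.
echo "GATK would use:     $(java_for gatk)"
echo "Cromwell would use: $(java_for cromwell)"
```

With something like this in place, a GATK call would start with `"$(java_for gatk)" -jar GenomeAnalysisTK.jar` rather than a bare `java -jar`.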

Created 2013-07-02 00:16:14 | Updated 2015-09-24 12:12:04 | Tags: install rscript igv picard gsalib samtools r ggplot2 rstudio

#### Objective

Install all software packages required to follow the GATK Best Practices.

#### Prerequisites

To follow these instructions, you will need to have a basic understanding of the meaning of the following words and command-line operations. If you are unfamiliar with any of the following, you should consult a more experienced colleague or your systems administrator if you have one. There are also many good online tutorials you can use to learn the necessary notions.

• Basic Unix environment commands
• Binary / Executable
• Compiling a binary
• Adding a binary to your path
• Command-line shell, terminal or console
• Software library

You will also need to have access to an ANSI compliant C++ compiler and the tools needed for normal compilations (make, shell, the standard library, tar, gunzip). These tools are usually pre-installed on Linux/Unix systems. On MacOS X, you may need to install the MacOS Xcode tools. See https://developer.apple.com/xcode/ for relevant information and software downloads. The Xcode tools are free but an Apple ID may be required to download them.

Starting with version 2.6, the GATK requires Java Runtime Environment version 1.7. All Linux/Unix and MacOS X systems should have a JRE pre-installed, but the version may vary. To test your Java version, run the following command in the shell:

java -version

This should return a message along the lines of "java version 1.7.0_25" as well as some details on the Runtime Environment (JRE) and Virtual Machine (VM). If you have a version other than 1.7.x, be aware that you may run into trouble with some of the more advanced features of the Picard and GATK tools. The simplest solution is to install an additional JRE and specify which one you want to use on the command line. To find out how to do so, you should seek help from your systems administrator.

#### Software packages

1. BWA
2. SAMtools
3. Picard
4. Genome Analysis Toolkit (GATK)
5. IGV
6. RStudio IDE and R libraries ggplot2 and gsalib

Note that the version numbers of packages you download may be different than shown in the instructions below. If so, please adapt the number accordingly in the commands.

### 1. BWA

Read the overview of the BWA software on the BWA project homepage, then download the latest version of the software package.

• Installation

Unpack the tar file using:

tar xvjf bwa-0.7.12.tar.bz2

This will produce a directory called bwa-0.7.12 containing the files necessary to compile the BWA binary. Move to this directory and compile using:

cd bwa-0.7.12
make

The compiled binary is called bwa. You should find it within the same folder (bwa-0.7.12 in this example). You may also find other compiled binaries; at time of writing, a second binary called bwamem-lite is also included. You can disregard this file for now. Finally, just add the BWA binary to your path to make it available on the command line.

This completes the installation process.

• Testing

Open a shell and run:

bwa

This should print out some version and author information as well as a list of commands. As the Usage line states, to use BWA you will always build your command lines like this:

bwa <command> [options]

This means you first make the call to the binary (bwa), then you specify which command (method) you wish to use (e.g. index) then any options (i.e. arguments such as input files or parameters) used by the program to perform that command.

### 2. SAMtools

Read the overview of the SAMtools software on the SAMtools project homepage, then download the latest version of the software package.

• Installation

Unpack the tar file using:

tar xvjf samtools-0.1.2.tar.bz2

This will produce a directory called samtools-0.1.2 containing the files necessary to compile the SAMtools binary. Move to this directory and compile using:

cd samtools-0.1.2
make

The compiled binary is called samtools. You should find it within the same folder (samtools-0.1.2 in this example). Finally, add the SAMtools binary to your path to make it available on the command line.

This completes the installation process.

• Testing

Open a shell and run:

samtools

This should print out some version information as well as a list of commands. As the Usage line states, to use SAMtools you will always build your command lines like this:

samtools <command> [options]

This means you first make the call to the binary (samtools), then you specify which command (method) you wish to use (e.g. index) then any options (i.e. arguments such as input files or parameters) used by the program to perform that command. This convention is similar to the one used by BWA.

### 3. Picard

Read the overview of the Picard software on the Picard project homepage, then download the latest version of the software package.

• Installation

Unpack the zip file using:

unzip picard-tools-1.139.zip

This will produce a directory called picard-tools-1.139 containing the Picard jar files. Picard tools are distributed as a pre-compiled Java executable (jar file) so there is no need to compile them.

Note that it is not possible to add jar files to your path to make the tools available on the command line; you have to specify the full path to the jar file in your java command, which would look like this:

java -jar ~/my_tools/jars/picard.jar <Toolname> [options]

This syntax will be explained in a little more detail further below.

However, you can set up a shortcut called an "environment variable" in your shell profile configuration to make this easier. The idea is that you create a variable that tells your system where to find a given jar, like this:

export PICARD="$HOME/my_tools/jars/picard.jar"

So then when you want to run a Picard tool, you just need to call the jar by its shortcut, like this:

java -jar $PICARD <Toolname> [options]

The exact way to set this up depends on what shell you're using and how your environment is configured. We like this overview and tutorial which explains how it all works; but if you are new to the command line environment and you find this too much to deal with, we recommend asking for help from your institution's IT support group.
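For readers on a bash-style shell, the shortcut could be set up roughly as follows. This is a sketch under assumptions: the jar location is the example path used in this document, and the profile file that should hold the `export` line (for instance `~/.bash_profile`) depends on your shell and system.

```shell
# A line like this would typically live in a shell profile file.
# Note: no spaces around "=", and $HOME instead of a quoted "~",
# because the shell does not expand a tilde inside double quotes.
export PICARD="$HOME/my_tools/jars/picard.jar"

# Dry run: show the command line the shortcut expands to.
echo "java -jar $PICARD <Toolname> [options]"

# Optional sanity check that the jar is where the variable says it is.
if [ -f "$PICARD" ]; then
    echo "Found Picard jar at $PICARD"
else
    echo "No file at $PICARD; check the path"
fi
```

After opening a new shell (or sourcing the profile file), `echo $PICARD` should print the full path to the jar.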

This completes the installation process.

• Testing

Open a shell and run:

java -jar picard.jar -h 

This should print out some version and usage information about the AddOrReplaceReadGroups.jar tool. At this point you will have noticed an important difference between BWA and Picard tools. To use BWA, we called on the BWA program and specified which of its internal tools we wanted to apply. To use Picard, we called on Java itself as the main program, then specified which jar file to use, knowing that one jar file = one tool. This applies to all Picard tools; to use them you will always build your command lines like this:

java -jar picard.jar <ToolName> [options] 

This means you first make the call to Java itself as the main program, then specify the picard.jar file, then specify which tool you want, and finally you pass whatever other arguments (input files, parameters etc.) are needed for the analysis.

Note that the command-line syntax of Picard tools has recently changed from java -jar <ToolName>.jar to java -jar picard.jar <ToolName>. We are using the newer syntax in this document, but some of our other documents may not have been updated yet. If you encounter any documents using the old syntax, let us know and we'll update them accordingly. If you are already using an older version of Picard, either adapt the commands or better, upgrade your version!

Next we will see that GATK tools are called in essentially the same way, although the way the options are specified is a little different. The reasons for how tools in a given software package are organized and invoked are largely due to the preferences of the software developers. They generally do not reflect strict technical requirements, although they can have an effect on speed and efficiency.

### 4. Genome Analysis Toolkit (GATK)

In order to access the downloads, you need to register for a free account on the GATK support forum. You will also need to read and accept the license agreement before downloading the GATK software package. Note that if you intend to use the GATK for commercial purposes, you will need to purchase a license. See the licensing page for an overview of the commercial licensing conditions.

• Installation

Unpack the tar file using:

tar xjf GenomeAnalysisTK-3.3-0.tar.bz2 

This will produce a directory called GenomeAnalysisTK-3.3-0 containing the GATK jar file, which is called GenomeAnalysisTK.jar, as well as a directory of example files called resources. GATK tools are distributed as a single pre-compiled Java executable so there is no need to compile them. Just like we discussed for Picard, it's not possible to add the GATK to your path, but you can set up a shortcut to the jar file using environment variables as described above.

This completes the installation process.

• Testing

Open a shell and run:

java -jar GenomeAnalysisTK.jar -h 

This should print out some version and usage information, as well as a list of the tools included in the GATK. As the Usage line states, to use GATK you will always build your command lines like this:

java -jar GenomeAnalysisTK.jar -T <ToolName> [arguments] 

This means that just like for Picard, you first make the call to Java itself as the main program, then specify the GenomeAnalysisTK.jar file, then specify which tool you want, and finally you pass whatever other arguments (input files, parameters etc.) are needed for the analysis.
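To make the difference between the two calling conventions concrete, here is a small dry-run sketch that only prints the command lines instead of executing them. The helper names `picard_cmd` and `gatk_cmd` and the default jar locations are assumptions for illustration; `ref.fasta` and `input.bam` are placeholder file names.

```shell
# Hypothetical jar locations, following the environment-variable shortcut idea.
PICARD="${PICARD:-$HOME/my_tools/jars/picard.jar}"
GATK="${GATK:-$HOME/my_tools/jars/GenomeAnalysisTK.jar}"

# Picard: the tool name follows the jar directly.
picard_cmd() {
    set -- java -jar "$PICARD" "$@"
    echo "$*"
}

# GATK: the tool name is passed through the -T argument.
gatk_cmd() {
    set -- java -jar "$GATK" -T "$@"
    echo "$*"
}

# Print example command lines without running anything.
picard_cmd SortSam
gatk_cmd HaplotypeCaller -R ref.fasta -I input.bam
```

Printing the command first is a cheap way to check the argument order before launching a long-running analysis.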

### 5. IGV

The Integrated Genomics Viewer is a genome browser that allows you to view BAM, VCF and other genomic file information in context. It has a graphical user interface that is very easy to use, and can be downloaded for free (though registration is required) from this website. We encourage you to read through IGV's very helpful user guide, which includes many detailed tutorials that will help you use the program most effectively.

### 6. RStudio IDE and R libraries ggplot2 and gsalib

• Installation

Follow the installation instructions provided. Binaries are provided for all major platforms; typically they just need to be placed in your Applications (or Programs) directory. Open RStudio and type the following command in the console window:

install.packages("ggplot2") 

This will download and install the ggplot2 library as well as any other library packages that ggplot2 depends on for its operation. Note that some users have reported having to install two additional packages themselves, called reshape and gplots, which you can do as follows:

install.packages("reshape")
install.packages("gplots")

Finally, do the same thing to install the gsalib library:

install.packages("gsalib")

Important note

If you are using a recent version of ggplot2 and a version of GATK older than 3.2, you may encounter an error when trying to generate the BQSR or VQSR recalibration plots. This is because until recently our scripts were still using an older version of certain ggplot2 functions. This has been fixed in GATK 3.2, so you should either upgrade your version of GATK (recommended) or downgrade your version of ggplot2. If you experience further issues generating the BQSR recalibration plots, please see this tutorial.

Created 2012-08-09 04:08:01 | Updated 2013-06-17 21:09:33 | Tags: test intro queue developer install

#### Objective

Test that Queue is correctly installed, and that the supporting tools like Java are in your path.

#### Prerequisites

• Basic familiarity with the command-line environment
• Understanding of what a PATH variable is
• GATK installed

#### Steps

1. Invoke the Queue usage/help message
2. Troubleshooting

### 1. Invoke the Queue usage/help message

The command we're going to run is a very simple command that asks Queue to print out a list of available command-line arguments and options. It is so simple that it will ALWAYS work if your Queue package is installed correctly.

Note that this command is also helpful when you're trying to remember something like the right spelling or short name for an argument and for whatever reason you don't have access to the web-based documentation.

#### Action

Type the following command:

java -jar <path to Queue.jar> --help

replacing the <path to Queue.jar> bit with the path you have set up in your command-line environment.

#### Expected Result

You should see usage output similar to the following:

usage: java -jar Queue.jar -S <script> [-jobPrefix <job_name_prefix>] [-jobQueue <job_queue>] [-jobProject <job_project>]
[-jobSGDir <job_scatter_gather_directory>] [-memLimit <default_memory_limit>] [-runDir <run_directory>] [-tempDir
<temp_directory>] [-emailHost <emailSmtpHost>] [-emailPort <emailSmtpPort>] [-emailTLS] [-emailSSL] [-emailUser
[-expandedDot <expanded_dot_graph>] [-startFromScratch] [-status] [-statusFrom <status_email_from>] [-statusTo
<status_email_to>] [-keepIntermediates] [-retry <retry_failed>] [-l <logging_level>] [-log <log_to_file>] [-quiet]
[-debug] [-h]

-S,--script <script>                                                      QScript scala file
-jobPrefix,--job_name_prefix <job_name_prefix>                            Default name prefix for compute farm jobs.
-jobQueue,--job_queue <job_queue>                                         Default queue for compute farm jobs.
-jobProject,--job_project <job_project>                                   Default project for compute farm jobs.
-jobSGDir,--job_scatter_gather_directory <job_scatter_gather_directory>   Default directory to place scatter gather
output for compute farm jobs.
-memLimit,--default_memory_limit <default_memory_limit>                   Default memory limit for jobs, in gigabytes.
-runDir,--run_directory <run_directory>                                   Root directory to run functions from.
-tempDir,--temp_directory <temp_directory>                                Temp directory to pass to functions.
-emailHost,--emailSmtpHost <emailSmtpHost>                                Email SMTP host. Defaults to localhost.
-emailPort,--emailSmtpPort <emailSmtpPort>                                Email SMTP port. Defaults to 465 for ssl,
otherwise 25.
-emailTLS,--emailUseTLS                                                   Email should use TLS. Defaults to false.
-emailSSL,--emailUseSSL                                                   Email should use SSL. Defaults to false.
secure! See emailPassFile.
-bsub,--bsub_all_jobs                                                     Use bsub to submit jobs
-run,--run_scripts                                                        Run QScripts.  Without this flag set only
performs a dry run.
-dot,--dot_graph <dot_graph>                                              Outputs the queue graph to a .dot file.  See:
http://en.wikipedia.org/wiki/DOT_language
-expandedDot,--expanded_dot_graph <expanded_dot_graph>                    Outputs the queue graph of scatter gather to
a .dot file.  Otherwise overwrites the
dot_graph
-startFromScratch,--start_from_scratch                                    Runs all command line functions even if the
outputs were previously output successfully.
-status,--status                                                          Get status of jobs for the qscript
-statusFrom,--status_email_from <status_email_from>                       Email address to send emails from upon
completion or on error.
-statusTo,--status_email_to <status_email_to>                             Email address to send emails to upon
completion or on error.
-keepIntermediates,--keep_intermediate_outputs                            After a successful run keep the outputs of
any Function marked as intermediate.
-retry,--retry_failed <retry_failed>                                      Retry the specified number of times after a
command fails.  Defaults to no retries.
-l,--logging_level <logging_level>                                        Set the minimum level of logging, i.e.
setting INFO get's you INFO up to FATAL,
setting ERROR gets you ERROR and FATAL level
logging.
-log,--log_to_file <log_to_file>                                          Set the logging location
-quiet,--quiet_output_mode                                                Set the logging to quiet mode, no output to
stdout
-debug,--debug_mode                                                       Set the logging file string to include a lot
of debugging information (SLOW!)
-h,--help                                                                 Generate this help message

If you see this message, your Queue installation is ok. You're good to go! If you don't see this message, and instead get an error message, proceed to the next section on troubleshooting.
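If you would rather script this check than read the output by eye, something along the following lines could work. It is a sketch only: `QUEUE_JAR` is a hypothetical variable holding the path to your Queue.jar, and the test simply looks for a line beginning with the usage text shown above.

```shell
# Succeed if the given text contains a line starting with the Queue usage banner.
looks_like_queue_usage() {
    printf '%s\n' "$1" | grep -q '^usage: java -jar Queue\.jar'
}

# QUEUE_JAR is a hypothetical variable; point it at your own Queue.jar.
QUEUE_JAR="${QUEUE_JAR:-$HOME/my_tools/jars/Queue.jar}"

# Guarded so the script still runs when java or the jar is missing.
if command -v java >/dev/null 2>&1 && [ -f "$QUEUE_JAR" ]; then
    if looks_like_queue_usage "$(java -jar "$QUEUE_JAR" --help 2>&1)"; then
        echo "Queue help message found"
    else
        echo "Queue did not print the expected usage line; see Troubleshooting"
    fi
else
    echo "java or $QUEUE_JAR not found; skipping the check"
fi
```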

### 2. Troubleshooting

Let's try to figure out what's not working.

#### Action

First, make sure that your Java version is at least 1.6, by typing the following command:

java -version

#### Expected Result

You should see something similar to the following text:

java version "1.6.0_12"
Java(TM) SE Runtime Environment (build 1.6.0_12-b04)
Java HotSpot(TM) 64-Bit Server VM (build 11.2-b01, mixed mode)  

#### Remedial actions

If the version is less than 1.6, install the newest version of Java onto the system. If you instead see something like

java: Command not found  

make sure that java is installed on your machine, and that your PATH variable contains the path to the java executables.

On a Mac running OS X 10.5+, you may need to run /Applications/Utilities/Java Preferences.app and drag Java SE 6 to the top to make your machine run version 1.6, even if it has been installed.
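When the shell reports that java cannot be found, it can help to list which PATH directories, if any, actually contain a `java` executable. The sketch below walks the PATH entries one by one; the function name `find_on_path` is made up for this example.

```shell
# Print every directory on PATH that contains an executable file named "$1".
find_on_path() {
    rest=$PATH:
    while [ -n "$rest" ]; do
        dir=${rest%%:*}     # first remaining PATH entry
        rest=${rest#*:}     # drop that entry and its colon
        if [ -n "$dir" ] && [ -f "$dir/$1" ] && [ -x "$dir/$1" ]; then
            printf '%s\n' "$dir"
        fi
    done
}

matches=$(find_on_path java)
if [ -n "$matches" ]; then
    echo "java found in:"
    echo "$matches"
else
    echo "java is not in any PATH directory"
fi
```

If more than one directory is listed, the first one is the `java` your shell runs by default, which may explain an unexpected version number.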

Created 2012-07-24 21:24:42 | Updated 2014-07-28 22:32:00 | Tags: install java

#### Objective

Test that the GATK is correctly installed, and that the supporting tools like Java are in your path.

#### Prerequisites

• Basic familiarity with the command-line environment
• Understanding of what a PATH variable is

#### Steps

1. Invoke the GATK usage/help message
2. Troubleshooting

### 1. Invoke the GATK usage/help message

The command we're going to run is a very simple command that asks the GATK to print out a list of available command-line arguments and options. It is so simple that it will ALWAYS work if your GATK package is installed correctly.

Note that this command is also helpful when you're trying to remember something like the right spelling or short name for an argument and for whatever reason you don't have access to the web-based documentation.

#### Action

Type the following command:

java -jar <path to GenomeAnalysisTK.jar> --help

replacing the <path to GenomeAnalysisTK.jar> bit with the path you have set up in your command-line environment.

#### Expected Result

You should see usage output similar to the following:

usage: java -jar GenomeAnalysisTK.jar -T <analysis_type> [-I <input_file>] [-L
<intervals>] [-R <reference_sequence>] [-B <rodBind>] [-D <DBSNP>] [-H
<hapmap>] [-hc <hapmap_chip>] [-o <out>] [-e <err>] [-oe <outerr>] [-A] [-M
<maximum_reads>] [-sort <sort_on_the_fly>] [-compress <bam_compression>] [-fmq0] [-dfrac
<downsample_to_fraction>] [-dcov <downsample_to_coverage>] [-S
<validation_strictness>] [-U] [-P] [-dt] [-tblw] [-nt <numthreads>] [-l
<logging_level>] [-log <log_to_file>] [-quiet] [-debug] [-h]
-T,--analysis_type <analysis_type>                     Type of analysis to run
-I,--input_file <input_file>                           SAM or BAM file(s)
-L,--intervals <intervals>                             A list of genomic intervals over which
to operate. Can be explicitly specified
on the command line or in a file.
-R,--reference_sequence <reference_sequence>           Reference sequence file
-B,--rodBind <rodBind>                                 Bindings for reference-ordered data, in
the form <name>,<type>,<file>
-D,--DBSNP <DBSNP>                                     DBSNP file
-H,--hapmap <hapmap>                                   Hapmap file
-hc,--hapmap_chip <hapmap_chip>                        Hapmap chip file
-o,--out <out>                                         An output file presented to the walker.
Will overwrite contents if file exists.
-e,--err <err>                                         An error output file presented to the
walker. Will overwrite contents if file
exists.
-oe,--outerr <outerr>                                  A joint file for 'normal' and error
output presented to the walker. Will
overwrite contents if file exists.

...

If you see this message, your GATK installation is ok. You're good to go! If you don't see this message, and instead get an error message, proceed to the next section on troubleshooting.

### 2. Troubleshooting

Let's try to figure out what's not working.

#### Action

First, make sure that your Java version is at least 1.7, by typing the following command:

java -version

#### Expected Result

You should see something similar to the following text:

java version "1.7.0_12"
Java(TM) SE Runtime Environment (build 1.7.0_12-b04)
Java HotSpot(TM) 64-Bit Server VM (build 11.2-b01, mixed mode)  

#### Remedial actions

If the version is less than 1.7, install the newest version of Java onto the system. If you instead see something like

java: Command not found  

make sure that java is installed on your machine, and that your PATH variable contains the path to the java executables.
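The two checks above can be combined into a small script. This is only a sketch under stated assumptions: the function names `parse_java_version`, `java_version_ok` and `check_java` are made up for illustration, and the parsing assumes the first line of the `java -version` output contains a quoted version string of the form shown above (for example `"1.7.0_12"`).

```shell
# Extract "major.minor" from the first line of the `java -version` output,
# e.g. 'java version "1.7.0_12"' gives 1.7. Prints nothing if the line
# does not contain a quoted version of that form.
parse_java_version() {
    printf '%s\n' "$1" | sed -n '1s/.*version "\([0-9][0-9]*\)\.\([0-9][0-9]*\).*/\1.\2/p'
}

# Succeed if the given "major.minor" version is at least 1.7.
java_version_ok() {
    major=${1%%.*}
    minor=${1#*.}
    [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 7 ]; }
}

# Report whether java is on the PATH and whether its version is recent enough.
check_java() {
    if ! command -v java >/dev/null 2>&1; then
        echo "java not found; check that it is installed and on your PATH: $PATH" >&2
        return 1
    fi
    # java -version writes to stderr, so redirect it into the capture.
    version=$(parse_java_version "$(java -version 2>&1)")
    if [ -z "$version" ]; then
        echo "could not parse the output of java -version" >&2
        return 1
    fi
    if java_version_ok "$version"; then
        echo "Java $version found: OK"
    else
        echo "Java $version found: older than 1.7" >&2
        return 1
    fi
}
```

Running `check_java` after defining these functions prints a one-line verdict and returns a non-zero status if java is missing or too old.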


Created 2016-01-20 16:11:48 | Updated | Tags: install picard failure htsjdk

(If this is the wrong place to post this, please tell me where to go)

I am new to GATK, trying to build Picard because I apparently will need to build Picard sequence directories for my genomes of interest.

I have installed Java6, Java8, ant, and HTSJDK. I have downloaded Picard 2a49ee2.

Following instructions at broadinstitute.github.io/picard/building.html, I have defined JAVA6_HOME, cd'd into broadinstitute-picard-2a49ee2, and when I run ant -lib lib/ant package-commands I get this message (relevant part only) "BUILD FAILED ... Basedir .../broadinstitute-picard-2a49ee2/htsjdk does not exist"

So it seems I need to do something for the build to know where my HTSJDK lives. Or did my download fail? Am I supposed to make a symbolic link from .../broadinstitute-picard-2a49ee2/htsjdk over to someplace in my HTSJDK directory? If so, to where?

Thanks for any help, Bob H

Created 2014-07-01 11:55:27 | Updated | Tags: install oncotator config

I have installed oncotator 1.2.6 on Ubuntu 14.04. It installed successfully, but when I run it, it throws the following error:

Command: oncotator -i MAFLITE --db-dir /home/oncotator_1.2.6.0/oncotator_v1_ds_June112014/ -o TCGAMAF input.maflite output.maf hg19

Verbose mode on Path: ['/usr/local/bin', '/usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.9.4-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/shove-0.5.6-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/Cython-0.20.2-py2.7-linux-x86_64.egg', '/usr/local/lib/python2.7/dist-packages/PyVCF-0.6.7-py2.7-linux-x86_64.egg', '/usr/local/lib/python2.7/dist-packages/biopython-1.64-py2.7-linux-x86_64.egg', '/usr/local/lib/python2.7/dist-packages/pandas-0.14.0-py2.7-linux-x86_64.egg', '/usr/local/lib/python2.7/dist-packages/stuf-0.9.4-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/futures-2.1.6-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/pytz-2014.4-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/python_dateutil-2.2-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/parse-1.4.1-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/Oncotator-v1.2.6.0-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/leveldb-0.193-py2.7-linux-x86_64.egg', '/usr/local/lib/python2.7/dist-packages/natsort-3.3.0-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/python_memcached-1.53-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/nose-1.3.3-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/pysam-0.7.5-py2.7-linux-x86_64.egg', '/usr/local/lib/python2.7/dist-packages/bcbio_gff-0.4-py2.7.egg', '/usr/local/lib/python2.7/dist-packages', '/usr/lib/python2.7', '/usr/lib/python2.7/plat-x86_64-linux-gnu', '/usr/lib/python2.7/lib-tk', '/usr/lib/python2.7/lib-old', '/usr/lib/python2.7/lib-dynload', '/usr/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages/PILcompat', '/usr/lib/python2.7/dist-packages/gtk-2.0', '/usr/lib/python2.7/dist-packages/ubuntu-sso-client']

2014-07-01 17:14:15,384 INFO [oncotator.Oncotator:236] Oncotator v1.2.6.0
2014-07-01 17:14:15,385 INFO [oncotator.Oncotator:237] Args: Namespace(cache_url=None, dbDir='/home/adlab/ngs_soft/oncotator_1.2.6.0/oncotator_v1_ds_June112014/', default_cli=[], default_config=None, genome_build='hg19', infer_genotypes='false', input_file='0748.maflite', input_format='MAFLITE', log_name='oncotator.log', noMulticore=False, output_file='0748.maf', output_format='TCGAMAF', override_cli=[], override_config=None, prepend=False, read_only_cache=False, skip_no_alt=False, tx_mode='CANONICAL', verbose=5)
2014-07-01 17:14:15,385 INFO [oncotator.Oncotator:238] Log file: /media/adlab/Data/cervical_final/cervical_Reshma/SNP/T-N_VCF/corrected_SNP/coverage_3x_filter_VCF/oncotator_input_GATK/oncotator.log
2014-07-01 17:14:15,385 WARNING [oncotator.utils.ConfigUtils:197] Could not find config file (maflite_input.config). Trying configs/ prepend.
Traceback (most recent call last):
File "/usr/local/bin/oncotator", line 9, in <module>
load_entry_point('Oncotator==v1.2.6.0', 'console_scripts', 'oncotator')()
File "/usr/local/lib/python2.7/dist-packages/Oncotator-v1.2.6.0-py2.7.egg/oncotator/Oncotator.py", line 293, in main
other_opts=determineOtherOptions(args, logger))
File "/usr/local/lib/python2.7/dist-packages/Oncotator-v1.2.6.0-py2.7.egg/oncotator/utils/OncotatorCLIUtils.py", line 342, in create_run_spec
inputCreator = OncotatorCLIUtils.create_input_creator(inputFilename, inputFormat, genomeBuild, other_opts)
File "/usr/local/lib/python2.7/dist-packages/Oncotator-v1.2.6.0-py2.7.egg/oncotator/utils/OncotatorCLIUtils.py", line 302, in create_input_creator
inputCreator = inputCreatorDict[inputFormat][0](inputFilename, inputConfig, genome_build, input_creator_options)
File "/usr/local/lib/python2.7/dist-packages/Oncotator-v1.2.6.0-py2.7.egg/oncotator/input/MafliteInputMutationCreator.py", line 104, in __init__
for alt in self._alternativeDict[col]:
KeyError: 'build'

Please suggest what is going wrong. The Python packages were installed correctly using easy_install-2.7. Thanks in advance.

Created 2014-04-25 06:48:55 | Updated | Tags: install gatk newbie

Hi, I have just started using Ubuntu. I want to install GATK for my work, but I can't find anything about how to install GATK through the Ubuntu Software Center or the Terminal. Please help me. Thanks.

Created 2014-02-14 10:39:55 | Updated | Tags: install mutect github git bcel

Hi

I was trying to install the github version of mutect and I have some questions as well as a hope that people who had similar problems might get help from my endeavours.

I followed the instructions posted on the github page, however when I tried to build:

# build
ant -Dexternal.dir=`pwd`/../mutect-src -Dexecutable=mutect package

It told me I didn't have the correct bcel files in my ~/.ant/lib/:

The bcel jar can be found in the lib directory of a GATK clone after compiling, and the ant-apache-bcel jar can be downloaded from here: http://repo1.maven.org/maven2/ant/ant-apache-bcel/1.6.5/ant-apache-bcel-1.6.5.jar Please copy these two jar files to ~/.ant/lib/

I had already downloaded the ant-apache-bcel jar and put it there, so I figured it must be the GATK clone lib. I compiled with ant dist clean, but it failed and the created "lib" folder was empty. However, it did create a "dist" folder, and in there I found bcel-5.2.jar. I popped this in ~/.ant/lib/ and now mutect seems to build correctly using:

# build
ant -Dexternal.dir=`pwd`/../mutect-src -Dexecutable=mutect package

So to my questions.

1. Is this an OK way to build it? (Can I trust the program despite the unorthodox installation procedure?)

2. How come the mutect install instructions don't specifically mention where to find the Apache bcel library (I would not have found it without the error message), or tell you to compile gatk-protected to get the second jar file that you need? Also, where should they be put?

Created 2013-11-05 19:32:18 | Updated | Tags: install mutect bug error

Hello,

I am attempting to install MuTect using the instructions on GitHub at https://github.com/broadinstitute/mutect.

At the last step, building with Ant, I get the following error:

[xslt] Processing /.../mutect-src/mutect/mutect.xml to /.../gatk-protected/dist/packages/mutect.xml
[xslt] /.../gatk-protected/public/packages/CreatePackager.xsl:15:42: Error! xsl:output is not allowed in this position in the stylesheet!

BUILD FAILED
/.../gatk-protected/build.xml:945: The following error occurred while executing this line:
/.../gatk-protected/dist/packages/mutect.xml:1: Premature end of file.

After looking through the build.xml file, I am wondering if this is a bug in the stylesheet.

$JAVA_HOME/bin/java -version
java version "1.7.0_45"
Java(TM) SE Runtime Environment (build 1.7.0_45-b18)

$ANT_HOME/bin/ant -version
Apache Ant(TM) version 1.9.2 compiled on July 8 2013

java -jar /.../gatk-protected/dist/GenomeAnalysisTK.jar -version
2.7-1-g42d771f

As you can see, GATK is building just fine, and it seems to be a problem with MuTect.

Thanks for any help!

Created 2012-11-13 23:31:33 | Updated 2013-01-07 19:44:34 | Tags: windows install cygwin

Hi ya'll

I don't have access to the specific instructions for installing GATK on a windows platform (i.e. using cygwin). If I could get permission or someone could walk me through this I would be grateful.

Best,
Jacob Zieve
UC Davis

Created 2012-10-05 23:02:30 | Updated 2013-01-07 20:31:33 | Tags: windows install

I have downloaded the .jar file and I have tried double-clicking it, but nothing happens. I also reinstalled Java multiple times to no avail. How do I install this program?

Created 2012-08-14 22:38:42 | Updated 2012-08-14 22:38:42 | Tags: bzip2 install

The current download (v2.0-39) does not seem to be in bzip2 format. It is not being recognized by bzip2, gzip or tar. I tried downloading it repeatedly throughout the day and using the commandline as well as Ubuntu's Archive Manager. Is this a known issue with the current download or is this just me?

Thank you