Tagged with #commandlinegatk
2 documentation articles | 1 announcement | 22 forum discussions

Comments (0)


This document describes how GATK commands are structured and how to add arguments to basic command examples.

Basic Java syntax

Commands for GATK always follow the same basic syntax:

java [Java arguments] -jar GenomeAnalysisTK.jar [GATK arguments]

The core of the command is java -jar GenomeAnalysisTK.jar, which starts up the GATK program in a Java Virtual Machine (JVM). Any additional Java-specific arguments (such as -Xmx to increase memory allocation) should be inserted between java and -jar, like this:

java -Xmx4G -jar GenomeAnalysisTK.jar [GATK arguments]

The order of arguments between java and -jar is not important.
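As a minimal sketch of the layout described above, the shell function below only prints the command it would run. The function name gatk_cmd and the variable JAVA_OPTS are hypothetical names chosen for illustration; -Xms is a standard JVM option and is not mentioned in the article itself.

```shell
# Hypothetical helper: prints (does not run) a GATK command line.
# JAVA_OPTS holds the JVM arguments; they must appear before -jar,
# and their order relative to each other does not matter.
JAVA_OPTS="-Xms1G -Xmx4G"

gatk_cmd() {
    # "$*" joins all GATK arguments passed to the function with spaces
    echo "java $JAVA_OPTS -jar GenomeAnalysisTK.jar $*"
}

gatk_cmd -T HaplotypeCaller -R human_b37.fasta
```

Swapping the two options inside JAVA_OPTS should give an equivalent command, which is all the article's remark about ordering amounts to.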

GATK arguments

There are two universal arguments that are required for every GATK command, with very few exceptions (the clp-type utilities): -R for Reference (e.g. -R human_b37.fasta) and -T for Tool name (e.g. -T HaplotypeCaller).

Additional arguments fall in two categories:

  • Engine arguments like -L (for specifying a list of intervals), which can be given to all tools. They are technically optional but may be effectively required at certain steps for specific analytical designs (e.g. the -L argument for calling variants on exomes);

  • Tool-specific arguments, which may be required, like -I (to provide an input file containing sequence reads to tools that process BAM files), or optional, like -alleles (to provide a list of known alleles for genotyping).

The ordering of GATK arguments is not important, but we recommend always passing the tool name (-T) and reference (-R) first for consistency. It is also a good idea to consistently order arguments by some kind of logic in order to make it easy to compare different commands over the course of a project. It’s up to you to choose what that logic should be.
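One way to enforce a consistent order is to let a small wrapper do it for you. This is only a sketch: the function name ordered_gatk_cmd is made up for illustration, and it prints the command instead of running it.

```shell
# Hypothetical wrapper: always emits -T and -R first, then everything else
# in the order it was given.
ordered_gatk_cmd() {
    tool=$1
    ref=$2
    shift 2
    echo "java -Xmx4G -jar GenomeAnalysisTK.jar -T $tool -R $ref $*"
}

ordered_gatk_cmd HaplotypeCaller human_b37.fasta \
    -I sample1.bam -L exome_intervals.list -o raw_variants.vcf
```

Because the tool and reference are positional parameters of the wrapper, every command in a project starts the same way and is easier to compare by eye.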

All available engine and tool-specific arguments are listed in the tool documentation section. Arguments typically have both a long name (prefixed by --) and a short name (prefixed by -). The GATK command line parser recognizes both equally, so you can use whichever you like, depending on whether you want your commands to be more verbose or more succinct.

Finally, a note about flags. Flags are arguments that have boolean values, i.e. TRUE or FALSE. They are typically used to enable or disable specific features; for example, --keep_program_records will make certain GATK tools output additional information in the BAM header that would be omitted otherwise. In GATK, all flags are set to FALSE by default, so if you want to set one to TRUE, all you need to do is add the flag name to the command. You don't need to specify an actual value.
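To illustrate short names, long names, and flags together, here is a small sketch that prints two spellings of what should be the same command. The long names --reference_sequence, --input_file, and --out are given from memory of the GATK 3 engine documentation, so please check them against the tool documentation for your version; add_flag is a hypothetical helper, and the file names are placeholders.

```shell
# Hypothetical helper: a flag takes no value, so it is either present or absent.
add_flag() {
    # $1 = flag name, $2 = "true" to include it, anything else to omit it
    if [ "$2" = "true" ]; then
        printf ' %s' "$1"
    fi
}

# Short-name spelling
short="java -jar GenomeAnalysisTK.jar -T PrintReads -R human_b37.fasta -I sample1.bam -o out.bam"
# Long-name spelling (assumed equivalents of the short names above)
long="java -jar GenomeAnalysisTK.jar --analysis_type PrintReads --reference_sequence human_b37.fasta --input_file sample1.bam --out out.bam"

echo "$short$(add_flag --keep_program_records true)"
echo "$long$(add_flag --keep_program_records false)"
```

The first printed command ends with the flag name and nothing else; the second leaves the flag out, which corresponds to its default value of FALSE.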

Examples of complete GATK command lines

This is a very simple command that runs HaplotypeCaller in default mode on a single input BAM file containing sequence data and outputs a VCF file containing raw variants.

java -Xmx4G -jar GenomeAnalysisTK.jar -R human_b37.fasta -T HaplotypeCaller -I sample1.bam -o raw_variants.vcf

If the data is from exome sequencing, we should additionally provide the exome targets using the -L argument:

java -Xmx4G -jar GenomeAnalysisTK.jar -R human_b37.fasta -T HaplotypeCaller -I sample1.bam -o raw_variants.vcf -L exome_intervals.list

If we just want to genotype specific sites of interest using known alleles based on results from a previous study, we can change the HaplotypeCaller’s genotyping mode using -gt_mode, provide those alleles using -alleles, and restrict the analysis to just those sites using -L:

java -Xmx4G -jar GenomeAnalysisTK.jar -R human_b37.fasta -T HaplotypeCaller -I sample1.bam -o raw_variants.vcf -L known_alleles.vcf -alleles known_alleles.vcf -gt_mode GENOTYPE_GIVEN_ALLELES

For more examples of commands and for specific tool commands, see the tool documentation section.

Comments (0)

A new tool has been released!

Check out the documentation at CommandLineGATK.

Comments (0)

I'm not sure why it hadn't occurred to us to do this before, but we've finally done it: an FAQ article that formally explains how GATK commands are structured, what the basic types of arguments are, and how to string them all together.

We realized that command structure requirements can be confusing if you are new to command line programs, if only because so many toolkits use fairly different ones. For example, Picard tools (which are also developed at the Broad!) have separate jar files for each tool in the toolkit, while GATK has one jar file containing all the tools. The Picard syntax for passing argument values is also different: Picard uses = to join the argument name and value, while GATK commands just take a space.

So if that's something you need help with, check out the doc! We'd love to hear from people who are new to GATK about whether this is helpful and how we can improve it further.
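The difference between the two syntaxes can be sketched as follows. The helper names picard_arg and gatk_arg are invented for this example, the file names are borrowed from a forum post further down this page, and the commands are only printed, not run (a real MarkDuplicates call needs more arguments than shown).

```shell
# Picard style: NAME=value, joined by "="
picard_arg() { printf '%s=%s' "$1" "$2"; }
# GATK style: -name value, separated by a space
gatk_arg()   { printf '%s %s' "$1" "$2"; }

# One jar per tool (Picard) versus one jar plus -T (GATK)
echo "java -jar MarkDuplicates.jar $(picard_arg INPUT M9.bam) $(picard_arg OUTPUT m9.marked.bam)"
echo "java -jar GenomeAnalysisTK.jar -T PrintReads $(gatk_arg -R ucsc.hg19.fasta) $(gatk_arg -I m9.marked.bam)"
```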

Comments (2)

I'm encountering a problem similar to one I experienced with a mismatch between reference and cosmic files. Specifically, I created .bam files using human_g1k_v3.fasta, which names the contigs 1...22, X, Y, etc. When I try to run UnifiedGenotyper with this reference file, the .bam file I created, and dbsnp_137.b37.vcf, I get an error because the SNP coordinates are listed as chr1...chr22, chrX, chrY, etc. Other than writing a script to remove all occurrences of "chr", is there another way to get around this problem, i.e. a dbsnp reference file that has the desired coordinates without the "chr"?

I resolved the problem with reference/cosmic by finding a cosmic file with consistent notation, but can't find a similar fix for this one. I'd appreciate any suggestions.
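For reference, the kind of script mentioned above could look like the sketch below. The function name strip_chr is hypothetical. Renaming alone is not guaranteed to make two resources compatible: contig order and lengths also have to match the reference, and the mitochondrial contig is often named differently (chrM versus MT), which this sketch does not handle.

```shell
# Hypothetical filter: removes a leading "chr" from contig names in a VCF
# read on stdin, both in ##contig header lines and in the CHROM column.
strip_chr() {
    sed -e 's/^##contig=<ID=chr/##contig=<ID=/' \
        -e 's/^chr//'
}

printf '##contig=<ID=chr1,length=249250621>\n#CHROM\tPOS\nchr1\t100\n' | strip_chr
```

Header lines that start with # are untouched by the second expression, since it only matches lines beginning with "chr".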

Comments (3)

Hi, I hope this is a quick question: does using the 'include intervals' command line option '-L' only include the region specified?

For instance, if I have a file that includes reads for chromosomes 1, 2, 6, X, and Y and I specify "-L 6", will the walker only process chromosome 6, or will it include the rest of my data as well?

Thank you for the clarification!

Comments (3)

Hi everyone,

I'm using GATK HaplotypeCaller, and I recently read a document about optimization that mentions GPU, IBM Power8, etc. Its conclusion is that GATK could run faster than the current implementation. The document suggests developing a C or C++ implementation in parallel with the current Java one. I wonder whether you have made any progress so far and whether you have planned a release. I'm also interested to know which technologies you're considering for GATK HC: C, C++, MPI, OpenMP, GPU...?



Comments (2)


I have used the following commands using DepthOfCoverage tool with two different bed files:

  java -jar GenomeAnalysisTK.jar -T DepthOfCoverage -R ucsc_hg19.fa -I WT_recalibrated.bam -L coverage_summary.bed -ct 1 -ct 10 -ct 20 -ct 30 -ct 50 -ct 100 -o WT_cov

The line count for the input and output:

$ wc -l WT_cov.sample_interval_summary
4988 WT_cov.sample_interval_summary
$ wc -l coverage_summary.bed 
10585 coverage_summary.bed

In the other case:

  java -jar GenomeAnalysisTK.jar -T DepthOfCoverage -R ucsc_hg19.fa -I WT_recalibrated.bam -L exon.bed -ct 1 -ct 10 -ct 20 -ct 30 -ct 50 -ct 100 -o WT_exon

Line count for the input and output:

 $ wc -l WT_exon.sample_interval_summary 
 5065 WT_exon.sample_interval_summary
 $ wc -l exon.bed 
 5065 exon.bed

The input in both the cases is of the standard format as shown below:

 chr1    6529578 6529755  
 chr1    6530273 6530442
 chr1    6530543 6530721
 chr1    6530773 6530980 
 chr1    6531028 6531730
 chr1    6531768 6531914
 chr1    6532563 6532713
 chr1    6533023 6533273

Could anyone help to interpret the discrepancy between number of target regions in bed file and _interval_summary file in the above two cases?
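One thing that may be worth checking (this is an assumption on my part, not a confirmed diagnosis) is whether the larger BED file contains overlapping or abutting intervals. My understanding is that GATK merges such intervals by default, which would reduce the number of rows in the summary. The sketch below counts how many intervals remain after merging; count_merged is a hypothetical helper name.

```shell
# Hypothetical helper: counts the intervals left in a BED file after merging
# overlapping or abutting entries (BED coordinates are half-open).
count_merged() {
    sort -k1,1 -k2,2n "$1" | awk '
        # start a new merged interval when the contig changes or there is a gap
        $1 != chrom || $2 > end { n++; chrom = $1; end = $3; next }
        # otherwise extend the current interval if this one reaches further
        $3 > end { end = $3 }
        END { print n + 0 }'
}

# Example usage (file name taken from the post above):
# count_merged coverage_summary.bed
```

Comparing this count with the number of data rows in the _interval_summary file (which also has a header line) would show whether merging accounts for the difference.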

Comments (2)

Hi all, I tried to filter my raw VCF file with the following commands:

java -Xmx30g -jar ../GATK/GenomeAnalysisTK.jar -R ../ref.fa -T VariantFiltration --filterExpression " QD < 20.0 || ReadPosRankSum < -8.0 || FS > 10.0 || QUAL < $MEANQUAL || MQ <30.0 || DP< 10.0 " --filterName LowQualFilter --missingValuesInExpressionsShouldEvaluateAsFailing --variant ../s1.raw.vcf --logging_level ERROR -o ../s1.makered.raw.vcf

grep -v "Filter" s1.makered.raw.vcf >s1.flt.vcf

After that, I checked the result file s1.flt.vcf and found the following records marked "PASS". Obviously the command doesn't work, as records with "DP=8" should be marked "LowQualFilter".

Chr01 231575 . A G 241.78 PASS AC=2;AF=1.00;AN=2;DP=8;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=29.00;MQ0=0;QD=30.22 GT:AD:DP:GQ:PL 1/1:0,8:8:24:270,24,0
Chr01 237476 . T C 238.78 PASS AC=2;AF=1.00;AN=2;DP=8;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=29.00;MQ0=0;QD=29.85 GT:AD:DP:GQ:PL 1/1:0,8:8:24:267,24,0

There is no error reported. Any suggestion will be appreciated.
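Two things in the commands above can be checked from the shell before looking at GATK itself. This sketch does not diagnose the problem; it only shows how to inspect what the shell passes on and how to select records by the FILTER column rather than by a text match. The function names show_expr and keep_pass are hypothetical, and the expression is shortened for readability.

```shell
# Print part of the expression exactly as the shell will hand it to GATK.
# Inside double quotes, $MEANQUAL is expanded by the shell before GATK sees it,
# so an empty or unset variable leaves a gap in the expression.
show_expr() {
    echo "QUAL < $MEANQUAL || DP < 10.0"
}

# Keep header lines and records whose FILTER column (column 7 in VCF) is PASS.
keep_pass() {
    awk -F'\t' '/^#/ || $7 == "PASS"'
}

MEANQUAL=50
show_expr
```

Usage would be something like keep_pass < s1.makered.raw.vcf > s1.flt.vcf, assuming the VCF is tab-delimited as the format requires.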

Comments (1)


I'm trying to call variants on WGS data using the following command (on a high-performance cluster using 4 cores per job) :

java -Xmx6G -jar $CLASSPATH -T HaplotypeCaller --dbsnp GRCh37-lite.vcf -nct 4 -R GRCh37-lite.fa -I /user/data/gent/gvo000/gvo00027/vsc40035/StJude/001/001_D.bam -maxAltAlleles 10 --genotyping_mode DISCOVERY -stand_emit_conf 10 -stand_call_conf 30 -o 001_D.vcf

$CLASSPATH contains the location of the .jar file.

For 20 out of 70 samples the script ended without a problem; for the other 50 samples the same error message was returned, given below. Can you tell what's going wrong?

Kind regards, Steve

INFO 07:28:05,514 ProgressMeter - 2:92269030 2.49e+08 3.1 h 44.0 s 11.0% 27.9 h 24.8 h
INFO 07:29:05,515 ProgressMeter - 2:92309304 2.49e+08 3.1 h 44.0 s 11.0% 28.1 h 25.0 h
INFO 07:30:05,516 ProgressMeter - 2:92320228 2.49e+08 3.1 h 44.0 s 11.0% 28.2 h 25.1 h
INFO 07:30:06,052 GATKRunReport - Uploaded run statistics report to AWS S3

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.NullPointerException
    at org.broadinstitute.sting.gatk.walkers.haplotypecaller.PairHMMLikelihoodCalculationEngine.computeDiploidHaplotypeLikelihoods(PairHMMLikelihoodCalculationEngine.java:443)
    at org.broadinstitute.sting.gatk.walkers.haplotypecaller.PairHMMLikelihoodCalculationEngine.computeDiploidHaplotypeLikelihoods(PairHMMLikelihoodCalculationEngine.java:417)
    at org.broadinstitute.sting.gatk.walkers.haplotypecaller.GenotypingEngine.calculateGLsForThisEvent(GenotypingEngine.java:385)
    at org.broadinstitute.sting.gatk.walkers.haplotypecaller.GenotypingEngine.assignGenotypeLikelihoods(GenotypingEngine.java:222)
    at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:880)
    at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:141)
    at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:708)
    at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:704)
    at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler$ReadMapReduceJob.run(NanoScheduler.java:471)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 3.1-1-g07a4bf8):
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR MESSAGE: Code exception (see stack trace for error itself)
ERROR ------------------------------------------------------------------------------------------
Comments (19)


I have used CombineVariants to combine variants from GATK and samtools as shown below:

java -jar GenomeAnalysisTK.jar -T CombineVariants -R ref.fa --variant:GatkSNP GATKsnp.vcf --variant:GatkINDEL GATKind.vcf --variant:SamSNP Samsnp.vcf --variant:SamINDEL Samind.vcf -o allvar.vcf -genotypeMergeOptions PRIORITIZE -priority GatkSNP,GatkINDEL,SamSNP,SamINDEL --filteredrecordsmergetype KEEP_UNCONDITIONAL

This merges all the variants. However, with the above command, the variants present in both GATK and samtools are emitted from samtools.

I would like to get all the variants such that:

  • variants present in both GATK and samtools emitted from GATK vcf files
  • variants in only GATK
  • variants in only samtools

Could someone suggest any ideas, or tell me if there is something to be fixed in the command?


Comments (3)

Dear GATK help team,

I have a single-chromosome file (chr 17) that I have taken through sorting, alignment, adding headers, and even RealignerTargetCreator. Yet when I try to run the indel realignment step, I receive this error:

ERROR MESSAGE: Badly formed genome loc: Parameters to GenomeLocParser are incorrect:The contig index 0 is bad, doesn't equal the contig index 17 of the contig from a string chr17

I cut out the chromosomes and processed chr17 first, since that is my region of interest and I thought it might avoid memory issues.

I am a newbie and have checked the forum for help, but found only one similar post, with no solution. Please help, I am stuck at this stage. My command for IndelRealigner is the following:

java -jar /Users/yotsukurasohiya/build/softwares/GenomeAnalysisTK-3.2-2/GenomeAnalysisTK.jar -T IndelRealigner -R /Volumes/Pegasus/broadref/ucsc.hg19.fasta -I /Volumes/Pegasus/tmp/mardup.pregatk.bam -targetIntervals 2_target_intervals.list -known /Volumes/Pegasus/broadref/Mills_and_1000G_gold_standard.indels.hg19.vcf -known /Volumes/Pegasus/broadref/dbsnp_138.hg19.vcf -known /Volumes/Pegasus/broadref/1000G_phase1.snps.high_confidence.hg19.vcf -o 2_realigned_reads.bam

The headers I added come from Picard's AddOrReplaceReadGroups: SO=coordinate CREATE_INDEX=true SM=temp PL=Illumina PU=barcode LB=bar ID=id

Comments (2)

Using GATK on command-line the CatVariants command fails.

Program version: GATK 3.1-1-g07a4bf8.

ERROR MESSAGE: Invalid command line: Malformed walker argument: Could not find walker with name: CatVariants

Code to invoke:

java -jar GenomeAnalysisTK-3.1-1/GenomeAnalysisTK.jar -T CatVariant -R file.fasta

Note that in the current documentation for CatVariants the example lists the name as 'org.broadinstitute.sting.tools.CatVariants' rather than just CatVariants. Trying the listed string fails with the same error.

Comments (1)

Below is the command:

java -cp $CLASSPATH/GenomeAnalysisTK.jar org.broadinstitute.sting.tools.CatVariants \
-R GATK_ref/hg19.fasta \
-V ../GATK/VQSR/parallel_batch/raw.snps_indels-1.vcf \
-V ../GATK/VQSR/parallel_batch/raw.snps_indels-2.vcf \
-V ../GATK/VQSR/parallel_batch/raw.snps_indels-3.vcf \
-out ../GATK/VQSR/parallel_batch/combined_raw.snps_indels.vcf \
-log ../GATK/VQSR/parallel_batch/log/combined.log

After this, the combined_raw.snps_indels.vcf file only contains the header from raw.snps_indels-1.vcf, what might be wrong?

Comments (9)

I'm running the latest GATK nightly build to process human exome-seq data (12 samples). It seemed to be faster than the older version until I ran HaplotypeCaller. The run summary shows it will take 14 days to finish. Is there anything wrong in my command below? How can I make it faster without losing data in the output?

java -Xmx10g -Djava.io.tmpdir=/temp/GATK_temp \
-jar $CLASSPATH/GenomeAnalysisTK.jar \
-T HaplotypeCaller \
-R ../GATK_ref/hg19.fasta \
-I ./compressedbam.list \
-L ../GATK_ref/hg19knownGene_UCSC_sorted.bed \
-log ../GATK/VQSR/log/HaplotypeCaller_20131018.log \
-o ../GATK/VQSR/raw.snps_indels.vcf
Comments (2)

If I put my input files as a list in the file named "input.list", how do I set the output names? or do I just need to set the output folder and the output file names will be automatically named?
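For context, a list file of this kind is just a text file with one BAM path per line. The sketch below builds one; make_bam_list is a hypothetical helper name. As far as I understand, GATK treats all the files in the list as a single combined input, so a tool that takes -o still writes one output that you name yourself, but please check this against the documentation for the tool you are running.

```shell
# Hypothetical helper: writes the paths of all .bam files under a directory
# to a list file, one path per line, in sorted order.
make_bam_list() {
    # $1 = directory to scan, $2 = list file to write
    find "$1" -name '*.bam' | sort > "$2"
}

# Example usage:
# make_bam_list ./bams input.list
# java -jar GenomeAnalysisTK.jar -T HaplotypeCaller -R ref.fasta -I input.list -o all_samples.vcf
```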

Comments (4)

I started with BWA-MEM to do alignment, used Picard to process the .SAM files (converted to bam, reorder, addorreplacegroup, etc). The GATK version I'm using is version 2.5-2-gf57256b, I cannot run 2.6 because the server only has Java 6 and I cannot upgrade it to Java 7.

I got a huge stack of error messages when I ran this command line (RealignerTargetCreator):

java -Xmx2g -jar $CLASSPATH/GenomeAnalysisTK.jar \
-T RealignerTargetCreator \
-R /Volumes/files/Users/user1/GATK_ref/hg19.fasta \
-I sorted_Deduped_reorder_grp.bam \
-o ./GATK/forIndelRealigner.intervals

The error messages are below (sorry, there are a lot). I don't know why GATK needs to connect to the window server, or what the permission problem is. I am using a remote server running Mac OS X. Thank you.

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.InternalError: Can't connect to window server - not enough permissions.
    at java.lang.ClassLoader$NativeLibrary.load(Native Method)
    at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1827)
    at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1724)
    at java.lang.Runtime.loadLibrary0(Runtime.java:823)
    at java.lang.System.loadLibrary(System.java:1045)
    at sun.security.action.LoadLibraryAction.run(LoadLibraryAction.java:50)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.awt.Toolkit.loadLibraries(Toolkit.java:1605)
    at java.awt.Toolkit.(Toolkit.java:1627)
    at sun.awt.AppContext$2.run(AppContext.java:240)
    at sun.awt.AppContext$2.run(AppContext.java:226)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.awt.AppContext.initMainAppContext(AppContext.java:226)
    at sun.awt.AppContext.access$200(AppContext.java:112)
    at sun.awt.AppContext$3.run(AppContext.java:306)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.awt.AppContext.getAppContext(AppContext.java:287)
    at com.sun.jmx.trace.Trace.out(Trace.java:180)
    at com.sun.jmx.trace.Trace.isSelected(Trace.java:88)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.isTraceOn(DefaultMBeanServerInterceptor.java:1830)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:929)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:916)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:312)
    at com.sun.jmx.mbeanserver.JmxMBeanServer$2.run(JmxMBeanServer.java:1195)
    at java.security.AccessController.doPrivileged(Native Method)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.initialize(JmxMBeanServer.java:1193)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.(JmxMBeanServer.java:225)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.(JmxMBeanServer.java:170)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.newMBeanServer(JmxMBeanServer.java:1401)
    at javax.management.MBeanServerBuilder.newMBeanServer(MBeanServerBuilder.java:93)
    at javax.management.MBeanServerFactory.newMBeanServer(MBeanServerFactory.java:311)
    at javax.management.MBeanServerFactory.createMBeanServer(MBeanServerFactory.java:214)
    at javax.management.MBeanServerFactory.createMBeanServer(MBeanServerFactory.java:175)
    at sun.management.ManagementFactory.createPlatformMBeanServer(ManagementFactory.java:302)
    at java.lang.management.ManagementFactory.getPlatformMBeanServer(ManagementFactory.java:504)
    at org.broadinstitute.sting.gatk.executive.MicroScheduler.(MicroScheduler.java:222)
    at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.(LinearMicroScheduler.java:70)
    at org.broadinstitute.sting.gatk.executive.MicroScheduler.create(MicroScheduler.java:169)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.createMicroscheduler(GenomeAnalysisEngine.java:443)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:272)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:245)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:152)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:91)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 2.5-2-gf57256b):
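The stack trace passes through java.awt.Toolkit, which suggests the JVM is trying to initialize its graphics toolkit. A workaround commonly suggested for this kind of error on remote Mac OS X machines is to run Java in headless mode using the standard JVM property java.awt.headless. I have not confirmed that it resolves this particular case, so treat the sketch below as something to try; headless_gatk_cmd is a hypothetical name and the function only prints the command.

```shell
# Hypothetical helper: prints a GATK command with the JVM in headless mode.
# Like -Xmx, the -D property is a Java argument and goes before -jar.
headless_gatk_cmd() {
    echo "java -Djava.awt.headless=true -Xmx2g -jar GenomeAnalysisTK.jar $*"
}

headless_gatk_cmd -T RealignerTargetCreator -R hg19.fasta \
    -I sorted_Deduped_reorder_grp.bam -o forIndelRealigner.intervals
```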
Comments (6)

Dear GATK Users,

Could anybody tell me how to identify deletions from a BAM file using a GATK module? I used UnifiedGenotyper, and I am getting a list like this:


gi|262 48155 . G A 80.77 . AC=1;AF=0.500;AN=2;BaseQRankSum=0.103;DP=10;Dels=0.00;FS=0.000;HaplotypeScore=0.0000;MLEAC=1;MLEAF=0.500;MQ=28.61;MQ0=0;MQRankSum=-1.453;QD=8.08;ReadPosRankSum=-0.336 GT:AD:DP:GQ:PL 0/1:5,5:10:99:109,0,146

Thanks, Sridhar

Comments (15)

Hi all: I find that among all the GATK workflows (http://www.broadinstitute.org/gatk/guide/topic?name=methods-and-workflows) there are no workflows for RNA-seq analysis. I understand that GATK mainly focuses on variant calling; can anyone tell me how to use GATK for RNA-seq analysis?

Thanks, Daniel

Comments (8)

Hello Team,

I am attempting to run GATK's PhaseByTransmission to phase a VCF file containing a father, mother, son trio, generated with the Complete Genomics mkvcf command.

After creating the ped file and running the command I generate the error: "MESSAGE: BUG: Attempted to get likelihoods as strings and neither the vector nor the string is set!". I am not exactly sure what this means.

When I check my file and the documentation I am able to see that the 'GL' field is contained in the file, but could this not be the case? I have attached a few lines from the vcf I am using.

Any help with resolving this issue would be greatly appreciated.

Thank you


Comments (8)

Hello, I'm new to GATK and Queue. I understand that we can write a QScript in Queue to generate separate GATK jobs and run them on a cluster of several nodes. Can we run GATK or Queue on Google Hadoop?

Comments (3)

I got this error message, when trying to use a file to specify at which positions to emit variants:

ERROR MESSAGE: Couldn't read file /lustre/scratch109/sanger/tc9/agv/wgs/pipeline/union4x.positions because The interval file /lustre/scratch109/sanger/tc9/agv/wgs/pipeline/union4x.positions does not have one of the supported extensions (.bed, .list, .picard, .interval_list, or .intervals). Please rename your file with the appropriate extension.

Is there a GATK page describing those 5 file formats? Some of them are unknown to me; e.g. .list.

I asked my question here, but please ignore it: http://gatkforums.broadinstitute.org/discussion/2219/l-option

Thanks a lot.

Also, the error message does not mention support for vcf files, but the documentation does. Are vcf files supported?
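A quick pre-flight check against the extensions listed in the error message above can be sketched like this. The function name has_supported_ext is hypothetical, and the list is taken from that one error message only, so it may not be complete for every GATK version (the question about VCF files stays open).

```shell
# Hypothetical check: succeeds if the file name ends in one of the interval
# file extensions named in the error message above.
has_supported_ext() {
    case "$1" in
        *.bed|*.list|*.picard|*.interval_list|*.intervals) return 0 ;;
        *) return 1 ;;
    esac
}

has_supported_ext union4x.positions || echo "union4x.positions: extension not in the supported list"
```

Note that renaming only changes what format GATK expects; the file contents still have to match that format.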

Comments (5)

Hi all! I'm trying to complete my first GATK run, following the steps in the "EXECUTION STEP" section below.

Please tell me if the steps are globally correct.


Step 4.1 does not run without -maxCycle 1500.

When I try to execute step 4.2, I get the following error:

ERROR stack trace

org.broadinstitute.sting.utils.exceptions.ReviewedStingException: Key 1036 is too large for dimension 2 (max is 1001)
    at org.broadinstitute.sting.utils.collections.NestedIntegerArray.put(NestedIntegerArray.java:128)
    at org.broadinstitute.sting.utils.recalibration.RecalibrationReport.parseAllCovariatesTable(RecalibrationReport.java:157)
    at org.broadinstitute.sting.utils.recalibration.RecalibrationReport.(RecalibrationReport.java:68)
    at org.broadinstitute.sting.utils.recalibration.BaseRecalibration.(BaseRecalibration.java:74)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.setBaseRecalibration(GenomeAnalysisEngine.java:217)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:253)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:237)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:147)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:91)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 2.3-9-ge5ebf34):
ERROR Please visit the wiki to see if this is a known problem
ERROR If not, please post the error, with stack trace, to the GATK forum
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR MESSAGE: Key 1036 is too large for dimension 2 (max is 1001)
ERROR ------------------------------------------------------------------------------------------

---------------------------------------------------------------EXECUTION STEP---------------------------------------------------------


java -Xmx4g -Djava.io.tmpdir=/tmp -jar MarkDuplicates.jar INPUT=M9.bam OUTPUT=m9.marked.bam METRICS_FILE=metrics CREATE_INDEX=true VALIDATION_STRINGENCY=LENIENT



java -Xmx4g -jar GenomeAnalysisTK.jar -T RealignerTargetCreator -R ucsc.hg19.fasta -known dbsnp_137.hg19.vcf -o m9.list -I m9.marked.bam


java -Xmx4g -Djava.io.tmpdir=/tmp -jar GenomeAnalysisTK.jar -I m9.marked.bam -R ucsc.hg19.fasta -T IndelRealigner -targetIntervals m9.list -known dbsnp_137.hg19.vcf -o m9.marked.realigned.bam


java -Xmx4g -jar GenomeAnalysisTK.jar -R ucsc.hg19.fasta -T ReduceReads -I m9.marked.realigned.bam -o m9.marked.realigned.reduce.bam


java -Djava.io.tmpdir=/tmp/flx-auswerter -Xmx4g -jar FixMateInformation.jar INPUT=m9.marked.realigned.reduce.bam OUTPUT=m9.marked.realigned.reduce.fixed.bam SO=coordinate VALIDATION_STRINGENCY=LENIENT CREATE_INDEX=true



java -Xmx4g -jar GenomeAnalysisTK.jar -l INFO -R ucsc.hg19.fasta -knownSites dbsnp_137.hg19.vcf -I m9.marked.realigned.reduce.fixed.bam -T BaseRecalibrator -maxCycle 1500 -cov ReadGroupCovariate -cov QualityScoreCovariate -o m9.recal_data.grp

4.2 ***********************************

java -Xmx4g -jar GenomeAnalysisTK.jar -T PrintReads -R ucsc.hg19.fasta -I m9.marked.realigned.reduce.fixed.bam -BQSR m9.recal_data.grp -o m9.marked.realigned.reduce.fixed.recal.bam



java -Xmx4g -jar GenomeAnalysisTK.jar -nct 4 --num_threads 4 -glm BOTH -R ucsc.hg19.fasta -T UnifiedGenotyper --sample_ploidy 5 -I m9.marked.realigned.reduce.fixed.bam -D dbsnp_137.hg19.vcf -o m9.vcf -stand_call_conf 20.0 -stand_emit_conf 20.0 \
-A DepthOfCoverage -A AlleleBalance

Comments (1)

Hi, when I run BaseRecalibrator with the following command:

java -Xmx4g -jar /usr/bin/GenomeAnalysisTK.jar -T BaseRecalibrator -I realignedBam.bam  -R /data1/human_g1k_v37.fasta --knownSites /data1/snp132.vcf -o recalibration_report.grp

I get the following error :

INFO  07:15:53,380 HttpMethodDirector - I/O exception (javax.net.ssl.SSLException) caught when processing request: Unrecognized SSL message, plaintext connection? 
INFO  07:15:53,380 HttpMethodDirector - Retrying request 
INFO  07:15:53,386 HttpMethodDirector - I/O exception (javax.net.ssl.SSLException) caught when processing request: Unrecognized SSL message, plaintext connection? 
INFO  07:15:53,387 HttpMethodDirector - Retrying request 
INFO  07:15:53,393 HttpMethodDirector - I/O exception (javax.net.ssl.SSLException) caught when processing request: Unrecognized SSL message, plaintext connection? 
INFO  07:15:53,393 HttpMethodDirector - Retrying request 
INFO  07:15:53,398 HttpMethodDirector - I/O exception (javax.net.ssl.SSLException) caught when processing request: Unrecognized SSL message, plaintext connection? 
INFO  07:15:53,398 HttpMethodDirector - Retrying request 
INFO  07:15:53,405 HttpMethodDirector - I/O exception (javax.net.ssl.SSLException) caught when processing request: Unrecognized SSL message, plaintext connection? 
INFO  07:15:53,405 HttpMethodDirector - Retrying request 
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A USER ERROR has occurred (version 2.0-34-g07bda93): 
##### ERROR The invalid arguments or inputs must be corrected before the GATK can proceed
##### ERROR Please do not post this error to the GATK forum
##### ERROR
##### ERROR See the documentation (rerun with -h) for this tool to view allowable command-line arguments.
##### ERROR Visit our website and forum for extensive documentation and answers to 
##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
##### ERROR
##### ERROR MESSAGE: Invalid command line: No tribble type was provided on the command line and the type of the file could not be determined dynamically. Please add an explicit type tag :NAME listing the correct type from among the supported types:
##### ERROR          Name        FeatureType   Documentation
##### ERROR          BCF2     VariantContext   http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_utils_codecs_bcf2_BCF2Codec.html
##### ERROR        BEAGLE      BeagleFeature   http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_utils_codecs_beagle_BeagleCodec.html
##### ERROR           BED         BEDFeature   http://www.broadinstitute.org/gatk/gatkdocs/org_broad_tribble_bed_BEDCodec.html
##### ERROR      BEDTABLE       TableFeature   http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_utils_codecs_table_BedTableCodec.html
##### ERROR EXAMPLEBINARY            Feature   http://www.broadinstitute.org/gatk/gatkdocs/org_broad_tribble_example_ExampleBinaryCodec.html
##### ERROR      GELITEXT    GeliTextFeature   http://www.broadinstitute.org/gatk/gatkdocs/org_broad_tribble_gelitext_GeliTextCodec.html
##### ERROR      OLDDBSNP    OldDbSNPFeature   http://www.broadinstitute.org/gatk/gatkdocs/org_broad_tribble_dbsnp_OldDbSNPCodec.html
##### ERROR     RAWHAPMAP   RawHapMapFeature   http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_utils_codecs_hapmap_RawHapMapCodec.html
##### ERROR        REFSEQ      RefSeqFeature   http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_utils_codecs_refseq_RefSeqCodec.html
##### ERROR     SAMPILEUP   SAMPileupFeature   http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_utils_codecs_sampileup_SAMPileupCodec.html
##### ERROR       SAMREAD     SAMReadFeature   http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_utils_codecs_samread_SAMReadCodec.html
##### ERROR         TABLE       TableFeature   http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_utils_codecs_table_TableCodec.html
##### ERROR           VCF     VariantContext   http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_utils_codecs_vcf_VCFCodec.html
##### ERROR          VCF3     VariantContext   http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_utils_codecs_vcf_VCF3Codec.html
##### ERROR ------------------------------------------------------------------------------------------
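The error message asks for an explicit type tag of the form :NAME on the argument. Assuming the file really is a VCF, the tagged form of the argument could be built as in the sketch below. The helper name tagged_arg is hypothetical, the command is only printed, and I have not verified that this resolves the error for this particular file.

```shell
# Hypothetical helper: joins an argument name and a type tag with ":" and
# appends the file path, e.g. --knownSites:VCF /path/to/file.vcf
tagged_arg() {
    # $1 = argument name, $2 = type tag, $3 = file path
    printf '%s:%s %s' "$1" "$2" "$3"
}

echo "java -Xmx4g -jar /usr/bin/GenomeAnalysisTK.jar -T BaseRecalibrator -I realignedBam.bam -R /data1/human_g1k_v37.fasta $(tagged_arg --knownSites VCF /data1/snp132.vcf) -o recalibration_report.grp"
```

If GATK cannot work out the type of a file named .vcf on its own, it may also be worth checking that the file starts with a valid ##fileformat header line.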
Comments (1)

Hi, I have a strange problem with the GATK. Every time I try to run it, my Console shows the following error message.

30.10.12 15:10:06   [0x0-0xe20e2].com.apple.JarLauncher[1114]   ##### ERROR ------------------------------------------------------------------------------------------
30.10.12 15:10:06   [0x0-0xe20e2].com.apple.JarLauncher[1114]   ##### ERROR A USER ERROR has occurred (version 2.1-13-g0f021e6): 
30.10.12 15:10:06   [0x0-0xe20e2].com.apple.JarLauncher[1114]   ##### ERROR The invalid arguments or inputs must be corrected before the GATK can proceed
30.10.12 15:10:06   [0x0-0xe20e2].com.apple.JarLauncher[1114]   ##### ERROR Please do not post this error to the GATK forum
30.10.12 15:10:06   [0x0-0xe20e2].com.apple.JarLauncher[1114]   ##### ERROR
30.10.12 15:10:06   [0x0-0xe20e2].com.apple.JarLauncher[1114]   ##### ERROR See the documentation (rerun with -h) for this tool to view allowable command-line arguments.
30.10.12 15:10:06   [0x0-0xe20e2].com.apple.JarLauncher[1114]   ##### ERROR Visit our website and forum for extensive documentation and answers to 
30.10.12 15:10:06   [0x0-0xe20e2].com.apple.JarLauncher[1114]   ##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
30.10.12 15:10:06   [0x0-0xe20e2].com.apple.JarLauncher[1114]   ##### ERROR
30.10.12 15:10:06   [0x0-0xe20e2].com.apple.JarLauncher[1114]   ##### ERROR MESSAGE: Argument with name '--analysis_type' (-T) is missing.
30.10.12 15:10:06   [0x0-0xe20e2].com.apple.JarLauncher[1114]   ##### ERROR ------------------------------------------------------------------------------------------

Can you show me my mistakes, please? With regards, Oliver
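The error says that -T (--analysis_type) was not supplied, which is what happens when the jar is started without any GATK arguments. A small pre-flight check for the two universal arguments could look like the sketch below. The name has_required_args is hypothetical, the long name --reference_sequence is given from memory, and the check ignores the few utilities that do not need -T and -R.

```shell
# Hypothetical check: succeeds only if both a tool name (-T) and a
# reference (-R) appear among the arguments.
has_required_args() {
    seen_T=no
    seen_R=no
    for arg in "$@"; do
        case "$arg" in
            -T|--analysis_type) seen_T=yes ;;
            -R|--reference_sequence) seen_R=yes ;;
        esac
    done
    [ "$seen_T" = yes ] && [ "$seen_R" = yes ]
}

has_required_args -R human_b37.fasta || echo "missing -T or -R"
```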

Comments (2)


While running the DepthOfCoverage tool, I get the error: /tmp/RsQHCt1W: No space left on device

I have tried changing the TMPDIR environment variable (and exporting) but eventually I get the same error. Is there a way to change the temporary directory that GATK uses?

I'm running GATK v2.1-8-g5efb575 on a Linux system.

Thanks, Rick
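Other commands on this page pass the Java system property -Djava.io.tmpdir between java and -jar to point the JVM at a different temporary directory. A minimal sketch of that approach follows; gatk_with_tmpdir is a hypothetical helper, /scratch/gatk_tmp is a placeholder path, and the function only prints the command after checking that the directory is usable.

```shell
# Hypothetical helper: prints a GATK command that uses the given temporary
# directory, after checking that the directory exists and is writable.
gatk_with_tmpdir() {
    tmpdir=$1
    shift
    if [ ! -d "$tmpdir" ] || [ ! -w "$tmpdir" ]; then
        echo "temporary directory $tmpdir is missing or not writable" >&2
        return 1
    fi
    echo "java -Djava.io.tmpdir=$tmpdir -jar GenomeAnalysisTK.jar $*"
}

# Example usage (assumes /scratch/gatk_tmp exists on your system):
# gatk_with_tmpdir /scratch/gatk_tmp -T DepthOfCoverage -R ucsc_hg19.fa
```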