Tagged with #parallel
0 documentation articles | 0 announcements | 2 forum discussions


Comments (6)

Trying to run

java -jar $GATKJAR -R $REF -T UnifiedGenotyper -I file1.bam -I file2.bam -I file3.bam -glm BOTH -o output.vcf.gz

gives an error like:

 ##### ERROR ------------------------------------------------------------------------------------------
 ##### ERROR A USER ERROR has occurred (version 2.4-9-g532efad): 
 ##### ERROR The invalid arguments or inputs must be corrected before the GATK can proceed
 ##### ERROR Please do not post this error to the GATK forum
 ##### ERROR
 ##### ERROR See the documentation (rerun with -h) for this tool to view allowable command-line arguments.
 ##### ERROR Visit our website and forum for extensive documentation and answers to 
 ##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
 ##### ERROR
 ##### ERROR MESSAGE: There was a failure because temporary file /tmp/org.broadinstitute.sting.gatk.io.stubs.VariantContextWriterStub1033673347640679118.tmp could not be found while running the GATK with more than one thread.  Possible causes for this problem include: your system's open file handle limit is too small, your output or temp directories do not have sufficient space, or just an isolated file system blip
 ##### ERROR ------------------------------------------------------------------------------------------

The file is actually there, and is gzip-compressed and vcf-formatted.

However, if I specify -o output.vcf instead of -o output.vcf.gz, then everything works. I suspect the problem is with the autodetection of the codec. In VariantContextWriterStorage, LocalParallelizationProblem is thrown not only if the tmp file cannot be found, but whenever a FeatureDescriptor cannot be found for the file.

So it seems that compressed output cannot be written when UnifiedGenotyper runs with more than one thread. Is my assessment correct?

  1. A better error message would be helpful to prevent others from trying the same thing I did.
  2. It would be nice to be able to write compressed output from a threaded UnifiedGenotyper. Perhaps (a) the temp file could be written uncompressed even though the final file will be compressed, or (b) the codec detection could recognize gzip-compressed files.
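
Until something like that exists, one possible workaround follows from the observation above that -o output.vcf works: write the VCF uncompressed and compress it once the run has finished. This is only a minimal sketch, assuming gzip is available on the system; compress_vcf is a hypothetical helper name, not part of the GATK.

```shell
# Hypothetical helper (not part of the GATK): gzip-compress a finished VCF.
# Refuses to run on a missing or empty file so a failed run is not hidden.
compress_vcf() {
    vcf="$1"
    if [ ! -s "$vcf" ]; then
        echo "compress_vcf: missing or empty file: $vcf" >&2
        return 1
    fi
    # Write the compressed copy first; remove the original only on success.
    gzip -c "$vcf" > "$vcf.gz" && rm -- "$vcf"
}

# Usage with the command from this post, changed only to write uncompressed output:
# java -jar $GATKJAR -R $REF -T UnifiedGenotyper -I file1.bam -I file2.bam -I file3.bam -glm BOTH -o output.vcf \
#     && compress_vcf output.vcf
```

The result is plain gzip output. Whether downstream tools need block-compressed (bgzip) files instead is a separate question that this sketch does not address.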
Comments (1)

I'm working with ReduceReads and would like to run it in some kind of parallel mode. The presentation mentions that a 50-way run may drastically reduce run time, but I'm not sure how to invoke this. I tried -nt and it complained. Should I be giving it multiple intervals and merging the results? If so, how does it deal with variants at the interval edges?

Thanks.