When trying to run the UnifiedGenotyper, I keep getting the following
ERROR MESSAGE: There was a failure because temporary file /tmp/org.broadinstitute.sting.gatk.io.stubs.VariantContextWriterStub7040175770023361502.tmp could not be found while running the GATK with more than one thread. Possible causes for this problem include: your system's open file handle limit is too small, your output or temp directories do not have sufficient space, or just an isolated file system blip
About the suggested causes: * The system's open file handle limit is set to 4827982, I doubt that this is exhausted. * On the /tmp/ file system, there is 200GB of free space; each GATK run seems to use less that 1/1000 of that. * I have no idea what a file system blip would be, but apparently it occurs every time I run the UnifiedGenotyper. Any idea why this would be, and how it could be avoided?
Or could there be still a different reason for the error?
Thanks, Alex
Trying to run
java -jar $GATKJAR -R $REF -T UnifiedGenotyper -I file1.bam -I file2.bam -I file3.bam -glm BOTH -o output.vcf.gz
gives an error like:
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A USER ERROR has occurred (version 2.4-9-g532efad):
##### ERROR The invalid arguments or inputs must be corrected before the GATK can proceed
##### ERROR Please do not post this error to the GATK forum
##### ERROR
##### ERROR See the documentation (rerun with -h) for this tool to view allowable command-line arguments.
##### ERROR Visit our website and forum for extensive documentation and answers to
##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
##### ERROR
##### ERROR MESSAGE: There was a failure because temporary file /tmp/org.broadinstitute.sting.gatk.io.stubs.VariantContextWriterStub1033673347640679118.tmp could not be found while running the GATK with more than one thread. Possible causes for this problem include: your system's open file handle limit is too small, your output or temp directories do not have sufficient space, or just an isolated file system blip
##### ERROR ------------------------------------------------------------------------------------------
The file is actually there, and is gzip-compressed and vcf-formatted.
However, if I specify -o output.vcf instead of -o output.vcf.gz, then everything works. I suspect the problem is with the autodetection of the codec. In VariantContextWriterStorage, LocalParallelizationProblem is thrown not only if the tmp file cannot be found, but whenever a FeatureDescriptor cannot be found for the file.
So... It seems like compressed output cannot be used from threaded processing with UnifiedGenotyper. Is my assessment correct?