I am running GATK on a shared LSF cluster and have unintentionally run afoul of the admins over resource use. I run the GATK commands with 'java -Xmx4g -jar $GATK' and do not use -nt or -nct, yet CPU usage exceeds the resources I requested (1 vCore and 6 GB of memory). I am now running the same commands on our lab server and do indeed see CPU usage reach as high as 800% at times. Is there a way to limit the CPU resources claimed by GATK tools, or alternatively a way to anticipate their actual needs so that I can request an appropriate number of CPUs?
Much thanks, Erik
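A possible explanation, not confirmed anywhere in this thread: even without -nt or -nct, the JVM itself can start several garbage-collection threads (by default scaled to the number of cores it sees), which could account for CPU usage well above 100%. HotSpot's -XX:ParallelGCThreads flag caps that number, and the Queue-generated command quoted later in this thread also sets it. The sketch below only prints the resulting command so it can be inspected before submission; gatk_cmd and the default jar path are placeholders for illustration, not part of GATK.

```shell
#!/bin/sh
# Placeholder jar location (assumption): falls back to a dummy path if $GATK is unset.
GATK_JAR="${GATK:-/path/to/GenomeAnalysisTK.jar}"

# Hypothetical helper: print a GATK command line with the JVM's GC threads capped.
#   $1      number of GC threads to allow (1 matches a 1-vCore request)
#   $2...   arguments passed through to GATK unchanged
gatk_cmd() {
    gc_threads="$1"
    shift
    echo "java -Xmx4g -XX:ParallelGCThreads=${gc_threads} -jar ${GATK_JAR} $*"
}

# Example: inspect the command for a single-threaded CountReads run.
gatk_cmd 1 -T CountReads -R ref.fasta -I in.bam
```

If the printed command looks right, it can be pasted into the bsub submission. Whether this brings usage down to a single core depends on the tool and JVM version, so it is worth confirming with top on the lab server first.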
This is not a question, per se - I suppose it's more of an observation.
We recently upgraded LSF on one of our clusters to v9.0.1, and quickly discovered that Queue can't submit jobs. The reaction was rather violent - the entire JVM crashed, and the stack trace showed it dying in lsb_submit(). We downgraded LSF to v8.3.0, and everything is working fine (so far).
I know Queue is compiled against the LSF v7.0.6 API; it would appear that it is not binary-compatible with LSF 9.x.
Hope this helps others in the future...
I had no problem running GATK two weeks ago, but today, when I ran the following GATK command, I got an error message. It seems it cannot load the library "liblsf.so". Please see below. Has there been any recent change to the GATK libraries?
2:15pm qyu@vbronze /bit/data01/projects/GC_coverage $ java -Xmx4g -Djava.io.tmpdir=/broad/hptmp/vdauwera -jar /humgen/gsa-scr1/vdauwera/gatk/walker/dist/Queue.jar -S /bit/data01/projects/GC_coverage/src/IntervalCovGG.scala -i /humgen/gsa-hpprojects/GATK/data/genes_of_interest.exon_targets.interval_list -b /bit/data01/projects/GC_coverage/testGCcoverage/data/DEV-2129.list -o /bit/data01/projects/GC_coverage/testGCcoverage/DEV-2129.gatkreport -m 16 -sc 100 -bsub -jobQueue priority -startFromScratch -run
The error shows:
ERROR 14:13:06,238 QGraph - Uncaught error running jobs.
java.lang.UnsatisfiedLinkError: Unable to load library 'lsf': liblsf.so: cannot open shared object file: No such file or directory
    at com.sun.jna.NativeLibrary.loadLibrary(NativeLibrary.java:163)
    at com.sun.jna.NativeLibrary.getInstance(NativeLibrary.java:236)
    at com.sun.jna.NativeLibrary.getInstance(NativeLibrary.java:199)
    at org.broadinstitute.sting.jna.lsf.v7_0_6.LibBat.<clinit>(LibBat.java:90)
    at org.broadinstitute.sting.queue.engine.lsf.Lsf706JobRunner$.<init>(Lsf706JobRunner.scala:233)
    at org.broadinstitute.sting.queue.engine.lsf.Lsf706JobRunner$.<clinit>(Lsf706JobRunner.scala)
    at org.broadinstitute.sting.queue.engine.lsf.Lsf706JobRunner.<init>(Lsf706JobRunner.scala:47)
    at org.broadinstitute.sting.queue.engine.lsf.Lsf706JobManager.create(Lsf706JobManager.scala:35)
    at org.broadinstitute.sting.queue.engine.lsf.Lsf706JobManager.create(Lsf706JobManager.scala:33)
    at org.broadinstitute.sting.queue.engine.QGraph.newRunner(QGraph.scala:632)
    at org.broadinstitute.sting.queue.engine.QGraph.runJobs(QGraph.scala:408)
    at org.broadinstitute.sting.queue.engine.QGraph.run(QGraph.scala:131)
    at org.broadinstitute.sting.queue.QCommandLine.execute(QCommandLine.scala:127)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:236)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:146)
    at org.broadinstitute.sting.queue.QCommandLine$.main(QCommandLine.scala:62)
    at org.broadinstitute.sting.queue.QCommandLine.main(QCommandLine.scala)
Exception in thread "main" java.lang.UnsatisfiedLinkError: Unable to load library 'lsf': liblsf.so: cannot open shared object file: No such file or directory
    at com.sun.jna.NativeLibrary.loadLibrary(NativeLibrary.java:163)
    at com.sun.jna.NativeLibrary.getInstance(NativeLibrary.java:236)
    at com.sun.jna.NativeLibrary.getInstance(NativeLibrary.java:199)
    at org.broadinstitute.sting.jna.lsf.v7_0_6.LibBat.<clinit>(LibBat.java:9 .........
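The UnsatisfiedLinkError above means the dynamic loader could not find liblsf.so when JNA asked for it. On Linux that often comes down to the LSF library directory not being on LD_LIBRARY_PATH in the shell that launches Queue, for example because the LSF environment was not set up on that host. This is a guess at the cause rather than a confirmed diagnosis. A minimal sketch for checking it, assuming a POSIX shell; find_lib is a hypothetical helper, and it inspects only the path list it is given, not the system default directories or ld.so.conf entries.

```shell
#!/bin/sh
# Hypothetical helper: print the first directory in a colon-separated
# path list ($2) that contains the file named in $1.
# Returns 0 if found, 1 otherwise.
find_lib() {
    lib="$1"
    path="$2"
    old_ifs="$IFS"
    IFS=:
    for dir in $path; do
        if [ -n "$dir" ] && [ -e "$dir/$lib" ]; then
            IFS="$old_ifs"
            echo "$dir"
            return 0
        fi
    done
    IFS="$old_ifs"
    return 1
}

# Check the current environment for liblsf.so.
if dir=$(find_lib liblsf.so "${LD_LIBRARY_PATH:-}"); then
    echo "liblsf.so found in $dir"
else
    echo "liblsf.so is not in any LD_LIBRARY_PATH directory"
fi
```

If the library turns up missing, comparing LD_LIBRARY_PATH with a host where Queue still works is a quick way to see what changed.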
I am testing Queue scripts with a newly installed LSF v8.3. The test script is:
java -Djava.io.tmpdir=/tmp -jar jar2216/Queue.jar -S Queue-2.2-16-g9f648cb/resources/ExampleCountReads.scala -R Queue-2.2-16-g9f648cb/resources/exampleFASTA.fasta -I Queue-2.2-16-g9f648cb/resources/exampleBAM.bam --bsub -run
and I get the following error message:
'java' '-Xmx1024m' '-XX:+UseParallelOldGC' '-XX:ParallelGCThreads=4' '-XX:GCTimeLimit=50' '-XX:GCHeapFreeLimit=10' '-Djava.io.tmpdir=/data/cmb/wxing/gatk/.queue/tmp' '-cp' '/data/cmb/wxing/gatk/jar2216/Queue.jar' 'org.broadinstitute.sting.gatk.CommandLineGATK' '-T' 'CountReads' '-I' '/data/cmb/wxing/gatk/Queue-2.2-16-g9f648cb/resources/exampleBAM.bam' '-R' '/data/cmb/wxing/gatk/Queue-2.2-16-g9f648cb/resources/exampleFASTA.fasta'
java.lang.UnsatisfiedLinkError: Error looking up function 'ls_getLicenseUsage': /usr/local/lsf/8.3/linux2.6-glibc2.3-x86_64/lib/liblsf.so: undefined symbol: ls_getLicenseUsage
Any clues about the error "java.lang.UnsatisfiedLinkError: Error looking up function 'ls_getLicenseUsage': /usr/local/lsf/8.3/linux2.6-glibc2.3-x86_64/lib/liblsf.so: undefined symbol: ls_getLicenseUsage"? Has anyone had similar problems?
Could it be our LSF version (v8.3), since the code seems to be based on version 7.0.6?
Many thanks, Wei
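One way to test the version hypothesis is to check whether the installed liblsf.so actually exports the symbol that the bindings try to look up. A sketch, assuming binutils' nm is available on the host; exports_symbol is a hypothetical helper that reads nm-style output on stdin and compares the last field of each line against the requested name. A missing symbol would be consistent with the v7.0.6 bindings calling a function that this LSF 8.3 build does not provide, though it would not rule out other incompatibilities.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the symbol named in $1 appears as the
# last field of any line of nm-style output on stdin.
# Any "@version" suffix on a symbol name is stripped before comparing.
exports_symbol() {
    awk '{print $NF}' | sed 's/@.*//' | grep -qxF -- "$1"
}

# Library path copied from the error message above.
LIB=/usr/local/lsf/8.3/linux2.6-glibc2.3-x86_64/lib/liblsf.so

# Only run the check where the library is actually present.
if [ -r "$LIB" ]; then
    if nm -D --defined-only "$LIB" | exports_symbol ls_getLicenseUsage; then
        echo "ls_getLicenseUsage is exported by $LIB"
    else
        echo "ls_getLicenseUsage is NOT exported by $LIB"
    fi
fi
```

Running the same check against the liblsf.so from an LSF 7.0.6 installation, if one is still around, would show whether the symbol was dropped between versions.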