I've noticed a change in the log files that now shows a period of "starting" before actually genotyping. This can be hours...or minutes... I looked in previous logs and saw that there aren't such line in the log, but the clock jumps a similar amount of time. I was wondering if this is a known phenomenon and what it the GATK doing during that time. Is there anything we can do to reduce that waiting period?
Here are examples:
old (calling with 4500 samples, glm BOTH):
INFO 12:21:37,515 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 1157.62 INFO 12:26:52,780 RMDTrackBuilder - Loading Tribble index from disk for file /seq/references/Homo_sapiens_assembly19/v1/Homo_sapiens_assembly19.dbsnp.vcf WARN 12:26:54,098 VCFStandardHeaderLines$Standards - Repairing standard header line for field AF because -- count types disagree; header has UNBOUNDED but standard is A INFO 12:26:54,457 GenomeAnalysisEngine - Processing 43138 bp from intervals INFO 12:26:54,473 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING] INFO 12:26:54,473 ProgressMeter - Location processed.sites runtime per.1M.sites completed total.runtime remaining INFO 21:37:08,951 ProgressMeter - 17:26944207 0.00e+00 9.2 h 54587.4 w 0.0% Infinity w Infinity w INFO 21:40:40,002 ProgressMeter - 17:26945957 1.82e+02 9.2 h 301.8 w 0.9% 6.2 w 6.1 w INFO 21:41:53,697 ProgressMeter - 17:26946820 6.49e+02 9.2 h 84.8 w 1.5% 3.6 w 3.5 w
and here newer (calling 1500 sample, glm SNP):
INFO 18:42:42,025 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 24.55 INFO 18:42:47,648 RMDTrackBuilder - Loading Tribble index from disk for file /seq/references/Homo_sapiens_assembly19/v1/Homo_sapiens_assembly19.dbsnp.vcf WARN 18:42:47,972 VCFStandardHeaderLines$Standards - Repairing standard header line for field AF because -- count types disagree; header has UNBOUNDED but standard is A INFO 18:42:48,192 GenomeAnalysisEngine - Processing 43138 bp from intervals INFO 18:42:48,206 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING] INFO 18:42:48,206 ProgressMeter - Location processed.sites runtime per.1M.sites completed total.runtime remaining INFO 18:43:18,301 ProgressMeter - starting 0.00e+00 30.1 s 49.8 w 100.0% 30.1 s 0.0 s INFO 18:43:48,310 ProgressMeter - starting 0.00e+00 60.1 s 99.4 w 100.0% 60.1 s 0.0 s INFO 18:44:18,344 ProgressMeter - starting 0.00e+00 90.1 s 149.0 w 100.0% 90.1 s 0.0 s INFO 18:44:48,353 ProgressMeter - starting 0.00e+00 2.0 m 198.7 w 100.0% 2.0 m 0.0 s INFO 18:45:18,363 ProgressMeter - starting 0.00e+00 2.5 m 248.3 w 100.0% 2.5 m 0.0 s INFO 18:45:48,373 ProgressMeter - starting 0.00e+00 3.0 m 297.9 w 100.0% 3.0 m 0.0 s INFO 18:46:18,382 ProgressMeter - starting 0.00e+00 3.5 m 347.5 w 100.0% 3.5 m 0.0 s INFO 18:46:48,391 ProgressMeter - starting 0.00e+00 4.0 m 397.1 w 100.0% 4.0 m 0.0 s INFO 18:47:18,401 ProgressMeter - starting 0.00e+00 4.5 m 446.7 w 100.0% 4.5 m 0.0 s INFO 18:47:48,411 ProgressMeter - starting 0.00e+00 5.0 m 496.4 w 100.0% 5.0 m 0.0 s INFO 18:48:18,426 ProgressMeter - starting 0.00e+00 5.5 m 546.0 w 100.0% 5.5 m 0.0 s INFO 18:48:48,436 ProgressMeter - starting 0.00e+00 6.0 m 595.6 w 100.0% 6.0 m 0.0 s INFO 18:49:18,446 ProgressMeter - starting 0.00e+00 6.5 m 645.2 w 100.0% 6.5 m 0.0 s INFO 18:49:48,455 ProgressMeter - starting 0.00e+00 7.0 m 694.9 w 100.0% 7.0 m 0.0 s INFO 18:50:18,465 ProgressMeter - starting 0.00e+00 7.5 m 744.5 w 100.0% 7.5 m 0.0 s INFO 18:50:48,475 ProgressMeter - starting 0.00e+00 8.0 m 794.1 w 100.0% 8.0 m 0.0 s INFO 18:51:18,484 ProgressMeter - starting 0.00e+00 8.5 m 843.7 w 100.0% 8.5 m 0.0 s INFO 18:51:48,510 ProgressMeter - starting 0.00e+00 9.0 m 893.4 w 100.0% 9.0 m 0.0 s INFO 18:52:18,520 ProgressMeter - starting 0.00e+00 9.5 m 943.0 w 100.0% 9.5 m 0.0 s INFO 18:52:48,530 ProgressMeter - starting 0.00e+00 10.0 m 992.6 w 100.0% 10.0 m 0.0 s INFO 18:53:18,541 ProgressMeter - starting 0.00e+00 10.5 m 1042.2 w 100.0% 10.5 m 0.0 s INFO 18:53:48,552 ProgressMeter - starting 0.00e+00 11.0 m 1091.8 w 100.0% 11.0 m 0.0 s INFO 18:54:18,561 ProgressMeter - starting 0.00e+00 11.5 m 1141.5 w 100.0% 11.5 m 0.0 s INFO 18:54:48,570 ProgressMeter - starting 0.00e+00 12.0 m 1191.1 w 100.0% 12.0 m 0.0 s INFO 18:55:18,579 ProgressMeter - starting 0.00e+00 12.5 m 1240.7 w 100.0% 12.5 m 0.0 s INFO 18:55:48,590 ProgressMeter - 18:12725582 1.00e+01 13.0 m 129.0 w 0.5% 45.4 h 45.2 h
I've also noticed that output seems to be quite wrong about how much work it has left to do...perhaps these two things are related?
I would be grateful for any information.
INFO 17:40:38,765 ProgressMeter - chrX:147886444 3.21e+07 5.5 h 10.3 m 97.9% 5.6 h 7.1 m INFO 17:41:08,775 ProgressMeter - chrX:151653849 3.21e+07 5.5 h 10.3 m 98.0% 5.6 h 6.7 m INFO 17:41:38,785 ProgressMeter - chrX:153812787 3.22e+07 5.5 h 10.3 m 98.1% 5.6 h 6.5 m
java.lang.NoClassDefFoundError: net/sf/samtools/util/CloserUtil
at net.sf.picard.util.PeekableIterator.close(PeekableIterator.java:46)
at net.sf.picard.sam.MergingSamRecordIterator.addIfNotEmpty(MergingSamRecordIterator.java:169)
at net.sf.picard.sam.MergingSamRecordIterator.next(MergingSamRecordIterator.java:125)
at net.sf.picard.sam.MergingSamRecordIterator.next(MergingSamRecordIterator.java:39)
at org.broadinstitute.sting.gatk.iterators.PrivateStringSAMCloseableIterator.next(StingSAMIteratorAdapter.java:100)
at org.broadinstitute.sting.gatk.iterators.PrivateStringSAMCloseableIterator.next(StingSAMIteratorAdapter.java:84)
at org.broadinstitute.sting.gatk.datasources.reads.SAMDataSource$ReleasingIterator.next(SAMDataSource.java:1091)
at org.broadinstitute.sting.gatk.datasources.reads.SAMDataSource$ReleasingIterator.next(SAMDataSource.java:1057)
at org.broadinstitute.sting.gatk.filters.CountingFilteringIterator.getNextRecord(CountingFilteringIterator.java:105)
at org.broadinstitute.sting.gatk.filters.CountingFilteringIterator.next(CountingFilteringIterator.java:81)
at org.broadinstitute.sting.gatk.filters.CountingFilteringIterator.next(CountingFilteringIterator.java:41)
at org.broadinstitute.sting.gatk.iterators.PrivateStringSAMCloseableIterator.next(StingSAMIteratorAdapter.java:100)
at org.broadinstitute.sting.gatk.iterators.PrivateStringSAMCloseableIterator.next(StingSAMIteratorAdapter.java:84)
at org.broadinstitute.sting.gatk.iterators.VerifyingSamIterator.next(VerifyingSamIterator.java:24)
at org.broadinstitute.sting.gatk.iterators.VerifyingSamIterator.next(VerifyingSamIterator.java:12)
at org.broadinstitute.sting.utils.baq.ReadTransformingIterator.next(ReadTransformingIterator.java:36)
at org.broadinstitute.sting.utils.baq.ReadTransformingIterator.next(ReadTransformingIterator.java:15)
at net.sf.picard.util.PeekableIterator.advance(PeekableIterator.java:71)
at net.sf.picard.util.PeekableIterator.next(PeekableIterator.java:57)
at org.broadinstitute.sting.gatk.datasources.reads.ReadShard.fill(ReadShard.java:135)
at org.broadinstitute.sting.gatk.datasources.reads.ReadShardBalancer$1.advance(ReadShardBalancer.java:153)
at org.broadinstitute.sting.gatk.datasources.reads.ReadShardBalancer$1.next(ReadShardBalancer.java:116)
at org.broadinstitute.sting.gatk.datasources.reads.ReadShardBalancer$1.next(ReadShardBalancer.java:75)
at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:65)
at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:281)
at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:237)
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:147)
at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:91)
Caused by: java.lang.ClassNotFoundException: net.sf.samtools.util.CloserUtil
at java.net.URLClassLoader$1.run(URLClassLoader.java:199)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
... 29 more
Caused by: java.util.zip.ZipException: error reading zip file
at java.util.zip.ZipFile.read(Native Method)
at java.util.zip.ZipFile.access$1200(ZipFile.java:29)
at java.util.zip.ZipFile$ZipFileInputStream.read(ZipFile.java:447)
at java.util.zip.ZipFile$1.fill(ZipFile.java:230)
at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:141)
at java.util.jar.Manifest$FastInputStream.fill(Manifest.java:422)
at java.util.jar.Manifest$FastInputStream.readLine(Manifest.java:358)
at java.util.jar.Manifest$FastInputStream.readLine(Manifest.java:390)
at java.util.jar.Attributes.read(Attributes.java:359)
at java.util.jar.Manifest.read(Manifest.java:182)
at java.util.jar.Manifest.
Any Idea what is the solution??
Hello,
I try to run the realigner target creator with the data divided on chromosomes. It seems to run smooth but the runtime for chromosome 1 seems to never end. I have checked my bam files and as expected the one with chr 1 is a little bit bigger than chr 2 but nothing proportional to the differences in predicted runtime. The version i use is GATK 2.3.0 and this is how i run it:
java -Xmx12g -jar programs/GenomeAnalysisTK-2.3-0/GenomeAnalysisTK.jar -l INFO -T VariantRecalibrator -R Homo_sapiens.GRCh37.57_dna_concat.fa -recalFile allchr_varrecal_BOTH_comb_ref.intervals -rscriptFile 15_allchr_varrecal_BOTH_comb_ref.intervals.plots.R -tranchesFile allchr_varrecal_BOTH_comb_ref.intervals.tranches -resource:hapmap,VCF,known=false,training=true,truth=true,prior=15.0 hapmap_3.3.b37.sites.vcf -resource:omni,VCF,known=false,training=true,truth=false,prior=12.0 1000G_omni2.5.b37.sites.vcf -resource:dbsnp,VCF,known=true,training=false,truth=false,prior=8.0 dbsnp_135.b37.vcf -resource:mills,VCF,known=true,training=true,truth=true,prior=12.0 Mills_and_1000G_gold_standard.indels.hg19.sites.vcf -an QD -an HaplotypeScore -an MQRankSum -an ReadPosRankSum -an FS -an MQ --mode BOTH -nt 8 -input allchr_real_recal_resrt_raw_BOTH_comb_ref.vcf -L Agilent_SureSelect.V4.GRCh37.70_targets_nochr.bed.pad100.interval_list --pedigreeValidationType SILENT --pedigree my_fam.fam
predicted runtime looks like:
INFO 15:11:33,778 ProgressMeter - 1:33587200 3.36e+07 2.3 h 4.1 m 1.1% 8.8 d 8.7 d INFO 15:11:33,778 ProgressMeter - 11:73495886 1.82e+09 2.3 h 4.5 s 61.0% 3.7 h 87.4 m INFO 15:11:33,778 ProgressMeter - 10:6623707 1.68e+09 2.3 h 4.9 s 54.5% 4.2 h 114.3 m INFO 15:11:33,778 ProgressMeter - 11:572298 1.82e+09 2.3 h 4.5 s 58.7% 3.9 h 96.4 m INFO 15:11:33,779 ProgressMeter - 11:122078003 1.82e+09 2.3 h 4.5 s 62.6% 3.6 h 81.7 m INFO 15:11:53,780 ProgressMeter - 11:10589211 1.82e+09 2.3 h 4.5 s 59.0% 3.9 h 95.3 m INFO 15:11:53,780 ProgressMeter - 11:105031869 1.82e+09 2.3 h 4.5 s 62.1% 3.7 h 83.9 m INFO 15:11:53,781 ProgressMeter - 11:46994477 1.82e+09 2.3 h 4.5 s 60.2% 3.8 h 90.8 m INFO 15:12:03,779 ProgressMeter - 11:8677862 1.82e+09 2.3 h 4.5 s 58.9% 3.9 h 95.7 m INFO 15:12:03,779 ProgressMeter - 11:130690257 1.82e+09 2.3 h 4.5 s 62.9% 3.6 h 81.1 m INFO 15:12:13,779 ProgressMeter - 11:84816026 1.82e+09 2.3 h 4.5 s 61.4% 3.7 h 86.4 m INFO 15:12:13,779 ProgressMeter - 10:17039143 1.68e+09 2.3 h 4.9 s 54.8% 4.2 h 113.3 m INFO 15:12:23,780 ProgressMeter - 11:18556991 1.82e+09 2.3 h 4.5 s 59.3% 3.9 h 94.6 m INFO 15:12:23,780 ProgressMeter - 11:113365440 1.82e+09 2.3 h 4.5 s 62.3% 3.7 h 83.2 m INFO 15:12:23,782 ProgressMeter - 11:55284794 1.82e+09 2.3 h 4.5 s 60.4% 3.8 h 90.1 m INFO 15:12:33,780 ProgressMeter - 1:33783808 3.38e+07 2.3 h 4.1 m 1.1% 8.8 d 8.7 d INFO 15:12:33,780 ProgressMeter - 12:3311608 1.95e+09 2.3 h 4.2 s 63.1% 3.6 h 80.5 m INFO 15:12:43,779 ProgressMeter - 11:93292307 1.82e+09 2.3 h 4.6 s 61.7% 3.7 h 85.8 m INFO 15:12:43,780 ProgressMeter - 10:24841443 1.68e+09 2.3 h 4.9 s 55.1% 4.2 h 112.5 m INFO 15:12:43,780 ProgressMeter - 11:19408689 1.82e+09 2.3 h 4.6 s 59.3% 3.9 h 94.8 m INFO 15:12:53,965 ProgressMeter - 11:26512308 1.82e+09 2.3 h 4.6 s 59.5% 3.9 h 94.0 m INFO 15:12:53,965 ProgressMeter - 11:121738354 1.82e+09 2.3 h 4.6 s 62.6% 3.7 h 82.6 m INFO 15:12:53,965 ProgressMeter - 11:63468191 1.82e+09 2.3 h 4.6 s 60.7% 3.8 h 89.4 m INFO 15:13:03,781 ProgressMeter - 12:11960820 1.95e+09 2.3 h 4.3 s 63.4% 3.6 h 79.8 m INFO 15:13:13,780 ProgressMeter - 11:101551879 1.82e+09 2.3 h 4.6 s 61.9% 3.7 h 85.1 m INFO 15:13:13,780 ProgressMeter - 10:32238974 1.68e+09 2.3 h 4.9 s 55.3% 4.2 h 111.8 m INFO 15:13:13,780 ProgressMeter - 11:27204534 1.82e+09 2.3 h 4.6 s 59.5% 3.9 h 94.1 m INFO 15:13:23,966 ProgressMeter - 11:71529440 1.82e+09 2.3 h 4.6 s 61.0% 3.8 h 88.8 m INFO 15:13:23,968 ProgressMeter - 11:34170083 1.82e+09 2.3 h 4.6 s 59.8% 3.9 h 93.4 m INFO 15:13:23,977 ProgressMeter - 11:129889715 1.82e+09 2.3 h 4.6 s 62.9% 3.7 h 81.9 m INFO 15:13:33,781 ProgressMeter - 1:33980416 3.40e+07 2.3 h 4.1 m 1.1% 8.8 d 8.7 d
Any suggestions?
Regards,
Måns