Tagged with #error
0 documentation articles | 1 announcement | 45 forum discussions



Created 2014-06-11 21:20:08 | Updated | Tags: best-practices bug error rnaseq topstory
Comments (2)

We discovered today that we made an error in the documentation article that describes the RNAseq Best Practices workflow. The error is not critical but is likely to cause an increased rate of False Positive calls in your dataset.

The error was made in the description of the "Split & Trim" pre-processing step. We originally wrote that you need to reassign mapping qualities to 60 using the ReassignMappingQuality read filter. However, this causes all MAPQs in the file to be reassigned to 60, whereas what you want to do is reassign MAPQs only for good alignments which STAR identifies with MAPQ 255. This is done with a different read filter, called ReassignOneMappingQuality. The correct command is therefore:

java -jar GenomeAnalysisTK.jar -T SplitNCigarReads -R ref.fasta -I dedupped.bam -o split.bam -rf ReassignOneMappingQuality -RMQF 255 -RMQT 60 -U ALLOW_N_CIGAR_READS
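To make the difference between the two read filters concrete, here is a minimal sketch in Python. It is not GATK code: the function names and the sample MAPQ values are assumptions made for illustration. It only mimics the behaviour described above, where one filter overwrites every MAPQ and the other changes only reads with one specific MAPQ.

```python
def reassign_all(mapqs, new_mapq=60):
    """Mimic the originally documented filter: every MAPQ becomes new_mapq."""
    return [new_mapq for _ in mapqs]


def reassign_one(mapqs, from_mapq=255, to_mapq=60):
    """Mimic the corrected filter: only reads at from_mapq are changed."""
    return [to_mapq if q == from_mapq else q for q in mapqs]


# Hypothetical MAPQ values: 255 for the good alignments, low values for the rest.
star_mapqs = [255, 3, 255, 1, 0, 255]

print(reassign_all(star_mapqs))  # [60, 60, 60, 60, 60, 60]
print(reassign_one(star_mapqs))  # [60, 3, 60, 1, 0, 60]
```

In the first result the poorly mapped reads are indistinguishable from the good ones, which is how they end up contributing to calls.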

In our hands, the rate of false positive (FP) calls rises from 4% to 8% when the wrong filter is used. We don't see any significant number of false negatives (lost true positives) with the bad command, and we do see a few more true positives show up in its results. So the effect is to increase sensitivity excessively at the expense of specificity, because poorly mapped reads are given a "good" mapping quality and taken into account where they would normally be discarded.

This effect will be stronger in datasets with lower overall quality, so your results may vary. Let us know if you observe any really dramatic effects, but we don't expect that to happen.

To be clear, we do recommend re-processing your data if you can, but if that is not an option, keep in mind how this affects the rate of false positive discovery in your data.

We apologize for this error (which has now been corrected in the documentation) and for the inconvenience it may cause you.


Created 2015-07-31 22:11:45 | Updated | Tags: commandlinegatk haplotypecaller gatk error
Comments (4)

Hello,

I am receiving the following error. I am working with SAM files that were exported from CLC, then edited with Picard tools to add read groups. I am not sure whether I need to add an additional step to solve this problem, and I cannot find any documentation regarding this error.

Please let me know what I need to do to correct this issue.

Thank you!

gatk -T HaplotypeCaller -R spinach_assembly-repeatdetect_PACBIO_V1.3_formated_60.fa -I .sam.list -drf DuplicateRead --alleles Unfiltered_Spinach_PacBio_Reseq_12_Geno_Assay_SNP.fixed.noblanks.vcf --genotyping_mode GENOTYPE_GIVEN_ALLELES --output_mode EMIT_ALL_SITES -o output_raw_unfiltered_spinach_snps_gbs.vcf

INFO 14:48:44,450 HelpFormatter - ---------------------------------------------------------------------------------
INFO 14:48:44,453 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.4-46-gbc02625, Compiled 2015/07/09 17:38:12
INFO 14:48:44,454 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 14:48:44,454 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 14:48:44,458 HelpFormatter - Program Args: -T HaplotypeCaller -R spinach_assembly-repeatdetect_PACBIO_V1.3_formated_60.fa -I .sam.list -drf DuplicateRead --alleles Unfiltered_Spinach_PacBio_Reseq_12_Geno_Assay_SNP.fixed.noblanks.vcf --genotyping_mode GENOTYPE_GIVEN_ALLELES --output_mode EMIT_ALL_SITES -o output_raw_unfiltered_spinach_snps_gbs.vcf
INFO 14:48:44,468 HelpFormatter - Executing as ahulse@jalapeno.genomecenter.ucdavis.edu on Linux 2.6.18-348.12.1.el5 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_05-b13.
INFO 14:48:44,469 HelpFormatter - Date/Time: 2015/07/31 14:48:44
INFO 14:48:44,469 HelpFormatter - ---------------------------------------------------------------------------------
INFO 14:48:44,470 HelpFormatter - ---------------------------------------------------------------------------------
INFO 14:48:45,102 GenomeAnalysisEngine - Strictness is SILENT
INFO 14:48:45,385 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 500
INFO 14:48:45,394 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 14:48:48,432 SAMDataSource$SAMReaders - Init 50 BAMs in last 3.04 s, 50 of 80 in 3.04 s / 0.05 m (16.46 tasks/s). 30 remaining with est. completion in 1.82 s / 0.03 m
INFO 14:48:50,052 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 4.66
INFO 14:48:50,164 HCMappingQualityFilter - Filtering out reads with MAPQ < 20
INFO 14:48:54,742 RMDTrackBuilder - Writing Tribble index to disk for file /local/scratch/scratch/Amanda/Spinach_GBS/Unfiltered_Spinach_PacBio_Reseq_12_Geno_Assay_SNP.fixed.noblanks.vcf.idx
INFO 14:48:58,784 GenomeAnalysisEngine - Preparing for traversal over 80 BAM files
INFO 14:49:00,054 GATKRunReport - Uploaded run statistics report to AWS S3

ERROR ------------------------------------------------------------------------------------------
ERROR A BAM ERROR has occurred (version 3.4-46-gbc02625):
ERROR
ERROR This means that there is something wrong with the BAM file(s) you provided.
ERROR The error message below tells you what is the problem.
ERROR
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR Please do NOT post this error to the GATK forum until you have followed these instructions:
ERROR - Make sure that your BAM file is well-formed by running Picard's validator on it
ERROR (see http://picard.sourceforge.net/command-line-overview.shtml#ValidateSamFile for details)
ERROR - Ensure that your BAM index is not corrupted: delete the current one and regenerate it with 'samtools index'
ERROR
ERROR MESSAGE: Cannot retrieve file pointers within SAM text files.
ERROR ------------------------------------------------------------------------------------------
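The error message refers to "SAM text files", and the inputs here are described as SAM files, so one thing worth checking is whether the entries in the input list are plain-text SAM rather than BAM. The following is a minimal sketch under that assumption, not a GATK utility: the function names are hypothetical. It relies only on the fact that a BAM file is gzip-compatible (BGZF) data whose decompressed content starts with the bytes `BAM\1`.

```python
import gzip


def looks_like_bam(path):
    """Return True if the file decompresses to the BAM magic bytes."""
    try:
        with gzip.open(path, "rb") as handle:
            return handle.read(4) == b"BAM\x01"
    except (OSError, EOFError):
        # Not gzip/BGZF data, which is what a plain-text SAM file looks like.
        return False


def text_inputs(list_path):
    """Return the entries of an input list file that are not BAM files."""
    with open(list_path) as handle:
        paths = [line.strip() for line in handle if line.strip()]
    return [p for p in paths if not looks_like_bam(p)]
```

Calling `text_inputs(".sam.list")` would list the entries that are not BAM; those files would presumably need to be converted to sorted, indexed BAM before being passed to the tool.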

Created 2015-07-21 13:25:52 | Updated | Tags: error splitncigarreads code-exception
Comments (9)

I am following the documentation of "Best Practices for Variant Discovery in RNAseq data" and I am at the SplitNCigarReads step. All of the previous steps have worked out fine. (Note: I did the mapping of my reads with TopHat instead of STAR.)

When I run the following:

java -jar GenomeAnalysisTK.jar -T SplitNCigarReads -R reference.fa -I reordered_acc_reference.bam -o split.bam -fixMisencodedQuals -U ALLOW_N_CIGAR_READS

I get the following error message:

ERROR stack trace

java.lang.IllegalArgumentException
    at java.nio.ByteBuffer.allocate(ByteBuffer.java:330)
    at htsjdk.samtools.reference.IndexedFastaSequenceFile.getSubsequenceAt(IndexedFastaSequenceFile.java:195)
    at org.broadinstitute.gatk.utils.fasta.CachingIndexedFastaSequenceFile.getSubsequenceAt(CachingIndexedFastaSequenceFile.java:329)
    at org.broadinstitute.gatk.tools.walkers.rnaseq.OverhangFixingManager$Splice.initialize(OverhangFixingManager.java:365)
    at org.broadinstitute.gatk.tools.walkers.rnaseq.OverhangFixingManager.addSplicePosition(OverhangFixingManager.java:171)
    at org.broadinstitute.gatk.tools.walkers.rnaseq.SplitNCigarReads.splitReadBasedOnCigar(SplitNCigarReads.java:280)
    at org.broadinstitute.gatk.tools.walkers.rnaseq.SplitNCigarReads.splitNCigarRead(SplitNCigarReads.java:233)
    at org.broadinstitute.gatk.tools.walkers.rnaseq.SplitNCigarReads.reduce(SplitNCigarReads.java:210)
    at org.broadinstitute.gatk.tools.walkers.rnaseq.SplitNCigarReads.reduce(SplitNCigarReads.java:118)
    at org.broadinstitute.gatk.engine.traversals.TraverseReadsNano$TraverseReadsReduce.apply(TraverseReadsNano.java:251)
    at org.broadinstitute.gatk.engine.traversals.TraverseReadsNano$TraverseReadsReduce.apply(TraverseReadsNano.java:240)
    at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:279)
    at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245)
    at org.broadinstitute.gatk.engine.traversals.TraverseReadsNano.traverse(TraverseReadsNano.java:102)
    at org.broadinstitute.gatk.engine.traversals.TraverseReadsNano.traverse(TraverseReadsNano.java:56)
    at org.broadinstitute.gatk.engine.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:108)
    at org.broadinstitute.gatk.engine.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:315)
    at org.broadinstitute.gatk.engine.CommandLineExecutable.execute(CommandLineExecutable.java:121)
    at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:248)
    at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:155)
    at org.broadinstitute.gatk.engine.CommandLineGATK.main(CommandLineGATK.java:106)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 3.4-46-gbc02625):
ERROR
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR

ERROR MESSAGE: Code exception (see stack trace for error itself)

Can anyone help me understand what is wrong?
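The stack trace shows the exception being raised while reference bases were fetched around a splice position, so it may help to look at where the N operators in the affected reads place their junctions relative to the contig ends. The sketch below is a simplified illustration of splitting an alignment at N operators, written for this thread. It is not GATK's SplitNCigarReads implementation (which also handles overhang clipping, among other things), and the function name is hypothetical.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
CONSUMES_REF = set("MDN=X")


def split_at_n(pos, cigar):
    """Split an alignment at N operators.

    pos is the 1-based leftmost reference position of the read.
    Returns (start, end, sub_cigar) tuples with 1-based inclusive
    reference coordinates, one per exon-like segment.
    """
    segments = []
    seg_start = pos
    ref_pos = pos
    ops = []
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":
            # Close the current segment and jump over the skipped region.
            if ops:
                segments.append((seg_start, ref_pos - 1, "".join(ops)))
            ref_pos += length
            seg_start = ref_pos
            ops = []
            continue
        ops.append(f"{length}{op}")
        if op in CONSUMES_REF:
            ref_pos += length
    if ops:
        segments.append((seg_start, ref_pos - 1, "".join(ops)))
    return segments


print(split_at_n(100, "50M1000N50M"))
```

Comparing the segment ends against the contig lengths in the reference index is one way to spot junctions that fall outside the reference; whether that is the cause of this particular exception is not established here.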


Created 2015-06-23 16:19:39 | Updated | Tags: vcf developer bam bug error
Comments (4)

I have a question that I hope the developers can help with. I am using GATK in two modes:

1. I run UnifiedGenotyper once on a prepared BAM file to call variants, which gives me a single VCF file (call it A.vcf).
2. I run UnifiedGenotyper chromosome by chromosome, using the "-L" argument, which gives me a set of small VCF files.

But the chrY result confuses me a lot. The chrY part of A.vcf is quite different from the small VCF file generated with "-L chrY"; the difference seems to be larger than 50%. I have also checked the other chromosomes, and there the difference is slight. Only chrY has this problem.
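One way to put a number on "larger than 50%" is to compare the sets of called sites for a single contig in the two outputs. The sketch below is a minimal illustration, not a GATK tool: the function names are hypothetical. It treats a site as the combination of position, REF and ALT, and ignores genotypes and annotations.

```python
def site_keys(vcf_lines, contig):
    """Collect (pos, ref, alt) for every record on the given contig."""
    keys = set()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        if chrom == contig:
            keys.add((int(pos), ref, alt))
    return keys


def compare_sites(whole_genome_lines, per_contig_lines, contig):
    """Return counts of shared and run-specific sites for one contig."""
    a = site_keys(whole_genome_lines, contig)
    b = site_keys(per_contig_lines, contig)
    union = a | b
    return {
        "shared": len(a & b),
        "only_whole_genome": len(a - b),
        "only_per_contig": len(b - a),
        "fraction_discordant": len(a ^ b) / len(union) if union else 0.0,
    }
```

Both arguments can be open file handles, for example the whole-genome A.vcf and the small chrY VCF, with `contig="chrY"`. Running the same comparison on another chromosome gives a baseline for what a "slight" difference looks like.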

Our script is pasted here:

# -L is added only in step 2; A.vcf was generated without it.
java -Xmx30g -jar /data/SG/Env/software_installed/GenomeAnalysisTK.jar \
    -L xx \
    -R ucsc.hg19.fasta \
    -T RealignerTargetCreator \
    -I ${sampleName}.bam \
    -o ${sampleName}.intervals \
    -known Mills_and_1000G_gold_standard.indels.hg19.sites.vcf \
    -known 1000G_phase1.indels.hg19.sites.vcf

# -L is added only in step 2; A.vcf was generated without it.
java -Xmx30g -jar /data/SG/Env/software_installed/GenomeAnalysisTK.jar \
    -L xx \
    -R ucsc.hg19.fasta \
    -T IndelRealigner \
    -targetIntervals ${sampleName}.intervals \
    -I ${sampleName}.bam \
    -o ${sampleName}.realigned.bam \
    -known Mills_and_1000G_gold_standard.indels.hg19.sites.vcf \
    -known 1000G_phase1.indels.hg19.sites.vcf

samtools index ${sampleName}.realigned.bam

# -L is added only in step 2; A.vcf was generated without it.
java -Xmx30g -jar /data/SG/Env/software_installed/GenomeAnalysisTK.jar \
    -L xx \
    -R ucsc.hg19.fasta \
    -T BaseRecalibrator \
    -nct 8 \
    -I ${sampleName}.realigned.bam \
    -knownSites dbsnp_138.hg19.vcf \
    -knownSites Mills_and_1000G_gold_standard.indels.hg19.sites.vcf \
    -knownSites 1000G_phase1.indels.hg19.sites.vcf \
    -o ${sampleName}.recal_data.grp

# -L is added only in step 2; A.vcf was generated without it.
java -Xmx30g -jar /data/SG/Env/software_installed/GenomeAnalysisTK.jar \
    -L xx \
    -R ucsc.hg19.fasta \
    -T PrintReads \
    -nct 8 \
    -I ${sampleName}.realigned.bam \
    -BQSR ${sampleName}.recal_data.grp \
    -o ${sampleName}.realigned.recal.bam

samtools index ${sampleName}.realigned.recal.bam

# -L is added only in step 2; A.vcf was generated without it.
# The -o output is A.vcf (no -L) or one of the small VCFs (with -L).
java -Xmx30g -jar /data/SG/Env/software_installed/GenomeAnalysisTK.jar \
    -L xx \
    -R ucsc.hg19.fasta \
    -T UnifiedGenotyper \
    -nct 8 \
    -glm BOTH \
    -I ${sampleName}.realigned.recal.bam \
    -D dbsnp_138.hg19.vcf \
    -o ${sampleName}.vcf \
    -stand_call_conf 50.0 \
    -stand_emit_conf 10.0 \
    -dcov 200 \
    -A AlleleBalance -A QualByDepth -A HaplotypeScore -A MappingQualityRankSumTest -A ReadPosRankSumTest -A FisherStrand -A RMSMappingQuality -A InbreedingCoeff -A Coverage

I hope you can offer me some help.

Thanks,


Created 2015-05-13 12:41:10 | Updated | Tags: indelrealigner bam error
Comments (2)

Hello,

I ran into a problem after pre-processing: it seems that extra contigs were added to my BAM file compared to the reference I used, which makes the indel realigner step impossible to run. I have checked the header of my file and the reference is the same, but my BAM file has hundreds of additional contigs. I am not sure what happened.

The steps to get the BAM were:
- Align with bwa mem
- Convert to BAM and sort (samtools)
- Dedup (Picard)
- Add read group (Picard)
- Index BAM (samtools)
- Run RealignerTargetCreator

When I check the header of my BAM file it still shows the right contigs, but when running, GATK complains of different (additional) contigs compared to my reference. I am currently re-testing the whole pipeline on a single sample, but do you have any pointers to what could cause this? Maybe a problem with the BAM formatting?

I am running GATK 3.3.0-g37228af with Java 1.7. I have attached the output log from the command.

Thanks,

Julien

PS: I attended your workshop in Cambridge!
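For a mismatch like this, a direct comparison of the contig lists can show exactly which names or lengths differ. The sketch below is an illustration only, with hypothetical function names. It assumes the BAM header has been dumped as text (for example with `samtools view -H`) and that the reference has a FASTA index (.fai), whose first two columns are contig name and length.

```python
def contigs_from_sam_header(header_lines):
    """Return {name: length} from the @SQ lines of a SAM/BAM header."""
    contigs = {}
    for line in header_lines:
        if not line.startswith("@SQ"):
            continue
        fields = dict(f.split(":", 1) for f in line.rstrip("\n").split("\t")[1:])
        contigs[fields["SN"]] = int(fields["LN"])
    return contigs


def contigs_from_fai(fai_lines):
    """Return {name: length} from the lines of a FASTA index (.fai)."""
    contigs = {}
    for line in fai_lines:
        if line.strip():
            name, length = line.split("\t")[:2]
            contigs[name] = int(length)
    return contigs


def compare_contigs(bam_contigs, ref_contigs):
    """Report contigs unique to either side and contigs whose lengths differ."""
    shared = set(bam_contigs) & set(ref_contigs)
    return {
        "only_in_bam": sorted(set(bam_contigs) - set(ref_contigs)),
        "only_in_reference": sorted(set(ref_contigs) - set(bam_contigs)),
        "length_mismatch": sorted(n for n in shared if bam_contigs[n] != ref_contigs[n]),
    }
```

If `only_in_bam` is non-empty, the reads were probably aligned against a different FASTA than the one passed with -R; that is a guess to be checked, not a diagnosis.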


Created 2015-04-23 00:51:15 | Updated | Tags: error code-exception local-realignment
Comments (3)

I got this error during local realignment

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.InternalError
    at sun.misc.URLClassPath$JarLoader.getResource(URLClassPath.java:838)
    at sun.misc.URLClassPath.getResource(URLClassPath.java:199)
    at sun.misc.URLClassPath.getResource(URLClassPath.java:251)
    at java.lang.ClassLoader.getBootstrapResource(ClassLoader.java:1305)
    at java.lang.ClassLoader.getResource(ClassLoader.java:1144)
    at java.lang.ClassLoader.getResource(ClassLoader.java:1142)
    at java.lang.ClassLoader.getSystemResource(ClassLoader.java:1267)
    at java.lang.ClassLoader.getSystemResourceAsStream(ClassLoader.java:1370)
    at java.lang.Class.getResourceAsStream(Class.java:2101)
    at javax.xml.stream.SecuritySupport$4.run(SecuritySupport.java:92)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.xml.stream.SecuritySupport.getResourceAsStream(SecuritySupport.java:87)
    at javax.xml.stream.FactoryFinder.findJarServiceProvider(FactoryFinder.java:335)
    at javax.xml.stream.FactoryFinder.find(FactoryFinder.java:302)
    at javax.xml.stream.FactoryFinder.find(FactoryFinder.java:215)
    at javax.xml.stream.XMLInputFactory.newInstance(XMLInputFactory.java:154)
    at org.simpleframework.xml.stream.NodeBuilder.(NodeBuilder.java:48)
    at org.simpleframework.xml.core.Persister.write(Persister.java:1000)
    at org.simpleframework.xml.core.Persister.write(Persister.java:982)
    at org.simpleframework.xml.core.Persister.write(Persister.java:963)
    at org.broadinstitute.sting.gatk.phonehome.GATKRunReport.postReportToStream(GATKRunReport.java:377)
    at org.broadinstitute.sting.gatk.phonehome.GATKRunReport.postReportToAWSS3(GATKRunReport.java:569)
    at org.broadinstitute.sting.gatk.phonehome.GATKRunReport.postReport(GATKRunReport.java:352)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.generateGATKRunReport(CommandLineExecutable.java:16
    at org.broadinstitute.sting.gatk.CommandLineExecutable.generateGATKRunReport(CommandLineExecutable.java:17
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:122)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:248)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:155)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:107)
Caused by: java.io.FileNotFoundException: /usr/lib/jvm/java-1.7.0-openjdk-1.7.0.75.x86_64/jre/lib/resources.jar
    at sun.misc.URLClassPath$JarLoader.getJarFile(URLClassPath.java:726)
    at sun.misc.URLClassPath$JarLoader.access$600(URLClassPath.java:591)
    at sun.misc.URLClassPath$JarLoader$1.run(URLClassPath.java:673)
    at sun.misc.URLClassPath$JarLoader$1.run(URLClassPath.java:666)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.misc.URLClassPath$JarLoader.ensureOpen(URLClassPath.java:665)
    at sun.misc.URLClassPath$JarLoader.getResource(URLClassPath.java:836)
    ... 28 more

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 3.1-1-g07a4bf8):
ERROR
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: Code exception (see stack trace for error itself)
ERROR ------------------------------------------------------------------------------------------

Would you tell me how to correct this error :(


Created 2015-03-29 18:00:14 | Updated | Tags: best-practices snp vcf gatk error
Comments (12)

Hi,

I have discovered some unusual calls in my VCF file after using HaplotypeCaller. I am using version 3.3 of GATK. I applied VQSR as well as the genotype refinement workflow (CalculateGenotypePosteriors, etc.) to refine the calls, but the unusual calls were not removed. I also calculated the number of Mendelian errors among just the biallelic SNPs in my final VCF file (using PLINK) and found an unusually high percentage for each of the 3 families I am studying: 0.153%, 0.167%, and 0.25%. The percentage of triallelic SNPs is also very high: 0.111%. Why are the error rates so high?
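For readers unfamiliar with the metric, a Mendelian error is a child genotype that cannot be formed by taking one allele from each parent. The sketch below shows that check for diploid, fully called genotypes. It is an illustration with hypothetical function names, not PLINK's implementation, and it ignores sex chromosomes and missing calls.

```python
from itertools import product


def is_mendelian_consistent(child, father, mother):
    """Check a diploid child genotype against both parents.

    Genotypes are tuples of allele indices, e.g. (0, 1) for 0/1.
    """
    child = tuple(sorted(child))
    return any(tuple(sorted(pair)) == child for pair in product(father, mother))


def mendelian_error_rate(trio_genotypes):
    """Fraction of (child, father, mother) triples that are inconsistent."""
    trios = list(trio_genotypes)
    if not trios:
        return 0.0
    errors = sum(not is_mendelian_consistent(*t) for t in trios)
    return errors / len(trios)


print(is_mendelian_consistent((0, 1), (0, 0), (1, 1)))  # True
print(is_mendelian_consistent((1, 1), (0, 0), (0, 1)))  # False
```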

I used the following commands to call the variants and generate the initial VCF file:

HaplotypeCaller (generate gvcf files for each individual for each chromosome)

java -Xmx1g -jar GenomeAnalysisTK.jar -T HaplotypeCaller -R hs37d5.fa -I recal_${ROOT}.bam -o ${outpath}raw_${ROOT}.vcf --emitRefConfidence GVCF --variant_index_type LINEAR --variant_index_parameter 128000

GenotypeGVCFs (generate vcf files for each chr)

java -Xmx1g -jar GenomeAnalysisTK.jar -R hs37d5.fa -T GenotypeGVCFs -V vcfs.chr${numchr}.new.list -o final_chr${numchr}.vcf -L ${numchr}

CatVariants (generate 1 vcf file with all inds and all chrs)

java -Xmx1g -cp GenomeAnalysisTK.jar org.broadinstitute.gatk.tools.CatVariants -R hs37d5.fa -V final.new.list -out final_allHutt.vcf -assumeSorted

VQSR

java -Xmx4g -jar GenomeAnalysisTK.jar -T VariantRecalibrator -R hs37d5.fa -input final_allHutt.vcf -resource:hapmap,known=false,training=true,truth=true,prior=15.0 hapmap_3.3.b37.vcf -resource:omni,known=false,training=true,truth=true,prior=12.0 1000G_omni2.5.b37.vcf -resource:1000G,known=false,training=true,truth=false,prior=10.0 1000G_phase1.snps.high_confidence.b37.vcf -resource:dbsnp,known=true,training=false,truth=false,prior=2.0 dbsnp_138.b37.vcf -an QD -an MQ -an MQRankSum -an ReadPosRankSum -an FS -an SOR -an DP -mode SNP -tranche 100.0 -tranche 99.9 -tranche 99.0 -tranche 90.0 --disable_auto_index_creation_and_locking_when_reading_rods -recalFile recalibrate_SNP_allHutt_2.recal -tranchesFile recalibrate_SNP_allHutt_2.tranches

Used excludeFiltered here

java -Xmx3g -jar GenomeAnalysisTK.jar -T ApplyRecalibration -R hs37d5.fa -input final_allHutt.vcf -mode SNP --ts_filter_level 99.9 --excludeFiltered --disable_auto_index_creation_and_locking_when_reading_rods -recalFile recalibrate_SNP_allHutt_2.recal -tranchesFile recalibrate_SNP_allHutt_2.tranches -o recalibrated_snps_raw_indels_allHutt_filteredout.vcf

java -Xmx3g -jar GenomeAnalysisTK.jar -T VariantRecalibrator -R hs37d5.fa -input recalibrated_snps_raw_indels_allHutt_filteredout.vcf -resource:mills,known=false,training=true,truth=true,prior=12.0 Mills_and_1000G_gold_standard.indels.b37.vcf -resource:dbsnp,known=true,training=false,truth=false,prior=2.0 dbsnp_138.b37.vcf -an QD -an DP -an FS -an SOR -an ReadPosRankSum -an MQRankSum -mode INDEL -tranche 100.0 -tranche 99.9 -tranche 99.0 -tranche 90.0 --maxGaussians 4 --disable_auto_index_creation_and_locking_when_reading_rods -recalFile recalibrate_INDEL_allHutt_filteredout.recal -tranchesFile recalibrate_INDEL_allHutt_filteredout.tranches

Used excludeFiltered here

java -Xmx3g -jar GenomeAnalysisTK.jar -T ApplyRecalibration -R hs37d5.fa -input recalibrated_snps_raw_indels_allHutt_filteredout.vcf -mode INDEL --ts_filter_level 99.0 --excludeFiltered --disable_auto_index_creation_and_locking_when_reading_rods -recalFile recalibrate_INDEL_allHutt_filteredout.recal -tranchesFile recalibrate_INDEL_allHutt_filteredout.tranches -o recalibrated_variants_allHutt_filteredout.vcf

Genotype Refinement Workflow

java -Xmx3g -jar GenomeAnalysisTK.jar -T CalculateGenotypePosteriors -R hs37d5.fa --supporting ALL.wgs.phase3_shapeit2_mvncall_integrated_v5.20130502.sites.vcf -ped Hutt.ped -V recalibrated_variants_allHutt_filteredout.vcf -o recalibrated_variants_allHutt.postCGP.f.vcf

java -Xmx3g -jar GenomeAnalysisTK.jar -T VariantFiltration -R hs37d5.fa -V recalibrated_variants_allHutt.postCGP.f.vcf -G_filter "GQ < 20.0" -G_filterName lowGQ -o recalibrated_variants_allHutt.postCGP.Gfiltered.f.vcf

Again, the first genotype in this example (indel) passed VariantFiltration even though its coverage was zero (2/2:0,0,0:0:PASS)

The entire example is below:

1 20199272 . T TCTTC,C 3520.08 PASS AC=8,22;AF=0.160,0.440;AN=50;BaseQRankSum=-1.356e+00;ClippingRankSum=-1.267e+00;DP=487;FS=4.843;GQ_MEAN=27.84;GQ_STDDEV=40.31;InbreedingCoeff=0.1002;MLEAC=1,12;MLEAF=0.020,0.240;MQ=51.74;MQ0=0;MQRankSum=0.421;NCC=2;PG=0,0,0,0,0,0;QD=32.53;ReadPosRankSum=1.27;SOR=0.699;VQSLOD=0.687;culprit=FS GT:AD:DP:FT:GQ:PGT:PID:PL:PP 2/2:0,0,0:0:PASS:22:.:.:410,207,355,32,22,0:410,207,355,32,22,0 2/2:0,0,1:1:lowGQ:5:.:.:240,51,36,18,5,0:240,51,36,18,5,0 0/2:4,0,4:8:PASS:90:.:.:140,153,256,0,103,90:140,153,256,0,103,90 0/0:22,0,0:22:lowGQ:0:.:.:0,0,390,0,390,390:0,0,390,0,390,390 0/0:2,0,0:2:lowGQ:3:.:.:0,3,45,3,45,45:0,3,45,3,45,45 2/2:0,0,3:3:lowGQ:11:.:.:287,135,124,21,11,0:287,135,124,21,11,0 ./.:7,0,0:7:PASS 2/2:0,0,3:4:lowGQ:11:.:.:282,126,115,22,11,0:282,126,115,22,11,0 0/2:10,0,0:10:lowGQ:5:.:.:27,5,494,0,411,405:27,5,494,0,411,405 0/2:7,0,0:7:lowGQ:13:.:.:13,15,502,0,303,288:13,15,502,0,303,288 0/1:8,6,0:14:PASS:99:.:.:194,0,255,218,273,491:194,0,255,218,273,491 0/0:18,0,0:18:PASS:52:.:.:0,52,755,52,755,755:0,52,755,52,755,755 2/2:0,0,0:0:PASS:23:.:.:305,168,416,23,30,0:305,168,416,23,30,0 0/2:0,0,4:4:lowGQ:14:.:.:40,14,634,0,185,699:40,14,634,0,185,699 0/0:19,0,0:19:PASS:58:.:.:0,58,824,58,824,824:0,58,824,58,824,824 0/0:1,0,0:1:lowGQ:6:0|1:20199257_CT_C:0,6,91,6,91,91:0,6,91,6,91,91 1/1:0,0,0:0:lowGQ:2:.:.:177,11,0,12,2,44:177,11,0,12,2,44 0/1:0,0,3:3:PASS:34:.:.:94,0,388,34,38,304:94,0,388,34,38,304 0/2:15,0,2:17:lowGQ:18:0|1:20199249_CT_C:18,64,695,0,632,624:18,64,695,0,632,624 1/1:0,0,0:0:lowGQ:8:.:.:133,8,0,101,17,265:133,8,0,101,17,265 0/2:3,0,0:3:PASS:25:.:.:129,25,484,0,121,94:129,25,484,0,121,94 0/2:2,0,0:2:PASS:38:.:.:185,38,644,0,88,42:185,38,644,0,88,42 0/2:2,0,0:2:lowGQ:14:.:.:256,14,293,0,57,41 ./.:11,0,0:11:PASS 1/2:0,0,1:1:lowGQ:14:.:.:115,24,14,36,0,359:115,24,14,36,0,359 1/2:0,0,1:1:PASS:28:.:.:188,39,28,35,0,206:188,39,28,35,0,206 2/2:0,0,3:3:lowGQ:8:1|1:20199249_CT_C:88,88,89,8,9,0:88,88,89,8,9,0
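The genotype filter in the VariantFiltration command tests only GQ ("GQ < 20.0"), so a genotype with zero depth but GQ of 20 or more is left as PASS. The sketch below is an illustration with hypothetical function names. It parses sample fields using the FORMAT string from the record above and flags called genotypes that are PASS with DP of 0; the three sample strings are copied from that record.

```python
def parse_sample(format_keys, sample):
    """Map FORMAT keys to values; trailing fields may be omitted in VCF."""
    return dict(zip(format_keys.split(":"), sample.split(":")))


def unsupported_passing(format_keys, samples):
    """Return indices of called genotypes marked PASS whose DP is 0."""
    flagged = []
    for i, sample in enumerate(samples):
        fields = parse_sample(format_keys, sample)
        called = "." not in fields.get("GT", ".")
        if called and fields.get("FT") == "PASS" and fields.get("DP") == "0":
            flagged.append(i)
    return flagged


fmt = "GT:AD:DP:FT:GQ:PGT:PID:PL:PP"
# First three genotypes copied from the record in this post.
samples = [
    "2/2:0,0,0:0:PASS:22:.:.:410,207,355,32,22,0:410,207,355,32,22,0",
    "2/2:0,0,1:1:lowGQ:5:.:.:240,51,36,18,5,0:240,51,36,18,5,0",
    "0/2:4,0,4:8:PASS:90:.:.:140,153,256,0,103,90:140,153,256,0,103,90",
]
print(unsupported_passing(fmt, samples))  # [0]
```

The first genotype is flagged because its GQ is 22, above the threshold, even though its DP is 0.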

Why are some genotypes being passed when there is no support for their genotype? Why are the Mendelian error rates so high?

Thank you very much in advance, Alva


Created 2015-03-17 08:53:24 | Updated | Tags: bqsr error
Comments (3)

I used the following script:

java -Xmx5g -jar $gatk -T BaseRecalibrator -R $ref_dir/ucsc.hg19.fasta -I $bam_dir/T-SZ-03-1_NHDE0249-11_C3AP0ACXX_L3.dedupped.realigned.bam -knownSites $ref_dir/dbsnp_138.hg19.vcf -knownSites $ref_dir/Mills_and_1000G_gold_standard.indels.hg19.sites.vcf -knownSites $ref_dir/1000G_phase1.indels.hg19.sites.vcf -o $other_dir/T-SZ-03-1_NHDE0249-11_C3AP0ACXX_L3.recal.grp

java -Xmx5g -jar $gatk -T BaseRecalibrator -R $ref_dir/ucsc.hg19.fasta -I $bam_dir/T-SZ-03-1_NHDE0249-11_C3AP0ACXX_L3.dedupped.realigned.bam -BQSR $other_dir/T-SZ-03-1_NHDE0249-11_C3AP0ACXX_L3.recal.grp -o $other_dir/T-SZ-03-1_NHDE0249-11_C3AP0ACXX_L3.post.recal.grp -knownSites $ref_dir/dbsnp_138.hg19.vcf -knownSites $ref_dir/Mills_and_1000G_gold_standard.indels.hg19.sites.vcf -knownSites $ref_dir/1000G_phase1.indels.hg19.sites.vcf

java -Xmx5g -jar $gatk -T AnalyzeCovariates -R $ref_dir/ucsc.hg19.fasta -before $other_dir/T-SZ-03-1_NHDE0249-11_C3AP0ACXX_L3.recal.grp -after $other_dir/T-SZ-03-1_NHDE0249-11_C3AP0ACXX_L3.post.recal.grp -plots $other_dir/T-SZ-03-1_NHDE0249-11_C3AP0ACXX_L3.recalibration_plots.pdf

java -Xmx5g -jar $gatk -T PrintReads -R $ref_dir/ucsc.hg19.fasta -I $bam_dir/T-SZ-03-1_NHDE0249-11_C3AP0ACXX_L3.dedupped.realigned.bam -BQSR $other_dir/T-SZ-03-1_NHDE0249-11_C3AP0ACXX_L3.recal.grp -o $bam_dir/T-SZ-03-1_NHDE0249-11_C3AP0ACXX_L3.dedupped.realigned.recal.bam

Then I got the following error message for the first step:

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.RuntimeException: java.io.IOException: Input/output error
    at htsjdk.tribble.readers.AsciiLineReaderIterator$TupleIterator.advance(AsciiLineReaderIterator.java:87)
    at htsjdk.tribble.readers.AsciiLineReaderIterator$TupleIterator.advance(AsciiLineReaderIterator.java:74)
    at htsjdk.samtools.util.AbstractIterator.next(AbstractIterator.java:57)
    at htsjdk.tribble.readers.AsciiLineReaderIterator.next(AsciiLineReaderIterator.java:47)
    at htsjdk.tribble.readers.AsciiLineReaderIterator.next(AsciiLineReaderIterator.java:25)
    at htsjdk.tribble.AsciiFeatureCodec.decode(AsciiFeatureCodec.java:79)
    at htsjdk.tribble.AsciiFeatureCodec.decode(AsciiFeatureCodec.java:41)
    at htsjdk.tribble.AbstractFeatureCodec.decodeLoc(AbstractFeatureCodec.java:40)
    at htsjdk.tribble.index.IndexFactory$FeatureIterator.readNextFeature(IndexFactory.java:478)
    at htsjdk.tribble.index.IndexFactory$FeatureIterator.next(IndexFactory.java:440)
    at htsjdk.tribble.index.IndexFactory.createIndex(IndexFactory.java:338)
    at htsjdk.tribble.index.IndexFactory.createDynamicIndex(IndexFactory.java:312)
    at org.broadinstitute.gatk.engine.refdata.tracks.RMDTrackBuilder.createIndexInMemory(RMDTrackBuilder.java:402)
    at org.broadinstitute.gatk.engine.refdata.tracks.RMDTrackBuilder.loadIndex(RMDTrackBuilder.java:288)
    at org.broadinstitute.gatk.engine.refdata.tracks.RMDTrackBuilder.getFeatureSource(RMDTrackBuilder.java:225)
    at org.broadinstitute.gatk.engine.refdata.tracks.RMDTrackBuilder.createInstanceOfTrack(RMDTrackBuilder.java:148)
    at org.broadinstitute.gatk.engine.datasources.rmd.ReferenceOrderedQueryDataPool.(ReferenceOrderedDataSource.java:208)
    at org.broadinstitute.gatk.engine.datasources.rmd.ReferenceOrderedDataSource.(ReferenceOrderedDataSource.java:88)
    at org.broadinstitute.gatk.engine.GenomeAnalysisEngine.getReferenceOrderedDataSources(GenomeAnalysisEngine.java:997)
    at org.broadinstitute.gatk.engine.GenomeAnalysisEngine.initializeDataSources(GenomeAnalysisEngine.java:779)
    at org.broadinstitute.gatk.engine.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:290)
    at org.broadinstitute.gatk.engine.CommandLineExecutable.execute(CommandLineExecutable.java:121)
    at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:248)
    at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:155)
    at org.broadinstitute.gatk.engine.CommandLineGATK.main(CommandLineGATK.java:107)
Caused by: java.io.IOException: Input/output error
    at java.io.FileInputStream.readBytes(Native Method)
    at java.io.FileInputStream.read(FileInputStream.java:224)
    at htsjdk.tribble.readers.PositionalBufferedStream.fill(PositionalBufferedStream.java:127)
    at htsjdk.tribble.readers.PositionalBufferedStream.peek(PositionalBufferedStream.java:118)
    at htsjdk.tribble.readers.PositionalBufferedStream.read(PositionalBufferedStream.java:57)
    at htsjdk.tribble.readers.AsciiLineReader.readLine(AsciiLineReader.java:80)
    at htsjdk.tribble.readers.AsciiLineReader.readLine(AsciiLineReader.java:122)
    at htsjdk.tribble.readers.AsciiLineReaderIterator$TupleIterator.advance(AsciiLineReaderIterator.java:85)
    ... 24 more

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 3.3-0-g37228af):
ERROR
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: java.io.IOException: Input/output error
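The trace shows the I/O error surfacing from a low-level file read while an index was being built for a text feature file, which points at the file system or one of the known-sites files rather than at the command itself. A minimal sketch for checking whether a given file can be read from start to finish is shown below; the function name is hypothetical and this is not a GATK utility.

```python
def first_unreadable_offset(path, chunk_size=1 << 20):
    """Read a file sequentially.

    Returns the byte offset at which a read fails with an OS-level error,
    or None if the whole file can be read.
    """
    offset = 0
    with open(path, "rb") as handle:
        while True:
            try:
                chunk = handle.read(chunk_size)
            except OSError:
                return offset
            if not chunk:
                return None
            offset += len(chunk)
```

Running this on each file passed with -knownSites would show which one, if any, cannot be read completely. A non-None result suggests a storage problem or a damaged copy of that file.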

Created 2015-03-12 16:15:59 | Updated | Tags: unifiedgenotyper snp error indel
Comments (3)

Hello,

When I run the UnifiedGenotyper to call INDELs, I get the following error (detailed command line output below): HAPLOTYPE_MAX_LENGTH must be > 0 but got 0

When I call only SNPs instead, it does not occur. I have searched for an answer to why this is happening, but cannot figure out the reason. Could this be a bug? I get the error whether or not I run the Realigner beforehand.

Have you observed this problem before?

Command line output:

java -Xmx10g -jar ~/work/tools/GenomeAnalysisTK.jar -T UnifiedGenotyper -R /media/rebecca/UUI/Work/BputSem/BputSem_gapfilled.final.fa -I realigned_A.bam -gt_mode DISCOVERY -glm INDEL -stand_call_conf 30 -stand_emit_conf 10 -o rawINDELS_q30_A.vcf -ploidy 10

INFO 15:50:47,987 HelpFormatter - --------------------------------------------------------------------------------
INFO 15:50:47,990 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.3-0-g37228af, Compiled 2014/10/24 01:07:22
INFO 15:50:47,990 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 15:50:47,990 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 15:50:47,996 HelpFormatter - Program Args: -T UnifiedGenotyper -R /media/rebecca/UUI/Work/BputSem/BputSem_gapfilled.final.fa -I realigned_A.bam -gt_mode DISCOVERY -glm INDEL -stand_call_conf 30 -stand_emit_conf 10 -o rawINDELS_q30_A.vcf -ploidy 10
INFO 15:50:48,002 HelpFormatter - Executing as rebecca@rebecca-ThinkPad-T440s on Linux 3.13.0-44-generic amd64; OpenJDK 64-Bit Server VM 1.7.0_65-b32.
INFO 15:50:48,002 HelpFormatter - Date/Time: 2015/03/12 15:50:47
INFO 15:50:48,002 HelpFormatter - --------------------------------------------------------------------------------
INFO 15:50:48,002 HelpFormatter - --------------------------------------------------------------------------------
INFO 15:50:48,132 GenomeAnalysisEngine - Strictness is SILENT
INFO 15:50:48,402 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 250
INFO 15:50:48,409 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 15:50:48,447 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.04
INFO 15:50:48,594 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files
INFO 15:50:48,622 GenomeAnalysisEngine - Done preparing for traversal
INFO 15:50:48,622 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO 15:50:48,622 ProgressMeter - | processed | time | per 1M | | total | remaining
INFO 15:50:48,623 ProgressMeter - Location | sites | elapsed | sites | completed | runtime | runtime
INFO 15:50:48,658 StrandBiasTest - SAM/BAM data was found. Attempting to use read data to calculate strand bias annotations values.
INFO 15:50:48,658 StrandBiasTest - SAM/BAM data was found. Attempting to use read data to calculate strand bias annotations values.
INFO 15:51:06,585 GATKRunReport - Uploaded run statistics report to AWS S3

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.IllegalArgumentException: HAPLOTYPE_MAX_LENGTH must be > 0 but got 0
    at org.broadinstitute.gatk.utils.pairhmm.PairHMM.initialize(PairHMM.java:97)
    at org.broadinstitute.gatk.utils.pairhmm.N2MemoryPairHMM.initialize(N2MemoryPairHMM.java:60)
    at org.broadinstitute.gatk.utils.pairhmm.LoglessPairHMM.initialize(LoglessPairHMM.java:66)
    at org.broadinstitute.gatk.utils.pairhmm.PairHMM.computeLikelihoods(PairHMM.java:194)
    at org.broadinstitute.gatk.tools.walkers.indels.PairHMMIndelErrorModel.computeGeneralReadHaplotypeLikelihoods(PairHMMIndelErrorModel.java:461)
    at org.broadinstitute.gatk.tools.walkers.genotyper.GeneralPloidyIndelGenotypeLikelihoods.add(GeneralPloidyIndelGenotypeLikelihoods.java:201)
    at org.broadinstitute.gatk.tools.walkers.genotyper.GeneralPloidyIndelGenotypeLikelihoods.add(GeneralPloidyIndelGenotypeLikelihoods.java:124)
    at org.broadinstitute.gatk.tools.walkers.genotyper.GeneralPloidyGenotypeLikelihoodsCalculationModel.getLikelihoods(GeneralPloidyGenotypeLikelihoodsCalculationModel.java:270)
    at org.broadinstitute.gatk.tools.walkers.genotyper.UnifiedGenotypingEngine.calculateLikelihoods(UnifiedGenotypingEngine.java:317)
    at org.broadinstitute.gatk.tools.walkers.genotyper.UnifiedGenotypingEngine.calculateLikelihoodsAndGenotypes(UnifiedGenotypingEngine.java:201)
    at org.broadinstitute.gatk.tools.walkers.genotyper.UnifiedGenotyper.map(UnifiedGenotyper.java:379)
    at org.broadinstitute.gatk.tools.walkers.genotyper.UnifiedGenotyper.map(UnifiedGenotyper.java:151)
    at org.broadinstitute.gatk.engine.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:267)
    at org.broadinstitute.gatk.engine.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:255)
    at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:274)
    at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245)
    at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:144)
    at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:92)
    at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:48)
    at org.broadinstitute.gatk.engine.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:99)
    at org.broadinstitute.gatk.engine.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:319)
    at org.broadinstitute.gatk.engine.CommandLineExecutable.execute(CommandLineExecutable.java:121)
    at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:248)
    at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:155)
    at org.broadinstitute.gatk.engine.CommandLineGATK.main(CommandLineGATK.java:107)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 3.3-0-g37228af):
ERROR
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: HAPLOTYPE_MAX_LENGTH must be > 0 but got 0
ERROR ------------------------------------------------------------------------------------------

Created 2015-03-04 13:15:20 | Updated 2015-03-04 13:26:32 | Tags: error haplotypecaller-in-gvcf 3-3-0 code-exception
Comments (7)

I got the following error when running a single sample (out of several) on one of the cluster nodes. On the head node the error did not occur, but I don't know why.

Does this error sound familiar? If so, link me to the fix / discussion. If not, please do not spend too much time on it.

This is the error log:

```
INFO 21:49:26,894 HelpFormatter - --------------------------------------------------------------------------------
INFO 21:49:26,898 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.3-0-g37228af, Compiled 2014/10/24 01:07:22
INFO 21:49:26,898 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 21:49:26,903 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 21:49:26,909 HelpFormatter - Program Args: -T HaplotypeCaller -R human_g1k_v37.fasta --dbsnp dbsnp_138.b37.vcf -I RCC-ER.bam -stand_call_conf 10.0 -stand_emit_conf 30.0 -o RCC-ER.g.vcf -nct 8 --emitRefConfidence GVCF --variant_index_type LINEAR --variant_index_parameter 128000
INFO 21:49:26,915 HelpFormatter - Executing as mterpstra@targetgcc09-mgmt on Linux 3.0.101-0.7.17-default amd64; Java HotSpot(TM) 64-Bit Server VM 1.7.0_25-b15.
INFO 21:49:26,915 HelpFormatter - Date/Time: 2015/03/03 21:49:26
INFO 21:49:26,915 HelpFormatter - --------------------------------------------------------------------------------
INFO 21:49:26,916 HelpFormatter - --------------------------------------------------------------------------------
INFO 21:49:27,200 GenomeAnalysisEngine - Strictness is SILENT
INFO 21:49:27,354 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 250
INFO 21:49:27,366 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 21:49:27,436 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.07
INFO 21:49:27,477 HCMappingQualityFilter - Filtering out reads with MAPQ < 20
INFO 21:49:27,810 MicroScheduler - Running the GATK in parallel mode with 8 total threads, 8 CPU thread(s) for each of 1 data thread(s), of 48 processors available on this machine
INFO 21:49:27,901 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files
INFO 21:49:28,234 GenomeAnalysisEngine - Done preparing for traversal
INFO 21:49:28,235 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO 21:49:28,235 ProgressMeter - | processed | time | per 1M | | total | remaining
INFO 21:49:28,236 ProgressMeter - Location | active regions | elapsed | active regions | completed | runtime | runtime
INFO 21:49:28,237 HaplotypeCaller - Standard Emitting and Calling confidence set to 0.0 for reference-model confidence output
INFO 21:49:28,237 HaplotypeCaller - All sites annotated with PLs forced to true for reference-model confidence output
INFO 21:49:28,445 HaplotypeCaller - Using global mismapping rate of 45 => -4.5 in log10 likelihood units
INFO 21:49:28,447 PairHMM - Performance profiling for PairHMM is disabled because HaplotypeCaller is being run with multiple threads (-nct>1) option Profiling is enabled only when running in single thread mode
INFO 21:49:51,883 VectorLoglessPairHMM - libVectorLoglessPairHMM unpacked successfully from GATK jar file
INFO 21:49:51,884 VectorLoglessPairHMM - Using vectorized implementation of PairHMM
INFO 21:49:55,536 GATKRunReport - Uploaded run statistics report to AWS S3

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.NullPointerException
    at java.lang.String.checkBounds(String.java:374)
    at java.lang.String.<init>(String.java:314)
    at htsjdk.samtools.util.StringUtil.bytesToString(StringUtil.java:301)
    at htsjdk.samtools.BAMRecord.decodeReadName(BAMRecord.java:331)
    at htsjdk.samtools.BAMRecord.getReadName(BAMRecord.java:220)
    at org.broadinstitute.gatk.tools.walkers.haplotypecaller.readthreading.ReadThreadingGraph.addRead(ReadThreadingGraph.java:585)
    at org.broadinstitute.gatk.tools.walkers.haplotypecaller.readthreading.ReadThreadingAssembler.createGraph(ReadThreadingAssembler.java:178)
    at org.broadinstitute.gatk.tools.walkers.haplotypecaller.readthreading.ReadThreadingAssembler.assemble(ReadThreadingAssembler.java:117)
    at org.broadinstitute.gatk.tools.walkers.haplotypecaller.LocalAssemblyEngine.runLocalAssembly(LocalAssemblyEngine.java:169)
    at org.broadinstitute.gatk.tools.walkers.haplotypecaller.HaplotypeCaller.assembleReads(HaplotypeCaller.java:1163)
    at org.broadinstitute.gatk.tools.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:1000)
    at org.broadinstitute.gatk.tools.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:221)
    at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:709)
    at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:705)
    at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler$ReadMapReduceJob.run(NanoScheduler.java:471)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 3.3-0-g37228af):
ERROR
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: Code exception (see stack trace for error itself)
ERROR ------------------------------------------------------------------------------------------

```


Created 2015-02-13 11:14:24 | Updated | Tags: bam gatk error
Comments (2)

Hi,

with GATK 3.3-0 I am confronted with an error that was present in a much older version, but seemed resolved about a year ago:

ERROR MESSAGE: There is no space left on the device, so writing failed

There is 8TB left on the drive, no user limit. Sometimes re-running the exact same job works, sometimes not. Some jobs keep failing despite asking for an insane amount of memory on the cluster, given these are RNAseq bam files, the largest one being less than 7GB.

For example:

qsub -b y -cwd -N step3_145 -o step3_145.o -e step3_145.e -V -l h_vmem=40G /share/apps/java/oracle/1.8.0_11/bin/java -Xmx35G -jar /data/home/hhx037/GATK-3.3.0/GenomeAnalysisTK.jar -T SplitNCigarReads -R Homo_sapiens.GRCh37.75.dna.1-22XYMT.fa -I Analyses/file_dedup.bam -o Analyses/file_splittedcigar.bam -rf ReassignOneMappingQuality -RMQF 255 -RMQT 60 -U ALLOW_N_CIGAR_READS

Here is the log:

INFO 10:50:51,568 HelpFormatter - Executing as hhx037@panos1 on Linux 2.6.32-431.1.2.el6.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_11-b12.
INFO 10:50:51,571 HelpFormatter - Date/Time: 2015/02/13 10:50:51
INFO 10:50:51,571 HelpFormatter - --------------------------------------------------------------------------------
INFO 10:50:51,576 HelpFormatter - --------------------------------------------------------------------------------
INFO 10:50:52,503 GenomeAnalysisEngine - Strictness is SILENT
INFO 10:50:52,827 GenomeAnalysisEngine - Downsampling Settings: No downsampling
INFO 10:50:52,861 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 10:50:52,876 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.01
INFO 10:50:53,021 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files
INFO 10:50:53,027 GenomeAnalysisEngine - Done preparing for traversal
INFO 10:50:53,030 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO 10:50:53,030 ProgressMeter - | processed | time | per 1M | | total | remaining
INFO 10:50:53,030 ProgressMeter - Location | reads | elapsed | reads | completed | runtime | runtime
INFO 10:50:53,047 ReadShardBalancer$1 - Loading BAM index data
INFO 10:50:53,050 ReadShardBalancer$1 - Done loading BAM index data
INFO 10:51:23,404 ProgressMeter - 1:1477348 702953.0 30.0 s 43.0 s 0.0% 17.5 h 17.5 h
INFO 10:52:32,660 ProgressMeter - 1:16909108 1202983.0 99.0 s 82.0 s 0.5% 5.0 h 5.0 h
INFO 10:53:09,769 ProgressMeter - 1:21069702 1302985.0 2.3 m 104.0 s 0.7% 5.6 h 5.5 h
INFO 10:53:49,083 ProgressMeter - 1:27951393 1803181.0 2.9 m 97.0 s 0.9% 5.4 h 5.4 h
INFO 10:54:29,275 ProgressMeter - 1:32739969 2103299.0 3.6 m 102.0 s 1.1% 5.7 h 5.6 h
INFO 10:55:09,177 ProgressMeter - 1:36643589 2203300.0 4.3 m 116.0 s 1.2% 6.0 h 5.9 h
INFO 10:55:45,643 ProgressMeter - 1:39854010 2303302.0 4.9 m 2.1 m 1.3% 6.3 h 6.2 h
INFO 10:56:25,147 ProgressMeter - 1:40542516 2403303.0 5.5 m 2.3 m 1.3% 7.0 h 6.9 h
INFO 10:57:10,934 ProgressMeter - 1:40654849 2503322.0 6.3 m 2.5 m 1.3% 8.0 h 7.9 h
INFO 10:57:54,084 ProgressMeter - 1:43162895 2503322.0 7.0 m 2.8 m 1.4% 8.4 h 8.3 h
INFO 10:58:24,149 ProgressMeter - 1:45244391 2703426.0 7.5 m 2.8 m 1.5% 8.6 h 8.4 h
INFO 10:58:56,749 ProgressMeter - 1:53716450 2803427.0 8.1 m 2.9 m 1.7% 7.7 h 7.6 h
INFO 10:59:38,928 ProgressMeter - 1:86821106 3103432.0 8.8 m 2.8 m 2.8% 5.2 h 5.1 h
INFO 11:00:11,337 ProgressMeter - 1:93301870 3303437.0 9.3 m 2.8 m 3.0% 5.1 h 5.0 h
INFO 11:01:13,113 ProgressMeter - 1:115252321 3803590.0 10.3 m 2.7 m 3.7% 4.6 h 4.5 h
INFO 11:02:02,172 ProgressMeter - 1:145441389 4303778.0 11.2 m 2.6 m 4.7% 4.0 h 3.8 h
INFO 11:02:38,237 ProgressMeter - 1:150547232 4703871.0 11.8 m 2.5 m 4.9% 4.0 h 3.8 h
INFO 11:03:09,693 ProgressMeter - 1:153362937 5003904.0 12.3 m 2.5 m 5.0% 4.1 h 3.9 h
INFO 11:03:39,934 ProgressMeter - 1:155984762 5403968.0 12.8 m 2.4 m 5.0% 4.2 h 4.0 h
INFO 11:04:05,477 GATKRunReport - Uploaded run statistics report to AWS S3

ERROR ------------------------------------------------------------------------------------------
ERROR A USER ERROR has occurred (version 3.3-0-g37228af):
ERROR
ERROR This means that one or more arguments or inputs in your command are incorrect.
ERROR The error message below tells you what is the problem.
ERROR
ERROR If the problem is an invalid argument, please check the online documentation guide
ERROR (or rerun your command with --help) to view allowable command-line arguments for this tool.
ERROR
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself.
ERROR
ERROR MESSAGE: There is no space left on the device, so writing failed
ERROR ------------------------------------------------------------------------------------------

I understand temporary files may be large, but not that large. Are the temporary files written to the working directory (as I believe should be the case), or to the GATK installation directory?
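One thing that may be worth checking, offered as a sketch rather than a confirmed diagnosis: on Linux the JVM's default temporary directory (`java.io.tmpdir`) is normally `/tmp`, which can be far smaller than the data drive. The Python snippet below reports free space for a couple of candidate directories. The helper name `free_gib` and the choice of directories are illustrative assumptions, not anything GATK itself provides.

```python
import os
import shutil
import tempfile


def free_gib(path):
    """Return free space, in GiB, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / 2**30


if __name__ == "__main__":
    # Candidate locations (assumed for illustration): the working directory
    # and the system temp directory, which is usually /tmp on Linux.
    candidates = [
        ("working dir", os.getcwd()),
        ("system temp dir", tempfile.gettempdir()),
    ]
    for label, path in candidates:
        print(f"{label:16s} {path}: {free_gib(path):.1f} GiB free")
```

If the temp directory turns out to be the small one, the standard JVM option `-Djava.io.tmpdir=/path/with/space` points Java at a different location. On a cluster, run the check on the compute node where the job executes, since its `/tmp` may differ from the head node's.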

Also, note that I never ran into this problem with the previous version.

Any idea?

Cheers,

Stephane


Created 2015-01-13 09:49:04 | Updated | Tags: haplotypecaller error variant-calling kras
Comments (1)

Hi, I'm using GATK HaplotypeCaller to detect variants in a tumor sample analyzed with Illumina WES in 100X2 paired-end mode. I know the KRAS p.G12 variant (chr12:25398284) is present in my sample, because it was previously seen with Sanger sequencing (KRAS is the oncogenic event in this kind of tumor). I aligned the FASTQ files with bwa after quality control and adapter trimming. In my BAM file I can see the variant at genomic coordinate chr12:25398284, as shown in the picture (IGV):

However, GATK HaplotypeCaller does not call the variant. This is my basic command line:

java -Xmx4g -jar /opt/GenomeAnalysisTK-3.3-0/GenomeAnalysisTK.jar -T HaplotypeCaller -R human_g1k_v37.fasta -I my.bam -stand_call_conf 0 -stand_emit_conf 0 -o output.file

I also tried tuning many parameters, such as -stand_call_conf, -stand_emit_conf, --heterozygosity, --min_base_quality_score, and so on.

This is the pileup string for the chr12:25398284 coordinate in my BAM file:

12 25398284 C 55 ,,T.,,...,,,,,,T.......,t,t.....,,,t..,,,,,,,,,,,,.^],^],^],^], BB@ECAFCECBCBBB@DBCCCDDCABADBDBCCDD@BBEADDEDBCADBB@@BAC

The base quality is good and the mapping quality too, but the haplotypecaller does not determine any active region around the position chr12:25398284
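To put a number on what the pileup shows, the base string can be tallied per allele. The following is a minimal Python sketch of a parser for the samtools pileup base column. The function name `count_pileup_bases` is made up for illustration, and it covers only the common symbols (matches, mismatches, read start/end markers, indel descriptions).

```python
import re
from collections import Counter


def count_pileup_bases(ref, bases):
    """Count observed bases in a samtools pileup base string.

    '.' and ',' are matches to `ref` on the forward/reverse strand and
    letters are mismatches. Read-start markers ('^' plus one mapping-quality
    character), read-end markers ('$') and indel descriptions such as
    '+3ACG' or '-2TT' are skipped.
    """
    counts = Counter()
    i = 0
    while i < len(bases):
        c = bases[i]
        if c == "^":      # read start: skip the marker and its MAPQ character
            i += 2
        elif c == "$":    # read end marker
            i += 1
        elif c in "+-":   # indel: sign, length digits, then that many bases
            m = re.match(r"\d+", bases[i + 1:])
            if m is None:
                raise ValueError(f"malformed indel at offset {i}")
            i += 1 + len(m.group()) + int(m.group())
        else:
            counts[ref.upper() if c in ".," else c.upper()] += 1
            i += 1
    return counts


if __name__ == "__main__":
    # Base string copied from the pileup line quoted in the post.
    bases = ",,T.,,...,,,,,,T.......,t,t.....,,,t..,,,,,,,,,,,,.^],^],^],^],"
    print(count_pileup_bases("C", bases))
```

Run on the string above, it gives the number of reads supporting T versus the reference C, and hence the alternate-allele fraction at the site.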

Any suggestions about the reason for the missed call?

Many thanks, Valentina


Created 2014-12-19 17:35:25 | Updated | Tags: combinevariants vcf error
Comments (8)

Hi GATK team,

I am attempting to combine a HaplotypeCaller generated VCF with some indels called using pindel using the following arguments (GATK v3.3-0-g37228af):

-R /data/shared/ref/b37/human_g1k_v37.fasta -T CombineVariants --variant:GATK var.HiSeqDecember.raw.vcf --variant:pindel pindel_combined.vcf -o var.HiSeqDecember.pindel.raw.vcf -genotypeMergeOptions PRIORITIZE -priority GATK,pindel

However I get the following error:

ERROR MESSAGE: Badly formed variant context at location 1:157718231; getEnd() was 157718235 but this VariantContext contains an END key with value 157718231

The variants in question are (from GATK):

1 157718231 . CAAAT C,CAAATAAAT 2533.56 PASS AC=3,13;AF=0.125,0.542;AN=24;BaseQRankSum=1.762;ClippingRankSum=-0.327;DP=126;FS=0.000;HOMLEN=39;HOMSEQ=AAATAAATAAATAAATAAATAAATAAATAAATAAATAAA;InbreedingCoeff=-0.1260;MLEAC=3,13;MLEAF=0.125,0.542;MQ=70.00;MQ0=0;MQRankSum=0.920;QD=22.22;ReadPosRankSum=-0.893;SOR=0.976;SVLEN=4;SVTYPE=INS;set=Intersection GT:DP:GQ 0/0:10:30 0/2:9:18 2/2:6:18 2/2:5:15 0/1:10:99 0/2:14:99 2/2:8:24 2/2:6:18 2/2:7:21 0/2:17:99 0/1:5:75 0/1:6:27

and (from pindel):

1 157718231 . C CAAAT . PASS AC=2;AF=0.143;AN=14;END=157718231;HOMLEN=39;HOMSEQ=AAATAAATAAATAAATAAATAAATAAATAAATAAATAAA;SVLEN=4;SVTYPE=INS;set=variant3-variant4-variant6-variant7-variant8-variant9-variant10 GT:AD ./. ./. 0/0:0,7 0/0:0,6 ./. 0/0:0,9 0/0:0,8 0/0:0,7 0/0:0,8 1/1:0,12 ./. ./.

It is worth noting that the pindel VCF here was merged together from several pindel-generated VCFs using CombineVariants without any complaint from the GATK. It looks to me that the END key is correct for the pindel variant (a simple insertion), but the GATK is confused due to the mixed deletion/insertion variant generated by the HaplotypeCaller at the same position (without an END key).

I can rerun the command after stripping all END tags from the pindel VCF and the command completes successfully, so this is not a showstopper for me but I assume this is a bug(?) and if so, it would be great if there were a fix.
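The numbers in the error message are consistent with this reading: the end implied by the HaplotypeCaller REF allele is 157718235, while the pindel record carries END=157718231. A small Python sketch of that arithmetic, plus the END-stripping workaround, follows. The function names are illustrative, and the claim that the merged record pairs the longer REF with the pindel END key is an assumption based on the message, not on the GATK source.

```python
def expected_end(pos, ref):
    """End coordinate implied by POS and the REF allele (1-based, inclusive)."""
    return pos + len(ref) - 1


def info_end(info):
    """Return the END value from a VCF INFO string, or None if absent."""
    for field in info.split(";"):
        if field.startswith("END="):
            return int(field[4:])
    return None


def strip_end(info):
    """Remove any END=... entry from a VCF INFO string."""
    kept = [f for f in info.split(";") if not f.startswith("END=")]
    return ";".join(kept) if kept else "."


if __name__ == "__main__":
    # POS and REF are taken from the two records quoted above; the INFO
    # string is an abbreviated copy of the pindel one.
    print(expected_end(157718231, "CAAAT"))  # 157718235 (HaplotypeCaller record)
    print(expected_end(157718231, "C"))      # 157718231 (pindel record)
    pindel_info = "AC=2;AF=0.143;AN=14;END=157718231;SVLEN=4;SVTYPE=INS"
    print(info_end(pindel_info))             # 157718231
    print(strip_end(pindel_info))
```

Applying `strip_end` to the INFO column of every record in the pindel VCF reproduces the workaround described above without touching any other annotation.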

Cheers,

Dave


Created 2014-12-19 01:00:34 | Updated | Tags: realignertargetcreator error
Comments (9)

I keep getting the same error message when running a command in GATK. I am using GATK 3.3. Eventually I would like to call SNPs, but I want to realign around indels first.

To make sure that my input files are correct, I used Picard:

java -jar /programs/ValidateSamFile.jar I=merged_sorted_fixed.bam

No errors were found.

I wrote:

java -jar /programs/GenomeAnalysisTK/GenomeAnalysisTK.jar \
    -T RealignerTargetCreator \
    -R target_sequences.fasta -o merged_output.intervals \
    -I merged_sorted_fixed.bam \
    --minReadsAtLocus 4 \
    --mismatchFraction 0

And this is the output:

" Adding rod class GFF Adding rod class dbSNP Adding rod class HapMapAlleleFrequencies Adding rod class SAMPileup Adding rod class GELI Adding rod class RefSeq Adding rod class Table Adding rod class PooledEM Adding rod class 1KGSNPs Adding rod class SangerSNP Adding rod class HapMapGenotype Adding rod class Intervals Adding rod class Variants 76 [main] INFO org.broadinstitute.sting.gatk.WalkerManager - plugin directory: /scratch/atk25/programs/GenomeAnalysisTK/walkers 100 [main] INFO org.broadinstitute.sting.gatk.WalkerManager - * Adding module CountLoci 101 [main] INFO org.broadinstitute.sting.gatk.WalkerManager - * Adding module PrintReads 101 [main] INFO org.broadinstitute.sting.gatk.WalkerManager - * Adding module Pileup 101 [main] INFO org.broadinstitute.sting.gatk.WalkerManager - * Adding module DepthOfCoverage 105 [main] INFO org.broadinstitute.sting.gatk.WalkerManager - * Adding module ValidatingPileup 106 [main] INFO org.broadinstitute.sting.gatk.WalkerManager - * Adding module CountReads 139 [main] FATAL root - Exception caught by base Command Line Program, with message: null 139 [main] FATAL root - with cause: null java.lang.NullPointerException at org.broadinstitute.sting.gatk.WalkerManager.createWalkerByName(WalkerManager.java:93) at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.getWalkerByName(GenomeAnalysisEngine.java:157) at org.broadinstitute.sting.gatk.CommandLineGATK.getArgumentSources(CommandLineGATK.java:117) at org.broadinstitute.sting.utils.cmdLine.CommandLineProgram.start(CommandLineProgram.java:199) at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:58) java.lang.RuntimeException: java.lang.NullPointerException at org.broadinstitute.sting.utils.cmdLine.CommandLineProgram.start(CommandLineProgram.java:279) at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:58) Caused by: java.lang.NullPointerException at org.broadinstitute.sting.gatk.WalkerManager.createWalkerByName(WalkerManager.java:93) 
at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.getWalkerByName(GenomeAnalysisEngine.java:157) at org.broadinstitute.sting.gatk.CommandLineGATK.getArgumentSources(CommandLineGATK.java:117) at org.broadinstitute.sting.utils.cmdLine.CommandLineProgram.start(CommandLineProgram.java:199)

... 1 more

An error has occurred. Please check your command line arguments for any typos or inconsistencies."

I'd really appreciate any help!


Created 2014-12-17 18:04:27 | Updated | Tags: haplotypecaller gatk error rnaseq genotyping genotyping-mode
Comments (1)

Hi, I'm currently trying to use GATK to call variants from Human RNA seq data

So far, I've managed to do variant calling in all my samples following the GATK best practice guidelines. (using HaplotypeCaller in DISCOVERY mode on each sample separately)

But I'd like to go further and get the genotype, in every sample, of each variant found in at least one sample. The aim is to distinguish, for each variant, samples where that variant is absent (homozygous for the reference allele) from samples where the site is not covered (and therefore not genotyped).

To do so, I first used CombineVariants to merge the variants from all my samples and create the list of variants to be genotyped, ${ALLELES}.vcf.

I then try to regenotype my samples with HaplotypeCaller using the GENOTYPE_GIVEN_ALLELES mode and the same settings as before. My command is the following:

java -jar ${GATKPATH}/GenomeAnalysisTK.jar -T HaplotypeCaller -R ${GENOMEFILE}.fa -I ${BAMFILE_CALIB}.bam --genotyping_mode GENOTYPE_GIVEN_ALLELES -alleles ${ALLELES}.vcf -out_mode EMIT_ALL_SITES -dontUseSoftClippedBases -stand_emit_conf 20 -stand_call_conf 20 -o ${SAMPLE}_genotypes_all_variants.vcf -mbq 25 -L ${CDNA_BED}.bed --dbsnp ${DBSNP}.vcf


In doing so I invariably get the same error after calling 0.2% of the genome.

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.IndexOutOfBoundsException: Index: 3, Size: 3 at java.util.ArrayList.rangeCheck(ArrayList.java:635) at java.util.ArrayList.get(ArrayList.java:411) at htsjdk.variant.variantcontext.VariantContext.getAlternateAllele(VariantContext.java:845) at org.broadinstitute.gatk.tools.walkers.haplotypecaller.HaplotypeCallerGenotypingEngine.assignGenotypeLikelihoods(HaplotypeCallerGenotypingEngine.java:248) at org.broadinstitute.gatk.tools.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:1059) at org.broadinstitute.gatk.tools.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:221) at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:709) at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:705) at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:274) at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245) at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.java:274) at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.java:78) at org.broadinstitute.gatk.engine.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:99) at org.broadinstitute.gatk.engine.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:319) at org.broadinstitute.gatk.engine.CommandLineExecutable.execute(CommandLineExecutable.java:121) at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:248) at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:155) at org.broadinstitute.gatk.engine.CommandLineGATK.main(CommandLineGATK.java:107)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 3.3-0-g37228af):
ERROR
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: Index: 3, Size: 3
ERROR ------------------------------------------------------------------------------------------

Because the problem seemed to originate from getAlternateAllele, I tried setting --max_alternate_alleles to 2 or 10, without success. I also checked my ${ALLELES}.vcf file for malformed alternate alleles in the region where the GATK crashes (chr 1, somewhere after 78 Mb), but I couldn't identify any. (I searched for alternate alleles that do not match the extended regular expression '[ATGC,]+'.)
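The ALT-column check described above can be sketched in Python as follows. This version swaps in a slightly stricter pattern than '[ATGC,]+', so that empty alleles such as `A,,T` are also flagged. The function name and the sample lines are made up for illustration, and nothing here is claimed to identify the actual cause of the IndexOutOfBoundsException.

```python
import re

# One or more plain A/C/G/T alleles separated by single commas.
ALT_PATTERN = re.compile(r"[ACGT]+(,[ACGT]+)*")


def suspicious_alt_records(vcf_lines):
    """Yield (line_number, CHROM, POS, ALT) for records whose ALT column
    is not a comma-separated list of plain A/C/G/T alleles."""
    for n, line in enumerate(vcf_lines, start=1):
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        cols = line.rstrip("\n").split("\t")
        chrom, pos, alt = cols[0], cols[1], cols[4]
        if not ALT_PATTERN.fullmatch(alt):
            yield n, chrom, pos, alt


if __name__ == "__main__":
    # Made-up records, for illustration only.
    demo = [
        "##fileformat=VCFv4.1\n",
        "1\t100\t.\tA\tT,G\t.\t.\t.\n",
        "1\t300\t.\tAT\tA,,T\t.\t.\t.\n",
    ]
    for hit in suspicious_alt_records(demo):
        print(hit)
```

To scan the real file, pass an open file handle for ${ALLELES}.vcf instead of the `demo` list. Note that symbolic or spanning-deletion alleles would also be reported by this pattern.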

I would be grateful for any help you can provide. Thanks.


Created 2014-11-27 09:23:12 | Updated 2014-11-27 09:43:50 | Tags: depthofcoverage error
Comments (1)

I'm getting the following error in GATK 3.2-2:

MESSAGE: SAM/BAM file /home/dwragg/work/Analysis/TEST/AOC35_ATCACG_L008/AOC35_ATCACG_L008_bootstrap.bam is malformed: Program record with group id GATK PrintReads already exists in SAMFileHeader!

When attempting to calculate depth of coverage:

java -d64 -jar ${GATK}/GenomeAnalysisTK.jar \
    -T DepthOfCoverage \
    -R ${REF} \
    -I ${OUT}/${SAMPLE}/${SAMPLE}_bootstrap.bam \
    -o ${OUT}/${SAMPLE}/metrics/${SAMPLE}_GATKcov \
    -ct 2 -ct 5 -ct 8 \
    --omitDepthOutputAtEachBase \
    --omitIntervalStatistics \
    --omitLocusTable \
    -l FATAL

I'm assuming this is not normal as everything was working fine in 3.1-1. Ironically I recently switched to 3.2-2 because the RealignerTargetCreator was giving me an Unsupported major.minor version 51.0 error. I'm running Java 1.7.0-b147.
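For reference, `Unsupported major.minor version 51.0` refers to the Java class-file format: major version 51 is the Java 7 format, and the message is raised when the JVM running the program is older than that. A small illustrative helper (the name is hypothetical) that maps the major version to a Java release:

```python
def java_release_for_class_version(major):
    """Map a class-file major version to the Java release that introduced it.

    Major version 45 corresponds to Java 1.1 and each later release adds one,
    so 50 is Java 6, 51 is Java 7 and 52 is Java 8.
    """
    if major < 45:
        raise ValueError(f"not a valid class-file major version: {major}")
    return major - 44


if __name__ == "__main__":
    print(java_release_for_class_version(51))  # 7
```

So that error suggests the `java` executable actually picked up for the RealignerTargetCreator run was older than Java 7, whatever version is installed elsewhere. Running `java -version` in the same environment shows which one is in use.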

[EDIT]

I've since noticed that the BAM file was created with GATK version 2.4-9-g532efad. So the 'current' and 'latest' software filing system in place here appears to have failed. I can confirm that GATK v3.2-2 DepthOfCoverage tool generates the above program record group id error. I'll chase the administrators down to ensure the latest GATK version is installed and start from scratch.
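A quick way to see which program records collide is to list the @PG IDs in the BAM header (for example from the output of `samtools view -H`). The Python sketch below does that on header text. The function name and the sample header are illustrative assumptions, not taken from the file in question.

```python
from collections import Counter


def duplicate_pg_ids(header_text):
    """Return the @PG IDs that occur more than once in a SAM header,
    mapped to the number of times each appears."""
    ids = []
    for line in header_text.splitlines():
        if not line.startswith("@PG"):
            continue
        for field in line.split("\t")[1:]:
            if field.startswith("ID:"):
                ids.append(field[3:])
    return {pg_id: n for pg_id, n in Counter(ids).items() if n > 1}


if __name__ == "__main__":
    # Made-up header, for illustration only.
    header = (
        "@HD\tVN:1.4\tSO:coordinate\n"
        "@PG\tID:bwa\tPN:bwa\n"
        "@PG\tID:GATK PrintReads\tVN:2.4-9-g532efad\n"
        "@PG\tID:GATK PrintReads\tVN:3.2-2\n"
    )
    print(duplicate_pg_ids(header))
```

An empty result means every @PG ID in the header is unique.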


Created 2014-11-25 16:32:43 | Updated | Tags: unifiedgenotyper error
Comments (3)

When I run this command:

java -jar /storage/app/GATK_3.3/GenomeAnalysisTK.jar -R /storage/data/gabriel/ucsc/hg19/hg19.fasta -T UnifiedGenotyper -I SRR493939.valid.bam -o teste2.vcf

This error happens:

WARN 04:34:22,257 RestStorageService - Error Response: PUT '/DAXKr9W5oe2LWMJhFdSveQd50kAK6tex.report.xml.gz' -- ResponseCode: 403, ResponseStatus: Forbidden, Request Headers: [Content-Length: 393, Content-MD5: 33uF1RoVpa9w9uTWFUxP0g==, Content-Type: application/octet-stream, x-amz-meta-md5-hash: df7b85d51a15a5af70f6e4d6154c4fd2, Date: Tue, 25 Nov 2014 06:34:20 GMT, Authorization: AWS AKIAI22FBBJ37D5X62OQ:agZvhjUckV8Jnth+aubQX9KViUo=, User-Agent: JetS3t/0.8.1 (Linux/3.13.0-30-generic; amd64; en; JVM 1.7.0_65), Host: broad.gsa.gatk.run.reports.s3.amazonaws.com, Expect: 100-continue], Response Headers: [x-amz-request-id: B6DD058B64A98870, x-amz-id-2: ADKbyes3Qb8ovjofT4tKpP3gV3yPuc9v5C157qzTkfII2q3U2ZwEpUgzy0eIOz+J, Content-Type: application/xml, Transfer-Encoding: chunked, Date: Tue, 25 Nov 2014 14:39:19 GMT, Connection: close, Server: AmazonS3] WARN 04:34:22,846 RestStorageService - Adjusted time offset in response to RequestTimeTooSkewed error. Local machine and S3 server disagree on the time by approximately 29096 seconds. Retrying connection.

Can someone help me?
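The two `Date` headers in the warning above already show the size of the clock disagreement. A short Python sketch that computes it (the function name is illustrative):

```python
from email.utils import parsedate_to_datetime


def clock_skew_seconds(request_date, response_date):
    """Seconds by which the server's Date header is ahead of the client's."""
    sent = parsedate_to_datetime(request_date)
    received = parsedate_to_datetime(response_date)
    return int((received - sent).total_seconds())


if __name__ == "__main__":
    # Both values are copied from the request and response headers in the log.
    skew = clock_skew_seconds("Tue, 25 Nov 2014 06:34:20 GMT",
                              "Tue, 25 Nov 2014 14:39:19 GMT")
    print(skew)  # 29099
```

That is roughly eight hours, in line with the "approximately 29096 seconds" reported, which suggests the local machine's clock or time zone setting is off. The WARN lines concern the upload of the run report to S3 rather than the UnifiedGenotyper analysis itself.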


Created 2014-09-12 19:05:19 | Updated 2014-09-12 19:07:12 | Tags: variantrecalibrator error retrieve result
Comments (2)

My command line is as follows:

 java -Xmx8g -jar $CLASSPATH/GenomeAnalysisTK.jar \
 -T VariantRecalibrator \
 -R $GenomeReference \
 -input $InputVCF \
 -nt 6 \
 -resource:mills,known=true,training=true,truth=true,prior=12.0 $resource1 \
 -resource:dbsnp,known=true,training=false,truth=false,prior=2.0 $resource2 \
 -an DP -an QD -an MQRankSum -an ReadPosRankSum -an FS -an MQ \
 --maxGaussians 8 \
 -mode INDEL \
 -tranche 100.0 -tranche 99.9 -tranche 99.0 -tranche 90.0 \
 -log $IndelsOutput/Indels.log \
 -recalFile $IndelsOutput/exome.indels.vcf.recal \
 -tranchesFile $IndelsOutput/exome.indels.tranches \
 -rscriptFile $IndelsOutput/exome.indels.recal.plots.R

But I got the following error when running this:

##### ERROR ------------------------------------------------------------------------------------------
##### ERROR stack trace 
org.broadinstitute.gatk.utils.exceptions.ReviewedGATKException: Unable to retrieve result
    at org.broadinstitute.gatk.engine.executive.HierarchicalMicroScheduler.execute(HierarchicalMicroScheduler.java:190)
    at org.broadinstitute.gatk.engine.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:314)
    at org.broadinstitute.gatk.engine.CommandLineExecutable.execute(CommandLineExecutable.java:121)
    at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:248)
    at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:155)
    at org.broadinstitute.gatk.engine.CommandLineGATK.main(CommandLineGATK.java:107)
Caused by: java.lang.IllegalArgumentException: No data found.
    at org.broadinstitute.gatk.tools.walkers.variantrecalibration.VariantRecalibratorEngine.generateModel(VariantRecalibratorEngine.java:83)
    at org.broadinstitute.gatk.tools.walkers.variantrecalibration.VariantRecalibrator.onTraversalDone(VariantRecalibrator.java:392)
    at org.broadinstitute.gatk.tools.walkers.variantrecalibration.VariantRecalibrator.onTraversalDone(VariantRecalibrator.java:138)
    at org.broadinstitute.gatk.engine.executive.HierarchicalMicroScheduler.notifyTraversalDone(HierarchicalMicroScheduler.java:226)
    at org.broadinstitute.gatk.engine.executive.HierarchicalMicroScheduler.execute(HierarchicalMicroScheduler.java:183)
    ... 5 more
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A GATK RUNTIME ERROR has occurred (version 3.2-2-gec30cee):

So I suspect this is a bug in GATK 3.2-2?


Created 2014-08-19 14:40:22 | Updated | Tags: gatk error illegalargumentexception
Comments (14)

Hi, I suppose there is a problem with my BAM file, but I don't know what I should look for.

Any help appreciated.

ERROR stack trace

java.lang.IllegalArgumentException: -1 does not represent a valid base character at org.broadinstitute.gatk.utils.genotyper.DiploidGenotype.createDiploidGenotype(DiploidGenotype.java:115) at org.broadinstitute.gatk.tools.walkers.genotyper.SNPGenotypeLikelihoodsCalculationModel.getLikelihoods(SNPGenotypeLikelihoodsCalculationModel.java:177) at org.broadinstitute.gatk.tools.walkers.genotyper.UnifiedGenotypingEngine.calculateLikelihoods(UnifiedGenotypingEngine.java:305)

See the attached file for the complete stack trace.


Created 2014-07-16 12:35:30 | Updated | Tags: haplotypecaller error
Comments (7)

Hi,

I get an error when using HaplotypeCaller (GATK version 3.2 and latest nightly build) on a specific BAM File:

java.lang.IndexOutOfBoundsException: Index: 28, Size: 6
    at java.util.LinkedList.checkElementIndex(Unknown Source)
    at java.util.LinkedList.get(Unknown Source)
    at org.broadinstitute.gatk.tools.walkers.haplotypecaller.readthreading.DanglingChainMergingGraph.mergeDanglingTail(DanglingChainMergingGraph.java:272)
    at org.broadinstitute.gatk.tools.walkers.haplotypecaller.readthreading.DanglingChainMergingGraph.recoverDanglingTail(DanglingChainMergingGraph.java:184)
    at org.broadinstitute.gatk.tools.walkers.haplotypecaller.readthreading.DanglingChainMergingGraph.recoverDanglingTails(DanglingChainMergingGraph.java:131)
    at org.broadinstitute.gatk.tools.walkers.haplotypecaller.readthreading.ReadThreadingAssembler.createGraph(ReadThreadingAssembler.java:202)
    at org.broadinstitute.gatk.tools.walkers.haplotypecaller.readthreading.ReadThreadingAssembler.assemble(ReadThreadingAssembler.java:114)
    at org.broadinstitute.gatk.tools.walkers.haplotypecaller.LocalAssemblyEngine.runLocalAssembly(LocalAssemblyEngine.java:164)
    at org.broadinstitute.gatk.tools.walkers.haplotypecaller.HaplotypeCaller.assembleReads(HaplotypeCaller.java:1022)
    at org.broadinstitute.gatk.tools.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:882)
    at org.broadinstitute.gatk.tools.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:218)
    at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:708)
    at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:704)
    at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:274)
    at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245)
    at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.java:273)
    at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.java:78)
    at org.broadinstitute.gatk.engine.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:99)
    at org.broadinstitute.gatk.engine.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:314)
    at org.broadinstitute.gatk.engine.CommandLineExecutable.execute(CommandLineExecutable.java:121)
    at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:248)
    at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:155)
    at org.broadinstitute.gatk.engine.CommandLineGATK.main(CommandLineGATK.java:107

It occurs in both normal VCF and GVCF mode:

    java -XX:ParallelGCThreads=1 -Xmx4g -jar GenomeAnalysisTK.jar \
        -R hg19.fa -I error.bam \
        -L test2.bed \
        -T HaplotypeCaller -o test.gatk.ontarget.vcf

    java -XX:ParallelGCThreads=1 -Xmx4g -jar GenomeAnalysisTK.jar \
        -R hg19.fa -I error.bam \
        -L test2.bed \
        -T HaplotypeCaller --emitRefConfidence GVCF --variant_index_type LINEAR --variant_index_parameter 128000 -o test.gatk.ontarget.gvcf

The BAM file was produced according to the best practice guide. I narrowed the error down to 15 reads (see attached files).

Best Regards, Thomas


Created 2014-06-12 12:20:37 | Updated | Tags: error
Comments (1)

Hi,

I tried many times to download Genome STRiP, but I got an error saying: "No File. The server could not find the file. The file you attempted to download may not be valid. Contact the server administrator." I also tried different browsers. Can anyone help me? Thanks in advance.

Cheng


Created 2014-05-05 16:45:37 | Updated | Tags: haplotypecaller error
Comments (4)

I'm having trouble running HaplotypeCaller on a few of my samples. Before going full scale on all of my samples, I am testing the pipeline on a small number of them. When I run HaplotypeCaller on my samples individually with the command below, I get a stack trace error (also below). Usually I can figure out stack trace errors and correct them, but I am having trouble with this one. The program runs for a time and then errors out (sometimes it runs longer than other times). It doesn't occur for every sample, and when it does occur, re-running the same individual with the same command sometimes goes to completion without errors. Other times I have to run that individual/command multiple times before it completes without errors. Any insight would be greatly appreciated.

Command: java -Xmx16g -jar ~/programs/GenomeAnalysisTK.jar -T HaplotypeCaller -R ../miliaris_ref.fa -I realigned/27861_realigned.bam --emitRefConfidence GVCF --variant_index_type LINEAR --variant_index_parameter 128000 --max_alternate_alleles 2 -mbq 30 -recoverDanglingHeads -dontUseSoftClippedBases -stand_call_conf 4.0 -stand_emit_conf 3.0 -o rawSNPS_Q4/27861_rawSNPS_Q4.vcf -nct 8 -A HaplotypeScore -A FisherStrand -A BaseQualityRankSumTest -A MappingQualityRankSumTest -A ReadPosRankSumTest -A QualByDepth -A VariantType -A LowMQ

Error:

. . .
INFO 12:36:15,578 ProgressMeter - comp45464_c0_seq1:533 2.82e+06 2.5 m 53.0 s 7.1% 35.2 m 32.7 m
INFO 12:36:45,579 ProgressMeter - comp45609_c0_seq1:269 2.84e+06 3.0 m 63.0 s 7.2% 41.9 m 38.9 m
INFO 12:37:15,580 ProgressMeter - comp48164_c0_seq1:493 3.09e+06 3.5 m 67.0 s 7.8% 44.9 m 41.4 m
WARN 12:37:24,038 ExactAFCalc - this tool is currently set to genotype at most 2 alternate alleles in a given context, but the context at comp48440_c0_seq1:357 has 6 alter$
INFO 12:37:45,581 ProgressMeter - comp53167_c0_seq2:450 3.67e+06 4.0 m 65.0 s 9.2% 43.3 m 39.3 m

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.NullPointerException
    at java.lang.String.checkBounds(String.java:374)
    at java.lang.String.<init>(String.java:314)
    at net.sf.samtools.util.StringUtil.bytesToString(StringUtil.java:301)
    at net.sf.samtools.BAMRecord.decodeReadName(BAMRecord.java:331)
    at net.sf.samtools.BAMRecord.getReadName(BAMRecord.java:220)
    at org.broadinstitute.sting.gatk.walkers.haplotypecaller.readthreading.ReadThreadingGraph.addRead(ReadThreadingGraph.java:543)
    at org.broadinstitute.sting.gatk.walkers.haplotypecaller.readthreading.ReadThreadingAssembler.createGraph(ReadThreadingAssembler.java:163)
    at org.broadinstitute.sting.gatk.walkers.haplotypecaller.readthreading.ReadThreadingAssembler.assemble(ReadThreadingAssembler.java:112)
    at org.broadinstitute.sting.gatk.walkers.haplotypecaller.LocalAssemblyEngine.runLocalAssembly(LocalAssemblyEngine.java:168)
    at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.assembleReads(HaplotypeCaller.java:961)
    at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:825)
    at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:141)
    at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:708)
    at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:704)
    at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler$ReadMapReduceJob.run(NanoScheduler.java:471)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:722)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 3.1-1-g07a4bf8):
ERROR
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk

Created 2014-04-06 19:36:24 | Updated 2014-04-06 19:39:05 | Tags: realignertargetcreator error incompatible-contigs
Comments (6)

Hello All,

I am running RealignerTargetCreator using GATK version GenomeAnalysisTK-1.2-4-gd9ea764 and I am getting the following error: `

ERROR MESSAGE: Input files reads and reference have incompatible contigs: Found contigs with the same name but different lengths:
ERROR contig reads = scaffold69676_size1796 / 3149
ERROR contig reference = scaffold69676_size1796 / 1758.
ERROR reads contigs = [scaffold1_size320545, scaffold2_size291774, scaffold3_size284740..........`

I have already checked that I am using the right reference FASTA file and the correct .bam file, the same ones I used for alignment before, so I am at a loss as to why I am getting this error. I would appreciate your help with this problem. Any suggestion is welcome.

Thanks, Namrata
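
One way to see which contigs disagree is to compare the lengths recorded in the BAM header with those in the reference index. The following is a minimal sketch, not a GATK feature: it assumes the header text was obtained with something like `samtools view -H` and that the reference has a `.fai` index from `samtools faidx`. The function names are hypothetical, and the sample data only reuses the contig name and lengths from the error message above.

```python
def parse_fai(fai_text):
    """Return {contig: length} from the text of a FASTA .fai index."""
    lengths = {}
    for line in fai_text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        # Column 1 is the contig name, column 2 its length in bases.
        lengths[fields[0]] = int(fields[1])
    return lengths


def parse_sam_header(header_text):
    """Return {contig: length} from the @SQ lines of a SAM/BAM header."""
    lengths = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        tags = dict(field.split(":", 1) for field in line.split("\t")[1:])
        lengths[tags["SN"]] = int(tags["LN"])
    return lengths


def mismatched_contigs(bam_lengths, ref_lengths):
    """List (name, bam_length, ref_length) for shared contigs whose lengths differ."""
    return [
        (name, bam_len, ref_lengths[name])
        for name, bam_len in bam_lengths.items()
        if name in ref_lengths and ref_lengths[name] != bam_len
    ]


# Illustrative data only (hypothetical header and index contents).
header = "@HD\tVN:1.0\tSO:coordinate\n@SQ\tSN:scaffold69676_size1796\tLN:3149\n"
fai = "scaffold69676_size1796\t1758\t25\t60\t61\n"
print(mismatched_contigs(parse_sam_header(header), parse_fai(fai)))
```

If this reports any contig, the BAM was most likely aligned against a different version of the reference than the FASTA now being passed with -R, though that is an inference rather than something the error message states.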


Created 2014-03-27 21:34:25 | Updated | Tags: variantrecalibrator error
Comments (2)

I am analyzing Ion PGM data.

I'm getting 25 variants in one individual with HaplotypeCaller. When I try to run VariantRecalibrator, I get this error:

WARN 18:26:10,578 VariantDataManager - WARNING: Training with very few variant sites! Please check the model reporting PDF to ensure the quality of the model is reliable.
INFO 18:26:10,580 GaussianMixtureModel - Initializing model with 100 k-means iterations...
INFO 18:26:10,636 VariantRecalibratorEngine - Finished iteration 0.
INFO 18:26:10,639 VariantRecalibratorEngine - Convergence after 3 iterations!
INFO 18:26:10,640 VariantDataManager - Training with worst 0 scoring variants --> variants with LOD <= -5.0000.
INFO 18:26:11,854 ProgressMeter - chrM:14784 8.87e+06 30.0 s 3.0 s 100.0% 30.0 s 0.0 s
INFO 18:26:12,444 GATKRunReport - Uploaded run statistics report to AWS S3

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.IllegalArgumentException: No data found.
    at org.broadinstitute.sting.gatk.walkers.variantrecalibration.VariantRecalibratorEngine.generateModel(VariantRecalibratorEngine.java:83)
    at org.broadinstitute.sting.gatk.walkers.variantrecalibration.VariantRecalibrator.onTraversalDone(VariantRecalibrator.java:392)
    at org.broadinstitute.sting.gatk.walkers.variantrecalibration.VariantRecalibrator.onTraversalDone(VariantRecalibrator.java:138)
    at org.broadinstitute.sting.gatk.executive.Accumulator$StandardAccumulator.finishTraversal(Accumulator.java:129)
    at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:116)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:313)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:121)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:248)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:155)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:107)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 3.1-1-g07a4bf8):
ERROR
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: No data found.
ERROR ------------------------------------------------------------------------------------------

The command line I used was:

java -jar /home/horus/Instaladores/GenomeAnalysisTK-3.1-1/GenomeAnalysisTK.jar -T VariantRecalibrator -R /home/horus/Escritorio/PGM/primirna/references/hg19usar.fa -input 1_raw_variants.vcf -resource:hapmap,known=false,training=true,truth=true,prior=15.0 /home/horus/Escritorio/PGM/primirna/references/hapmap_3.3.b37.vcf_nuevo -resource:omni,known=false,training=true,truth=true,prior=12.0 /home/horus/Escritorio/PGM/primirna/references/1000G_omni2.5.b37.vcf_nuevo -resource:1000G,known=false,training=true,truth=false,prior=10.0 /home/horus/Escritorio/PGM/primirna/references/Mills_and_1000G_gold_standard.indels.b37.vcf_nuevo -resource:dbsnp,known=true,training=false,truth=false,prior=2.0 /home/horus/Escritorio/PGM/primirna/references/dbsnp_138.hg19.vcf -an DP -an QD -an FS -an MQRankSum -an ReadPosRankSum -mode SNP -tranche 100.0 -tranche 99.9 -tranche 99.0 -tranche 90.0 -recalFile 1_recalibrate_SNP.recal -tranchesFile 1_recalibrate_SNP.tranches
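
Since the log warns about "Training with very few variant sites" and the post mentions only 25 variants, a quick sanity check before running VariantRecalibrator is to count the records in the input VCF. This is only a sketch with a hypothetical helper name. Whether the small call set is what triggers "No data found" here is an assumption, and the example lines are made up for illustration.

```python
def count_vcf_records(lines):
    """Count data records (non-empty, non-header lines) in an iterable of VCF lines."""
    return sum(1 for line in lines if line.strip() and not line.startswith("#"))


# Illustrative lines only; a real file would be passed as an open file handle.
example = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chrM\t150\t.\tT\tC\t50\t.\t.",
    "chrM\t195\t.\tC\tT\t60\t.\t.",
]
print(count_vcf_records(example))
```

For the file in the command above, this could be called as `count_vcf_records(open("1_raw_variants.vcf"))`.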


Created 2014-03-19 20:13:36 | Updated | Tags: variantrecalibrator error
Comments (3)

Here is my error. I launched the same command on 3 VCF files generated by the same UnifiedGenotyper command.

thx!

`java -Djava.io.tmpdir=/scratch/ymb-542-aa/temp -Xmx22g -jar /rap/ymb-542-aa/utils/tools/dna_tools/post_alignment_tools/GenomeAnalysisTK-2.8-1/GenomeAnalysisTK.jar -nt 8 -et NO_ET -K /rap/ymb-542-aa/utils/tools/dna_tools/post_alignment_tools/GenomeAnalysisTK-2.8-1/raphael.poujol_umontreal.ca.key -T VariantRecalibrator -R /scratch/ymb-542-aa/pancreas/Genomes/hg19.fa -input /scratch/ymb-542-aa/pancreas/Calling/sub1_focus.haplo.vcf -mode SNP -resource:hapmap,known=false,training=true,truth=true,prior=15.0 /scratch/ymb-542-aa/pancreas/Genomes/hapmap_3.3.hg19.vcf -resource:omni,known=false,training=true,truth=false,prior=10.0 /scratch/ymb-542-aa/pancreas/Genomes/1000G_omni2.5.hg19.vcf -resource:dbsnp,known=true,training=false,truth=false,prior=2.0 /scratch/ymb-542-aa/pancreas/Genomes/dbsnp_138.hg19_noXY.vcf -an QD -an DP -an FS -an MQ -tranche 100.0 -tranche 99.9 -tranche 99.0 -tranche 90.0
-recalFile /scratch/ymb-542-aa/pancreas/Calling/sub1_focus.snps.recal -tranchesFile /scratch/ymb-542-aa/pancreas/Calling/sub1_focus.snps.tranches -rscriptFile /scratch/ymb-542-aa/pancreas/Calling/sub1_focus.snps.plots.RINFO 20:41:18,337 HelpFormatter - -------------------------------------------------------------------------------- INFO 20:41:18,338 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.8-1-g932cd3a, Compiled 2013/12/06 16:47:15 INFO 20:41:18,339 HelpFormatter - Copyright (c) 2010 The Broad Institute INFO 20:41:18,339 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk INFO 20:41:18,342 HelpFormatter - Program Args: -nt 8 -et NO_ET -K /rap/ymb-542-aa/utils/tools/dna_tools/post_alignment_tools/GenomeAnalysisTK-2.8-1/raphael.poujol_umontreal.ca.key -T VariantRecalibrator -R /scratch/ymb-542-aa/pancreas/Genomes/hg19.fa -input /scratch/ymb-542-aa/pancreas/Calling/sub1_focus.haplo.vcf -mode SNP -resource:hapmap,known=false,training=true,truth=true,prior=15.0 /scratch/ymb-542-aa/pancreas/Genomes/hapmap_3.3.hg19.vcf -resource:omni,known=false,training=true,truth=false,prior=10.0 /scratch/ymb-542-aa/pancreas/Genomes/1000G_omni2.5.hg19.vcf -resource:dbsnp,known=true,training=false,truth=false,prior=2.0 /scratch/ymb-542-aa/pancreas/Genomes/dbsnp_138.hg19_noXY.vcf -an QD -an DP -an FS -an MQ -tranche 100.0 -tranche 99.9 -tranche 99.0 -tranche 90.0 -recalFile /scratch/ymb-542-aa/pancreas/Calling/sub1_focus.snps.recal -tranchesFile /scratch/ymb-542-aa/pancreas/Calling/sub1_focus.snps.tranches -rscriptFile /scratch/ymb-542-aa/pancreas/Calling/sub1_focus.snps.plots.R INFO 20:41:18,342 HelpFormatter - Date/Time: 2014/03/18 20:41:18 INFO 20:41:18,342 HelpFormatter - -------------------------------------------------------------------------------- INFO 20:41:18,342 HelpFormatter - -------------------------------------------------------------------------------- INFO 20:41:18,367 ArgumentTypeDescriptor - Dynamically determined type of 
/scratch/ymb-542-aa/pancreas/Calling/sub1_focus.haplo.vcf to be VCF INFO 20:41:18,395 ArgumentTypeDescriptor - Dynamically determined type of /scratch/ymb-542-aa/pancreas/Genomes/hapmap_3.3.hg19.vcf to be VCF INFO 20:41:18,400 ArgumentTypeDescriptor - Dynamically determined type of /scratch/ymb-542-aa/pancreas/Genomes/1000G_omni2.5.hg19.vcf to be VCF INFO 20:41:18,404 ArgumentTypeDescriptor - Dynamically determined type of /scratch/ymb-542-aa/pancreas/Genomes/dbsnp_138.hg19_noXY.vcf to be VCF INFO 20:41:19,154 GenomeAnalysisEngine - Strictness is SILENT INFO 20:41:19,258 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000 INFO 20:41:19,313 RMDTrackBuilder - Loading Tribble index from disk for file /scratch/ymb-542-aa/pancreas/Calling/sub1_focus.haplo.vcf INFO 20:41:19,346 RMDTrackBuilder - Loading Tribble index from disk for file /scratch/ymb-542-aa/pancreas/Genomes/hapmap_3.3.hg19.vcf INFO 20:41:19,447 RMDTrackBuilder - Loading Tribble index from disk for file /scratch/ymb-542-aa/pancreas/Genomes/1000G_omni2.5.hg19.vcf INFO 20:41:19,473 RMDTrackBuilder - Loading Tribble index from disk for file /scratch/ymb-542-aa/pancreas/Genomes/dbsnp_138.hg19_noXY.vcf INFO 20:41:19,561 MicroScheduler - Running the GATK in parallel mode with 8 total threads, 1 CPU thread(s) for each of 8 data thread(s), of 8 processors available on this machine INFO 20:41:19,608 GenomeAnalysisEngine - Preparing for traversal INFO 20:41:19,622 GenomeAnalysisEngine - Done preparing for traversal
INFO 20:41:19,622 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING] INFO 20:41:19,622 ProgressMeter - Location processed.sites runtime per.1M.sites completed total.runtime remaining WARN 20:41:19,637 Utils - ******************************************************************************** WARN 20:41:19,637 Utils - * WARNING: WARN 20:41:19,637 Utils - * WARN 20:41:19,637 Utils - * Rscript not found in environment path. WARN 20:41:19,637 Utils - * /scratch/ymb-542-aa/pancreas/Calling/sub1_focus.snps.plots.R will be WARN 20:41:19,637 Utils - * generated but PDF plots will not. WARN 20:41:19,637 Utils - ******************************************************************************** INFO 20:41:19,640 TrainingSet - Found hapmap track: Known = false Training = true Truth = true Prior = Q15.0 INFO 20:41:19,640 TrainingSet - Found omni track: Known = false Training = true Truth = false Prior = Q10.0 INFO 20:41:19,641 TrainingSet - Found dbsnp track: Known = true Training = false Truth = false Prior = Q2.0 INFO 20:41:19,698 RMDTrackBuilder - Loading Tribble index from disk for file /scratch/ymb-542-aa/pancreas/Calling/sub1_focus.haplo.vcf INFO 20:41:19,705 RMDTrackBuilder - Loading Tribble index from disk for file /scratch/ymb-542-aa/pancreas/Calling/sub1_focus.haplo.vcf INFO 20:41:19,713 RMDTrackBuilder - Loading Tribble index from disk for file /scratch/ymb-542-aa/pancreas/Genomes/hapmap_3.3.hg19.vcf INFO 20:41:19,722 RMDTrackBuilder - Loading Tribble index from disk for file /scratch/ymb-542-aa/pancreas/Calling/sub1_focus.haplo.vcf INFO 20:41:19,726 RMDTrackBuilder - Loading Tribble index from disk for file /scratch/ymb-542-aa/pancreas/Genomes/hapmap_3.3.hg19.vcf INFO 20:41:19,736 RMDTrackBuilder - Loading Tribble index from disk for file /scratch/ymb-542-aa/pancreas/Calling/sub1_focus.haplo.vcf INFO 20:41:19,739 RMDTrackBuilder - Loading Tribble index from disk for file /scratch/ymb-542-aa/pancreas/Genomes/1000G_omni2.5.hg19.vcf INFO 20:41:19,747 
RMDTrackBuilder - Loading Tribble index from disk for file /scratch/ymb-542-aa/pancreas/Calling/sub1_focus.haplo.vcf INFO 20:41:19,751 RMDTrackBuilder - Loading Tribble index from disk for file /scratch/ymb-542-aa/pancreas/Genomes/1000G_omni2.5.hg19.vcf INFO 20:41:19,760 RMDTrackBuilder - Loading Tribble index from disk for file /scratch/ymb-542-aa/pancreas/Calling/sub1_focus.haplo.vcf INFO 20:41:19,763 RMDTrackBuilder - Loading Tribble index from disk for file /scratch/ymb-542-aa/pancreas/Genomes/hapmap_3.3.hg19.vcf INFO 20:41:19,772 RMDTrackBuilder - Loading Tribble index from disk for file /scratch/ymb-542-aa/pancreas/Calling/sub1_focus.haplo.vcf INFO 20:41:19,775 RMDTrackBuilder - Loading Tribble index from disk for file /scratch/ymb-542-aa/pancreas/Genomes/dbsnp_138.hg19_noXY.vcf INFO 20:41:19,845 RMDTrackBuilder - Loading Tribble index from disk for file /scratch/ymb-542-aa/pancreas/Genomes/hapmap_3.3.hg19.vcf INFO 20:41:19,856 RMDTrackBuilder - Loading Tribble index from disk for file /scratch/ymb-542-aa/pancreas/Genomes/dbsnp_138.hg19_noXY.vcf INFO 20:41:19,932 RMDTrackBuilder - Loading Tribble index from disk for file /scratch/ymb-542-aa/pancreas/Genomes/hapmap_3.3.hg19.vcf INFO 20:41:19,947 RMDTrackBuilder - Loading Tribble index from disk for file /scratch/ymb-542-aa/pancreas/Genomes/1000G_omni2.5.hg19.vcf INFO 20:41:21,428 RMDTrackBuilder - Loading Tribble index from disk for file /scratch/ymb-542-aa/pancreas/Genomes/hapmap_3.3.hg19.vcf INFO 20:41:21,437 RMDTrackBuilder - Loading Tribble index from disk for file /scratch/ymb-542-aa/pancreas/Genomes/dbsnp_138.hg19_noXY.vcf INFO 20:41:21,492 RMDTrackBuilder - Loading Tribble index from disk for file /scratch/ymb-542-aa/pancreas/Genomes/hapmap_3.3.hg19.vcf INFO 20:41:21,503 RMDTrackBuilder - Loading Tribble index from disk for file /scratch/ymb-542-aa/pancreas/Genomes/1000G_omni2.5.hg19.vcf INFO 20:41:21,514 RMDTrackBuilder - Loading Tribble index from disk for file 
/scratch/ymb-542-aa/pancreas/Genomes/1000G_omni2.5.hg19.vcf INFO 20:41:21,526 RMDTrackBuilder - Loading Tribble index from disk for file /scratch/ymb-542-aa/pancreas/Genomes/1000G_omni2.5.hg19.vcf INFO 20:41:21,538 RMDTrackBuilder - Loading Tribble index from disk for file /scratch/ymb-542-aa/pancreas/Genomes/dbsnp_138.hg19_noXY.vcf INFO 20:41:21,593 RMDTrackBuilder - Loading Tribble index from disk for file /scratch/ymb-542-aa/pancreas/Genomes/1000G_omni2.5.hg19.vcf INFO 20:41:21,605 RMDTrackBuilder - Loading Tribble index from disk for file /scratch/ymb-542-aa/pancreas/Genomes/dbsnp_138.hg19_noXY.vcf INFO 20:41:21,667 RMDTrackBuilder - Loading Tribble index from disk for file /scratch/ymb-542-aa/pancreas/Genomes/dbsnp_138.hg19_noXY.vcf INFO 20:41:21,859 RMDTrackBuilder - Loading Tribble index from disk for file /scratch/ymb-542-aa/pancreas/Genomes/dbsnp_138.hg19_noXY.vcf INFO 20:41:49,625 ProgressMeter - chr6:33023946 2.48e+07 30.0 s 1.0 s 34.9% 85.0 s 55.0 s INFO 20:42:19,626 ProgressMeter - chr14:38996406 5.30e+07 60.0 s 1.0 s 76.3% 78.0 s 18.0 s INFO 20:42:36,649 VariantDataManager - QD: mean = 16.83 standard deviation = 7.62 INFO 20:42:36,653 VariantDataManager - DP: mean = 19.65 standard deviation = 12.45 INFO 20:42:36,658 VariantDataManager - FS: mean = 0.41 standard deviation = 1.14 INFO 20:42:36,660 VariantDataManager - MQ: mean = 40.98 standard deviation = 2.71 INFO 20:42:36,718 VariantDataManager - Annotations are now ordered by their information content: [MQ, DP, QD, FS] INFO 20:42:36,720 VariantDataManager - Training with 5929 variants after standard deviation thresholding. INFO 20:42:36,729 GaussianMixtureModel - Initializing model with 100 k-means iterations... INFO 20:42:37,020 VariantRecalibratorEngine - Finished iteration 0. INFO 20:42:37,163 VariantRecalibratorEngine - Finished iteration 5. Current change in mixture coefficients = 0.32548 INFO 20:42:37,262 VariantRecalibratorEngine - Finished iteration 10. 
Current change in mixture coefficients = 0.40180 INFO 20:42:37,360 VariantRecalibratorEngine - Finished iteration 15. Current change in mixture coefficients = 0.24424 INFO 20:42:37,459 VariantRecalibratorEngine - Finished iteration 20. Current change in mixture coefficients = 0.30560 INFO 20:42:37,557 VariantRecalibratorEngine - Finished iteration 25. Current change in mixture coefficients = 0.07436 INFO 20:42:37,655 VariantRecalibratorEngine - Finished iteration 30. Current change in mixture coefficients = 0.02595 INFO 20:42:37,753 VariantRecalibratorEngine - Finished iteration 35. Current change in mixture coefficients = 0.01395 INFO 20:42:37,851 VariantRecalibratorEngine - Finished iteration 40. Current change in mixture coefficients = 0.01163 INFO 20:42:37,948 VariantRecalibratorEngine - Finished iteration 45. Current change in mixture coefficients = 0.01117 INFO 20:42:38,046 VariantRecalibratorEngine - Finished iteration 50. Current change in mixture coefficients = 0.02233 INFO 20:42:38,144 VariantRecalibratorEngine - Finished iteration 55. Current change in mixture coefficients = 0.01832 INFO 20:42:38,242 VariantRecalibratorEngine - Finished iteration 60. Current change in mixture coefficients = 0.01221 INFO 20:42:38,340 VariantRecalibratorEngine - Finished iteration 65. Current change in mixture coefficients = 0.01370 INFO 20:42:38,438 VariantRecalibratorEngine - Finished iteration 70. Current change in mixture coefficients = 0.04717 INFO 20:42:38,536 VariantRecalibratorEngine - Finished iteration 75. Current change in mixture coefficients = 0.01022 INFO 20:42:38,634 VariantRecalibratorEngine - Finished iteration 80. Current change in mixture coefficients = 0.00365 INFO 20:42:38,731 VariantRecalibratorEngine - Finished iteration 85. Current change in mixture coefficients = 0.00875 INFO 20:42:38,829 VariantRecalibratorEngine - Finished iteration 90. 
Current change in mixture coefficients = 0.00391 INFO 20:42:38,927 VariantRecalibratorEngine - Finished iteration 95. Current change in mixture coefficients = 0.00411 INFO 20:42:39,005 VariantRecalibratorEngine - Convergence after 99 iterations! INFO 20:42:39,037 VariantRecalibratorEngine - Evaluating full set of 12403 variants... INFO 20:42:39,038 VariantDataManager - Training with worst 0 scoring variants --> variants with LOD <= -5.0000.

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

org.broadinstitute.sting.utils.exceptions.ReviewedStingException: Unable to retrieve result
    at org.broadinstitute.sting.gatk.executive.HierarchicalMicroScheduler.execute(HierarchicalMicroScheduler.java:190)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:313)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:245)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:152)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:91)
Caused by: java.lang.NullPointerException
    at org.broadinstitute.sting.gatk.walkers.variantrecalibration.VariantRecalibratorEngine.generateModel(VariantRecalibratorEngine.java:83)
    at org.broadinstitute.sting.gatk.walkers.variantrecalibration.VariantRecalibrator.onTraversalDone(VariantRecalibrator.java:359)
    at org.broadinstitute.sting.gatk.walkers.variantrecalibration.VariantRecalibrator.onTraversalDone(VariantRecalibrator.java:139)
    at org.broadinstitute.sting.gatk.executive.HierarchicalMicroScheduler.notifyTraversalDone(HierarchicalMicroScheduler.java:226)
    at org.broadinstitute.sting.gatk.executive.HierarchicalMicroScheduler.execute(HierarchicalMicroScheduler.java:183)
    ... 5 more

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 2.8-1-g932cd3a):
ERROR
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: Unable to retrieve result
ERROR ------------------------------------------------------------------------------------------

`


Created 2014-03-12 15:22:35 | Updated | Tags: haplotypecaller error rnaseq
Comments (10)

Hi,

I was trying to call variants in RNAseq data using GATK 3.0 when I got the following stack trace:

##### ERROR ------------------------------------------------------------------------------------------
##### ERROR stack trace 
java.lang.NullPointerException
        at org.broadinstitute.sting.gatk.walkers.haplotypecaller.PairHMMLikelihoodCalculationEngine.computeDiploidHaplotypeLikelihoods(PairHMMLikelihoodCalculationEngine.java:421)
        at org.broadinstitute.sting.gatk.walkers.haplotypecaller.PairHMMLikelihoodCalculationEngine.computeDiploidHaplotypeLikelihoods(PairHMMLikelihoodCalculationEngine.java:395)
        at org.broadinstitute.sting.gatk.walkers.haplotypecaller.GenotypingEngine.calculateGLsForThisEvent(GenotypingEngine.java:385)
        at org.broadinstitute.sting.gatk.walkers.haplotypecaller.GenotypingEngine.assignGenotypeLikelihoods(GenotypingEngine.java:222)
        at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:872)
        at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:141)
        at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:708)
        at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:704)
        at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler$ReadMapReduceJob.run(NanoScheduler.java:471)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A GATK RUNTIME ERROR has occurred (version nightly-2014-03-10-gf78001a):
##### ERROR
##### ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
##### ERROR If not, please post the error message, with stack trace, to the GATK forum.
##### ERROR Visit our website and forum for extensive documentation and answers to 
##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
##### ERROR
##### ERROR MESSAGE: Code exception (see stack trace for error itself)
##### ERROR ------------------------------------------------------------------------------------------

Here are the command line arguments:

Program Args: -T HaplotypeCaller -I in.bam -R ref.fa -o raw.snps.indels.vcf -nct 8 -recoverDanglingHeads -dontUseSoftClippedBases -stand_call_conf 20 -stand_emit_conf 20

As you can see, I got the error above from one of the nightly builds. Before that I also tried version 3.0-0-g6bad1c6, and this produced the exact same error. What's curious about this is that it didn't fail in the same place each time. I did this on 20 samples, and for the first run, 15 of the samples failed with this error. One of the samples failed after 7 minutes, so I decided to try that one again to see if I could reproduce it, but it went past the point (both in time and genomic position) where it failed the first time.

I decided to download a nightly build (version nightly-2014-03-10-gf78001a) and see if this had been fixed, but again, 15 of the samples failed. However, it was not the same set of samples that failed as with the other version.

The reads were aligned using STAR, and prior to this step I ran SplitNCigarReads and IndelRealigner.

Thanks, Niklas


Created 2014-02-11 18:50:32 | Updated | Tags: variantrecalibrator error
Comments (19)

Hi,

I'm now at the VQSR step of the best practices and to my surprise I got the following error related to java (I think):

Error: Could not find or load main class –jar

Here is my command line:

java –jar GenomeAnalysisTK.jar –T VariantRecalibrator –R ../human_g1k_v37.fasta –input ../raw_variants.vcf -resource:hapmap,known=false,training=true,truth=true,prior=15.0 ../hapmap_3.3.b37.vcf -resource:omni,known=false,training=true,truth=false,prior=12.0 ../1000G_omni2.5.b37.vcf -resource:dbsnp,known=true,training=false,truth=false,prior=2.0 ../dbsnp_138.b37.vcf -resource:1000G,known=false,training=true,truth=false,prior=10.0 ../1000G_phase1.snps.high_confidence.b37.vcf -an QD -an MQRankSum -an ReadPosRankSum -an FS --maxGaussians 4 –mode SNP -tranche 100.0 -tranche 99.9 -tranche 99.0 -tranche 90.0 –recalFile raw.SNPs.recal –tranchesFile raw.SNPs.tranches -rscriptFile recal.plots.R

I don't understand what the problem is. Could someone look at it and tell me whether I made a mistake in my command line?

Thank you for your support
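
One detail stands out in the command as posted: several options (`–jar`, `–T`, `–R`, `–input`, `–mode`, `–recalFile`, `–tranchesFile`) begin with an en dash (U+2013) rather than a plain ASCII hyphen, which often happens when a command is copied out of a word processor or a web page. Java only treats arguments starting with `-` as options, so `–jar` would be read as the name of the main class, which is consistent with the error text. A minimal Python sketch for spotting such characters follows; the function name `find_non_ascii_dashes` is made up for illustration.

```python
import shlex

# Dash-like characters that are easily mistaken for an ASCII hyphen-minus.
LOOKALIKE_DASHES = {
    "\u2010": "HYPHEN",
    "\u2011": "NON-BREAKING HYPHEN",
    "\u2012": "FIGURE DASH",
    "\u2013": "EN DASH",
    "\u2014": "EM DASH",
    "\u2212": "MINUS SIGN",
}


def find_non_ascii_dashes(command):
    """Return (token, dash name) pairs for tokens that start with a look-alike dash."""
    hits = []
    for token in shlex.split(command):
        if token and token[0] in LOOKALIKE_DASHES:
            hits.append((token, LOOKALIKE_DASHES[token[0]]))
    return hits


if __name__ == "__main__":
    # Shortened version of the posted command, with the en dashes written as escapes.
    cmd = "java \u2013jar GenomeAnalysisTK.jar \u2013T VariantRecalibrator -an QD"
    for token, name in find_non_ascii_dashes(cmd):
        print(f"{token!r} starts with {name}; retype it with a plain '-'")
```

Retyping the flagged options by hand in a terminal or plain-text editor is usually enough to get rid of the look-alike characters.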


Created 2014-02-05 11:33:30 | Updated | Tags: indelrealigner unifiedgenotyper java error
Comments (1)

Hi all,

We are running into some random weirdness when running jobs using SGE with GATK version 2.7-2-g6bda569. It affects pretty much all GATK tools, but mostly IndelRealigner and UnifiedGenotyper. We often get the following error:

ERROR MESSAGE: Couldn't read file /scratch/project/pipelines/novorecal.bam because java.io.FileNotFoundException: /scratch/project/pipelines/novorecal.bam (No such file or directory)

This also happens for supplied reference genomes and VCF files: the GATK tool can't find them.

These "missing" files do exist, and have often even been created by the previous tool/step in the pipeline.

When we re-run the pipeline on a failed sample, it works. So we end up having to re-run our pipeline on the same set of samples multiple times, and we are beginning to find this very frustrating. The errors seem to be random; I can't find any pattern and, as I mentioned, when we re-run the pipeline on a failed run, it works without a hitch.

Has anyone experienced this? And if so, any recommendations?

Please help

Steve
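
One possible explanation, offered purely as an assumption, is that the compute node's view of the shared filesystem briefly lags behind the node that wrote the file. If that is what is happening, a pipeline wrapper could wait for each input to become visible before launching the next tool. The sketch below shows the idea; `wait_for_file` and its default timeouts are illustrative choices, not part of GATK or SGE.

```python
import os
import time


def wait_for_file(path, timeout=60.0, poll=2.0):
    """Poll until `path` exists and is non-empty, or until `timeout` seconds pass.

    Returns True if the file became visible in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            if os.stat(path).st_size > 0:
                return True
        except FileNotFoundError:
            pass  # not visible on this node yet
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

A wrapper would call `wait_for_file("/scratch/project/pipelines/novorecal.bam")` before starting the GATK step and fail with a clear message if it returns False. This only works around a visibility delay; it does not help if the file is genuinely missing.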


Created 2013-12-20 15:12:59 | Updated 2013-12-20 15:53:02 | Tags: dependencies error build
Comments (4)

Hi, this took me a while to debug, so I'm posting the solution here. I started by downloading a clean copy of GATK core platform from GitHub. When I first tried building by running ant, I got the compiler errors below. The reason turned out to be that an unrelated jar (gsea2-2.0.12.jar) was on my CLASSPATH (this is from another Broad tool I've been using - Gene Set Enrichment Analysis). gsea2-2.0.12.jar apparently contains outdated versions of apache math and io packages which conflict with the GATK versions. Taking this jar off my CLASSPATH fixed the issue.

-Ben

PS: the compiler errors were:

gatk.compile.internal.source:
    [javac] Compiling 681 source files to /prog/GATK/gatk_platform_git/build/java/classes
    [javac] /prog/GATK/gatk_platform_git/public/java/src/org/broadinstitute/sting/commandline/ParsingEngine.java:260: error: incompatible types
    [javac]         for (String line: FileUtils.readLines(file))
    [javac]                                              ^
    [javac]   required: String
    [javac]   found:    Object
    [javac] /prog/GATK/gatk_platform_git/public/java/src/org/broadinstitute/sting/utils/MannWhitneyU.java:50: error: no suitable constructor found for NormalDistributionImpl(double,double,double)
    [javac]     private static NormalDistribution APACHE_NORMAL = new NormalDistributionImpl(0.0,1.0,1e-2);
    [javac]                                                       ^
    [javac]     constructor NormalDistributionImpl.NormalDistributionImpl() is not applicable
    [javac]       (actual and formal argument lists differ in length)
    [javac]     constructor NormalDistributionImpl.NormalDistributionImpl(double,double) is not applicable
    [javac]       (actual and formal argument lists differ in length)
    [javac] Note: Some input files use or override a deprecated API.
    [javac] Note: Recompile with -Xlint:deprecation for details.
    [javac] Note: Some input files use unchecked or unsafe operations.
    [javac] Note: Recompile with -Xlint:unchecked for details.
    [javac] 2 errors

BUILD FAILED
/prog/GATK/gatk_platform_git/build.xml:454: Compile failed; see the compiler error output for details.
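
To track down this kind of conflict faster, it can help to list which jars on the CLASSPATH ship classes from the packages named in the compiler errors. The following Python sketch does that; the function name `jars_providing` is hypothetical, and the two package prefixes are taken from the classes in the errors above (`FileUtils` from Apache Commons IO, `NormalDistributionImpl` from Apache Commons Math).

```python
import os
import zipfile


def jars_providing(package_prefixes, classpath=None):
    """Map each package prefix to the CLASSPATH jars that contain classes under it."""
    if classpath is None:
        classpath = os.environ.get("CLASSPATH", "")
    found = {prefix: [] for prefix in package_prefixes}
    for entry in classpath.split(os.pathsep):
        if not entry.endswith(".jar") or not os.path.isfile(entry):
            continue  # skip directories, wildcards and missing files
        try:
            with zipfile.ZipFile(entry) as jar:
                names = jar.namelist()
        except zipfile.BadZipFile:
            continue  # not a readable jar
        for prefix in package_prefixes:
            if any(n.startswith(prefix) and n.endswith(".class") for n in names):
                found[prefix].append(entry)
    return found


if __name__ == "__main__":
    prefixes = ["org/apache/commons/math/", "org/apache/commons/io/"]
    for prefix, jars in jars_providing(prefixes).items():
        print(prefix, "->", jars or "not on CLASSPATH")
```

Any jar reported here that is not part of the GATK build itself is a candidate for removal from the CLASSPATH before running ant.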

Created 2013-11-08 10:45:37 | Updated | Tags: reducereads error picard
Comments (8)

Hi guys, I've seen that this error has been reported before, for different reasons. The thing is, the BAM file I'm trying to reduce has been processed through the GATK pipeline without problems, realignment and recalibration included. Therefore, I assumed the BAM file generated after BQSR would be GATK-compliant. I was running with Queue, so here I just ran the exact command passed to the job in interactive mode, to see what happens.

Below is the full command and error message (apologies for the lengthy output); there is no stack trace after the error.

        [fles@login07 reduced]$ 'java'  '-Xmx12288m'  '-Djava.io.tmpdir=/scratch/scratch/fles/project_analysis/reduced/tmp'  '-cp' '/home/fles/applications/Queue-2.7-4-g6f46d11/Queue.jar'  'org.broadinstitute.sting.gatk.CommandLineGATK'  '-T' 'ReduceReads'  '-I' '/home/fles/Scratch/project_analysis/recalibrated/projectTrios.U1_PJ5208467.clean.dedup.recal.bam'  '-R' '/home/fles/Scratch/gatkbundle_2.5/human_g1k_v37.fasta'  '-o' '/scratch/scratch/fles/project_analysis/reduced/projectTrios.U1_PJ5208467.recal.reduced.bam'
        INFO  09:27:21,728 HelpFormatter - -------------------------------------------------------------------------------- 
        INFO  09:27:21,730 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.7-4-g6f46d11, Compiled 2013/10/10 17:29:52 
        INFO  09:27:21,731 HelpFormatter - Copyright (c) 2010 The Broad Institute 
        INFO  09:27:21,731 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk 
        INFO  09:27:21,735 HelpFormatter - Program Args: -T ReduceReads -I /home/fles/Scratch/project_analysis/recalibrated/projectTrios.U1_PJ5208467.clean.dedup.recal.bam -R /home/fles/Scratch/gatkbundle_2.5/human_g1k_v37.fasta -o /scratch/scratch/fles/project_analysis/reduced/projectTrios.U1_PJ5208467.recal.reduced.bam 
        INFO  09:27:21,735 HelpFormatter - Date/Time: 2013/11/08 09:27:21 
        INFO  09:27:21,735 HelpFormatter - -------------------------------------------------------------------------------- 
        INFO  09:27:21,735 HelpFormatter - -------------------------------------------------------------------------------- 
        INFO  09:27:34,156 GenomeAnalysisEngine - Strictness is SILENT 
        INFO  09:27:34,491 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 40 
        INFO  09:27:34,503 SAMDataSource$SAMReaders - Initializing SAMRecords in serial 
        INFO  09:27:34,627 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.12 
        INFO  09:27:35,039 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files 
        INFO  09:27:35,045 GenomeAnalysisEngine - Done preparing for traversal 
        INFO  09:27:35,045 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING] 
        INFO  09:27:35,046 ProgressMeter -        Location processed.reads  runtime per.1M.reads completed total.runtime remaining 
        INFO  09:27:35,080 ReadShardBalancer$1 - Loading BAM index data 
        INFO  09:27:35,081 ReadShardBalancer$1 - Done loading BAM index data 
        INFO  09:28:05,059 ProgressMeter -      1:18958138        1.00e+06   30.0 s       30.0 s      0.6%        81.8 m    81.3 m 
        INFO  09:28:35,069 ProgressMeter -      1:46733396        2.30e+06   60.0 s       26.0 s      1.5%        66.4 m    65.4 m 
        INFO  09:29:05,079 ProgressMeter -      1:92187730        3.50e+06   90.0 s       25.0 s      3.0%        50.5 m    49.0 m 
        INFO  09:29:35,088 ProgressMeter -     1:145281942        4.90e+06  120.0 s       24.0 s      4.7%        42.7 m    40.7 m 
        INFO  09:30:05,098 ProgressMeter -     1:152323864        6.40e+06    2.5 m       23.0 s      4.9%        50.9 m    48.4 m 
        INFO  09:30:35,893 ProgressMeter -     1:181206886        7.70e+06    3.0 m       23.0 s      5.8%        51.4 m    48.4 m 
        INFO  09:31:05,902 ProgressMeter -     1:217604563        8.90e+06    3.5 m       23.0 s      7.0%        49.9 m    46.4 m 
        INFO  09:31:35,913 ProgressMeter -      2:14782401        1.02e+07    4.0 m       23.0 s      8.5%        47.0 m    43.0 m 
        INFO  09:32:05,922 ProgressMeter -      2:62429207        1.15e+07    4.5 m       23.0 s     10.0%        44.8 m    40.3 m 
        INFO  09:32:35,931 ProgressMeter -      2:97877374        1.28e+07    5.0 m       23.0 s     11.2%        44.7 m    39.7 m 
        INFO  09:33:06,218 ProgressMeter -     2:135574018        1.42e+07    5.5 m       23.0 s     12.4%        44.5 m    38.9 m 
        INFO  09:33:36,227 ProgressMeter -     2:179431307        1.56e+07    6.0 m       23.0 s     13.8%        43.5 m    37.5 m 
        INFO  09:34:06,237 ProgressMeter -     2:216279690        1.69e+07    6.5 m       23.0 s     15.0%        43.4 m    36.9 m 
        INFO  09:34:36,248 ProgressMeter -      3:14974731        1.81e+07    7.0 m       23.0 s     16.4%        42.9 m    35.9 m 
        INFO  09:35:07,073 ProgressMeter -      3:52443620        1.94e+07    7.5 m       23.0 s     17.6%        42.9 m    35.4 m 
        INFO  09:35:37,084 ProgressMeter -     3:111366536        2.05e+07    8.0 m       23.0 s     19.5%        41.3 m    33.2 m 
        INFO  09:36:07,094 ProgressMeter -     3:155571144        2.18e+07    8.5 m       23.0 s     20.9%        40.8 m    32.3 m 
        INFO  09:36:37,103 ProgressMeter -       4:3495327        2.31e+07    9.0 m       23.0 s     22.4%        40.4 m    31.3 m 
        INFO  09:37:07,114 ProgressMeter -      4:48178306        2.43e+07    9.5 m       23.0 s     23.8%        40.0 m    30.5 m 
        INFO  09:37:37,270 ProgressMeter -     4:106747046        2.56e+07   10.0 m       23.0 s     25.7%        39.0 m    29.0 m 
        INFO  09:38:07,483 ProgressMeter -     4:181303657        2.69e+07   10.5 m       23.0 s     28.1%        37.5 m    26.9 m 
        INFO  09:38:37,493 ProgressMeter -      5:41149454        2.81e+07   11.0 m       23.0 s     29.7%        37.1 m    26.1 m 
        INFO  09:38:51,094 GATKRunReport - Uploaded run statistics report to AWS S3 
        ##### ERROR ------------------------------------------------------------------------------------------
        ##### ERROR A USER ERROR has occurred (version 2.7-4-g6f46d11): 
        ##### ERROR
        ##### ERROR This means that one or more arguments or inputs in your command are incorrect.
        ##### ERROR The error message below tells you what is the problem.
        ##### ERROR
        ##### ERROR If the problem is an invalid argument, please check the online documentation guide
        ##### ERROR (or rerun your command with --help) to view allowable command-line arguments for this tool.
        ##### ERROR
        ##### ERROR Visit our website and forum for extensive documentation and answers to 
        ##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
        ##### ERROR
        ##### ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself.
        ##### ERROR
        ##### ERROR MESSAGE: SAM/BAM file /home/fles/Scratch/project_analysis/recalibrated/projectTrios.U1_PJ5208467.clean.dedup.recal.bam is malformed: Read error; BinaryCodec in readmode; file: /home/fles/Scratch/project_analysis/recalibrated/projectTrios.U1_PJ5208467.clean.dedup.recal.bam
        ##### ERROR ------------------------------------------------------------------------------------------

Following your usual advice, I validated the BAM file produced by BQSR with Picard, and I get the exact same error, with no more specific indication of what is wrong:

        [fles@login07 reduced]$ java -jar ~/applications/picard-tools-1.102/ValidateSamFile.jar \
        > INPUT=/home/fles/Scratch/project_analysis/recalibrated/projectTrios.U1_PJ5208467.clean.dedup.recal.bam \
        > IGNORE_WARNINGS=TRUE
        [Fri Nov 08 09:59:42 GMT 2013] net.sf.picard.sam.ValidateSamFile INPUT=/home/fles/Scratch/project_analysis/recalibrated/projectTrios.U1_PJ5208467.clean.dedup.recal.bam IGNORE_WARNINGS=true    MAX_OUTPUT=100 VALIDATE_INDEX=true IS_BISULFITE_SEQUENCED=false MAX_OPEN_TEMP_FILES=8000 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
        [Fri Nov 08 09:59:42 GMT 2013] Executing as fles@login07 on Linux 2.6.18-194.11.4.el5 amd64; Java HotSpot(TM) 64-Bit Server VM 1.7.0_45-b18; Picard version: 1.102(1591)
        INFO    2013-11-08 10:01:01 SamFileValidator    Validated Read    10,000,000 records.  Elapsed time: 00:01:18s.  Time for last 10,000,000:   78s.  Last read position: 1:204,966,172
        INFO    2013-11-08 10:02:19 SamFileValidator    Validated Read    20,000,000 records.  Elapsed time: 00:02:36s.  Time for last 10,000,000:   78s.  Last read position: 2:232,121,396
        INFO    2013-11-08 10:03:36 SamFileValidator    Validated Read    30,000,000 records.  Elapsed time: 00:03:54s.  Time for last 10,000,000:   77s.  Last read position: 4:123,140,629
        [Fri Nov 08 10:04:00 GMT 2013] net.sf.picard.sam.ValidateSamFile done. Elapsed time: 4.30 minutes.
        Runtime.totalMemory()=300941312
        To get help, see http://picard.sourceforge.net/index.shtml#GettingHelp
        Exception in thread "main" net.sf.samtools.util.RuntimeIOException: Read error; BinaryCodec in readmode; file: /home/fles/Scratch/project_analysis/recalibrated/projectTrios.U1_PJ5208467.clean.dedup.recal.bam
            at net.sf.samtools.util.BinaryCodec.readBytesOrFewer(BinaryCodec.java:397)
            at net.sf.samtools.util.BinaryCodec.readBytes(BinaryCodec.java:371)
            at net.sf.samtools.util.BinaryCodec.readBytes(BinaryCodec.java:357)
            at net.sf.samtools.BAMRecordCodec.decode(BAMRecordCodec.java:200)
            at net.sf.samtools.BAMFileReader$BAMFileIterator.getNextRecord(BAMFileReader.java:558)
            at net.sf.samtools.BAMFileReader$BAMFileIterator.advance(BAMFileReader.java:532)
            at net.sf.samtools.BAMFileReader$BAMFileIterator.next(BAMFileReader.java:522)
            at net.sf.samtools.BAMFileReader$BAMFileIterator.next(BAMFileReader.java:481)
            at net.sf.samtools.SAMFileReader$AssertableIterator.next(SAMFileReader.java:687)
            at net.sf.samtools.SAMFileReader$AssertableIterator.next(SAMFileReader.java:665)
            at net.sf.picard.sam.SamFileValidator.validateSamRecordsAndQualityFormat(SamFileValidator.java:241)
            at net.sf.picard.sam.SamFileValidator.validateSamFile(SamFileValidator.java:177)
            at net.sf.picard.sam.SamFileValidator.validateSamFileSummary(SamFileValidator.java:104)
            at net.sf.picard.sam.ValidateSamFile.doWork(ValidateSamFile.java:164)
            at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:177)
            at net.sf.picard.sam.ValidateSamFile.main(ValidateSamFile.java:100)
        Caused by: java.io.IOException: Unexpected compressed block length: 1
            at net.sf.samtools.util.BlockCompressedInputStream.readBlock(BlockCompressedInputStream.java:358)
            at net.sf.samtools.util.BlockCompressedInputStream.available(BlockCompressedInputStream.java:113)
            at net.sf.samtools.util.BlockCompressedInputStream.read(BlockCompressedInputStream.java:238)
            at java.io.DataInputStream.read(DataInputStream.java:149)
            at net.sf.samtools.util.BinaryCodec.readBytesOrFewer(BinaryCodec.java:395)

Any suggestions on what I might be doing wrong?
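
The "Unexpected compressed block length" in the Picard trace suggests a damaged or incomplete BGZF block somewhere in the file. One cheap check, sketched below in Python, is whether the file still ends with the 28-byte empty BGZF block that the SAM/BAM specification defines as the end-of-file marker. The function name `has_bgzf_eof` is made up for illustration. Note the limitation: a missing marker indicates a truncated file, but a present marker does not rule out a corrupted block in the middle, which is what the failure position here (part-way through the genome) may point to.

```python
# 28-byte empty BGZF block that terminates a complete BAM file (SAM/BAM specification).
BGZF_EOF = bytes.fromhex(
    "1f8b0804"   # gzip magic, deflate, FEXTRA flag
    "00000000"   # modification time
    "00ff"       # extra flags, OS
    "0600"       # length of the extra field
    "42430200"   # 'B', 'C', subfield length 2
    "1b00"       # block size minus one
    "0300"       # empty deflate payload
    "00000000"   # CRC32 of empty data
    "00000000"   # uncompressed size 0
)


def has_bgzf_eof(path):
    """Return True if the file ends with the BGZF end-of-file marker."""
    with open(path, "rb") as fh:
        fh.seek(0, 2)               # jump to the end to learn the file size
        if fh.tell() < len(BGZF_EOF):
            return False
        fh.seek(-len(BGZF_EOF), 2)  # read only the last 28 bytes
        return fh.read() == BGZF_EOF
```

If the marker is missing, the step that wrote the BAM (here the post-BQSR output) most likely did not finish cleanly, for example because of a full disk or a killed job, and regenerating the file is the usual remedy.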


Created 2013-09-30 10:15:57 | Updated | Tags: haplotypecaller error
Comments (21)

Dear All,

I've encountered the following error while processing one of the regions from an interval file that I want to re-discover/genotype with the HC. Note that I've processed the other 4.2mln regions without any problems. A quick search on the forum did not lead to any results. Let me know if you'd like more information!

Command:

        ~/tools/jdk1.7.0_25/bin/java -Xmx8g \
            -jar ~/tools/GenomeAnalysisTK-2.7-2-g6bda569/GenomeAnalysisTK.jar \
            -T HaplotypeCaller \
            -L ~/gonl/projects/SV/ug/gonl.union_pindel_ug_clever.sites.2.vcf.gz \
            -L 1:237759920-238000001 \
            -nct 6 \
            -isr INTERSECTION \
            -o ~/results/trio-analysis/hc/1_237759920-238000001.vcf \
            -R /target/gpfs2/gcc/resources/hg19/indices/human_g1k_v37.fa \
            -I ~/gonl/projects/trio-analysis/resources/bqsr2.bams.list \
            -XL ~/gonl/projects/accessibleGenome/results/ALL.accessible.out.mask.intervals \
            -minPruning 5 \
            2>&1 | tee /target/gpfs2/gcc/home/lfrancioli/logs/trio-analysis/hc/1_237759920-238000001.out

Error:

##### ERROR ------------------------------------------------------------------------------------------
##### ERROR stack trace
java.lang.IllegalStateException: PairHMM Log Probability cannot be greater than 0: haplotype: [84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84], read: [84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84], result: 0.002992
    at org.broadinstitute.sting.utils.pairhmm.PairHMM.computeReadLikelihoodGivenHaplotypeLog10(PairHMM.java:131)
    at org.broadinstitute.sting.gatk.walkers.haplotypecaller.LikelihoodCalculationEngine.computeReadLikelihoods(LikelihoodCalculationEngine.java:262)
    at org.broadinstitute.sting.gatk.walkers.haplotypecaller.LikelihoodCalculationEngine.computeReadLikelihoods(LikelihoodCalculationEngine.java:205)
    at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:770)
    at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:140)
    at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:708)
    at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:704)
    at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:274)
    at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245)
    at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.java:273)
    at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.java:78)
    at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:99)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:313)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:245)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:152)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:91)
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A GATK RUNTIME ERROR has occurred (version 2.7-2-g6bda569):
##### ERROR
##### ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
##### ERROR If not, please post the error message, with stack trace, to the GATK forum.
##### ERROR Visit our website and forum for extensive documentation and answers to
##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
##### ERROR
##### ERROR MESSAGE: PairHMM Log Probability cannot be greater than 0: haplotype: [84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84], read: [84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84], result: 0.002992
##### ERROR ------------------------------------------------------------------------------------------
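
The numbers in the message are easier to interpret once decoded. Read as ASCII codes, 84 is the letter T, so both the haplotype and the read consist only of T bases, which suggests the failing site is a homopolymer run. The `result` value is a log10 likelihood, so anything above 0 corresponds to a probability above 1, which is what the check rejects. A short Python sketch of both conversions follows; the helper names are made up for illustration.

```python
def decode_bases(byte_values):
    """Turn a list of ASCII codes, as printed in the error message, into a base string."""
    return bytes(byte_values).decode("ascii")


def log10_to_probability(log10_p):
    """Convert a log10 likelihood into a plain probability."""
    return 10.0 ** log10_p


if __name__ == "__main__":
    read = [84] * 15  # the read array from the message: every value is 84
    print(decode_bases(read))
    print(log10_to_probability(0.002992))
```

The reported value is only slightly above zero, so one plausible reading (an assumption, not a diagnosis) is a numerical precision issue in the likelihood calculation for this repetitive context rather than a problem with the input data.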

Created 2013-09-11 16:54:10 | Updated | Tags: vcf error
Comments (1)

I am trying to lift over from NIST b37 to hg19. I have all the files I need and I can kick off the liftover just fine, but I keep running into problems because the NIST VCF has tags in the INFO field of its variant lines that are not defined in the header:

##### ERROR MESSAGE: Key PLHSWG found in VariantContext field INFO at chr1:52238 but this key isn't defined in the VCFHeader. We require all VCFs to have complete VCF headers by default

I identified about 90 tags that are not properly documented in the header. Is there a way to ignore all of these INFO header lapses?
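
Whatever the eventual fix, a complete list of the undeclared keys is a useful starting point, for example for writing the missing `##INFO` header lines. The Python sketch below collects every INFO key used in the records and subtracts the keys declared in the header. It assumes an uncompressed, tab-delimited VCF, and `undeclared_info_keys` is a hypothetical helper name.

```python
import re

INFO_HEADER = re.compile(r"^##INFO=<ID=([^,>]+)")


def undeclared_info_keys(lines):
    """Return the INFO keys that appear in records but have no ##INFO header line."""
    declared, used = set(), set()
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##"):
            match = INFO_HEADER.match(line)
            if match:
                declared.add(match.group(1))
        elif line and not line.startswith("#"):
            fields = line.split("\t")
            # Column 8 (index 7) is INFO; "." means the field is empty.
            if len(fields) > 7 and fields[7] != ".":
                for item in fields[7].split(";"):
                    if item:
                        used.add(item.split("=", 1)[0])
    return sorted(used - declared)
```

Usage would be along the lines of `with open("nist.vcf") as fh: print(undeclared_info_keys(fh))`, where `nist.vcf` stands in for the actual file name.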


Created 2013-08-26 12:52:23 | Updated | Tags: baserecalibrator error buffer
Comments (7)

Hi, I get an error at the BaseRecalibrator step. Even after increasing the memory allocated to the job, I still get the same error, and I found nothing about it in previously published posts:

##### ERROR ------------------------------------------------------------------------------------------
##### ERROR stack trace
org.broadinstitute.sting.utils.exceptions.ReviewedStingException: Insufficient buffer size for Xs overhanging genome -- expand BUFFER
    at org.broadinstitute.sting.gatk.datasources.providers.ReferenceView.getReferenceBases(ReferenceView.java:121)
    at org.broadinstitute.sting.gatk.datasources.providers.ReadReferenceView$Provider.getBases(ReadReferenceView.java:87)
    at org.broadinstitute.sting.gatk.contexts.ReferenceContext.fetchBasesFromProvider(ReferenceContext.java:145)
    at org.broadinstitute.sting.gatk.contexts.ReferenceContext.getBases(ReferenceContext.java:189)
    at org.broadinstitute.sting.gatk.walkers.bqsr.BaseRecalibrator.calculateIsSNP(BaseRecalibrator.java:335)
    at org.broadinstitute.sting.gatk.walkers.bqsr.BaseRecalibrator.map(BaseRecalibrator.java:253)
    at org.broadinstitute.sting.gatk.walkers.bqsr.BaseRecalibrator.map(BaseRecalibrator.java:132)
    at org.broadinstitute.sting.gatk.traversals.TraverseReadsNano$TraverseReadsMap.apply(TraverseReadsNano.java:228)
    at org.broadinstitute.sting.gatk.traversals.TraverseReadsNano$TraverseReadsMap.apply(TraverseReadsNano.java:216)
    at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:274)
    at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245)
    at org.broadinstitute.sting.gatk.traversals.TraverseReadsNano.traverse(TraverseReadsNano.java:102)
    at org.broadinstitute.sting.gatk.traversals.TraverseReadsNano.traverse(TraverseReadsNano.java:56)
    at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:108)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:311)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:245)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:152)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:91)
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A GATK RUNTIME ERROR has occurred (version 2.6-5-gba531bd):
##### ERROR
##### ERROR Please check the documentation guide to see if this is a known problem
##### ERROR If not, please post the error, with stack trace, to the GATK forum
##### ERROR Visit our website and forum for extensive documentation and answers to
##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
##### ERROR
##### ERROR MESSAGE: Insufficient buffer size for Xs overhanging genome -- expand BUFFER
##### ERROR ------------------------------------------------------------------------------------------

Thank you in advance


Created 2013-05-21 18:08:59 | Updated | Tags: tutorials error
Comments (3)

On the forum page

http://gatkforums.broadinstitute.org/discussion/1209/how-to-run-the-gatk-for-the-first-time#latest

there are two examples. The first runs fine. The second generates this error

MESSAGE: Bad input: We encountered a non-standard non-IUPAC base in the provided reference: '10'

but the input files are the same. I only changed "Reads" to "Loci" in the command. I am running Unix so I do not need to retype the entire command. This command works fine

java -jar GenomeAnalysisTK.jar -T CountReads -R exampleFASTA.fasta -I exampleBAM.bam

This command produces the error

java -jar GenomeAnalysisTK.jar -T CountLoci -R exampleFASTA.fasta -I exampleBAM.bam -o output.txt

Any suggestions?
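
A guess, and only a guess: the '10' in the message may be the numeric code of the offending character rather than literal text. Code 10 is a line feed, which would point towards line endings, blank lines, or index files that no longer match the FASTA, rather than a bad base as such. The Python sketch below scans the sequence lines of a FASTA for characters outside the IUPAC nucleotide alphabet and reports their codes; `non_iupac_characters` is a hypothetical helper name.

```python
IUPAC = set("ACGTURYSWKMBDHVN")


def non_iupac_characters(lines):
    """Yield (line number, character code) for sequence characters outside the IUPAC set."""
    for number, line in enumerate(lines, start=1):
        if line.startswith(">"):
            continue  # header line, not sequence
        for char in line.rstrip("\n"):
            if char.upper() not in IUPAC:
                yield number, ord(char)
```

To keep carriage returns visible, open the file with `open("exampleFASTA.fasta", newline="")` before passing it in. If the scan comes back clean, regenerating the `.fai` and `.dict` files for the reference would be the next thing to try.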


Created 2013-04-03 17:27:26 | Updated | Tags: gatk error bqrs
Comments (3)

Hi all, I'm trying to perform BQSR on a BAM file I have. Unfortunately I get this error:

##### ERROR ------------------------------------------------------------------------------------------
##### ERROR stack trace 
org.broadinstitute.sting.utils.exceptions.ReviewedStingException: START (0) > (-1) STOP -- this should never happen, please check read: HWI-ST1296:110:C1P16ACXX:8:2116:15747:68300 2/2 101b aligned read. (CIGAR: 5D74M27S)

The culprit appears to be this read pair:

HWI-ST1296:110:C1P16ACXX:8:2116:15747:68300 147 chr2    230419662   24  5D74M27S    =   230419667   -74 GAAGGGAAGGGAAGGGAAGGGAAGGGAAGGGAAGGGAAGGGTGAAAGGAAGGGAAAAGAAAAAGGAAAGGAAGGCAATCCCTGCCCAGGTTCTTAATTTTC   #####A>:3CC>;=>DC@;>66;.3@DAA>CA@77.)((./))@FBB.<<??/9*0*******00*?1***1*1)1)<)B3++2+2+22++2222+4++B=   PG:Z:MarkDuplicates RG:Z:GB_L008.1  NM:i:9  AS:i:43 XS:i:45
HWI-ST1296:110:C1P16ACXX:8:2116:15747:68300 99  chr2    230419667   53  83M18S  =   230419662   74  GAAGGGAAGGGAAGGGAAGGGAAGGGAAGGGAAGGGAAGGGAGAAAGGAAGGAAAAAGAGAAAGGGAAGGAAGGAAATTCATGCTCAGTATCTAATTTTTA   ??;ADDDDHBF3DA=GBGB@D;EFC<3CDHBHICDEGEGIE=@@A@D@;C;;2?@7;=7>>;5>;;@B1<@CB<?>>::4++>@@>CC44>>@DC(:CA##   PG:Z:MarkDuplicates RG:Z:GB_L008.1  NM:i:0  AS:i:83 XS:i:50

Program arguments were:

-R /lustre1/genomes/hg19/fa/hg19.fa -knownSites /lustre1/genomes/hg19/annotation/dbSNP-137.chr.vcf -I GB_dedup.realign.bam -T BaseRecalibrator --covariate QualityScoreCovariate --covariate CycleCovariate --covariate ContextCovariate --covariate ReadGroupCovariate --unsafe ALLOW_SEQ_DICT_INCOMPATIBILITY -nct 64 -o jobdir/GB.grp

I'm running the latest nightly build of GATK. Any hint is very much appreciated.

d
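
The first read of the pair has the CIGAR `5D74M27S`, meaning its alignment begins with a deletion, which is an unusual thing for an aligner or realigner to emit. Whether that is what triggers the exception is an assumption, but it is easy to screen a file for such reads. A minimal Python sketch follows; `parse_cigar` and `edge_deletion` are illustrative names.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operation) pairs."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Rebuild the string to make sure nothing was silently skipped.
    if "".join(f"{n}{op}" for n, op in ops) != cigar:
        raise ValueError(f"malformed CIGAR: {cigar!r}")
    return ops


def edge_deletion(cigar):
    """Return True if, ignoring clips, the alignment starts or ends with a deletion."""
    core = [op for _, op in parse_cigar(cigar) if op not in "SH"]
    return bool(core) and (core[0] == "D" or core[-1] == "D")
```

For the pair above, `edge_deletion("5D74M27S")` is true and `edge_deletion("83M18S")` is false. Unmapped reads with a CIGAR of `*` raise `ValueError` in this sketch and would need to be skipped first.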


Created 2013-03-26 20:46:26 | Updated 2013-03-27 16:21:14 | Tags: baserecalibrator error start stop
Comments (30)

Hello,

Sorry if I missed the same problem in other threads on the forum, but we are having trouble running BaseRecalibrator on one sample and I couldn't find the solution.

I tried many steps, and here is what I've found so far:

1 - Other samples work fine

2 - Running Picard ValidateSamFile on realigned.bam (after IndelRealigner) gives many errors:
2a - Mate negative strand flag does not match read negative strand flag of mate
2b - Mate alignment does not match alignment start of mate
2c - Value was put into PairInfoMap more than once. (fatal)

3 - Running BaseRecalibrator with option -L 1:428-249250621 works fine!

Since -L works fine, I ruled out a problem in the VCF files and the reference file. I don't know how to take this investigation further, since the realigned.bam from GATK 1 also gives me the errors in (2), and those errors are negligible compared to the total number of reads.

The big difference here is that we are using bwa7.

Any ideas? Thanks!

(I'm filtering out the "secondary hits" given by bwa7 and will update this thread; if it works, it may be helpful in the future.)
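
For reference, secondary alignments are marked in the SAM FLAG field with bit 0x100 (decimal 256). The Python sketch below shows the filter on SAM text lines; the function names are made up for illustration, and in practice `samtools view -F 256` performs the same selection directly on a BAM file.

```python
SECONDARY = 0x100  # SAM FLAG bit: not the primary alignment


def is_secondary(flag):
    """Return True if the FLAG value has the secondary-alignment bit set."""
    return bool(flag & SECONDARY)


def drop_secondary(sam_lines):
    """Yield header lines and all records that are not secondary alignments."""
    for line in sam_lines:
        if line.startswith("@"):
            yield line  # header lines pass through untouched
            continue
        flag = int(line.split("\t", 2)[1])  # FLAG is the second column
        if not is_secondary(flag):
            yield line
```

Multiple records for the same read end would be one way to end up with the "Value was put into PairInfoMap more than once" error from ValidateSamFile, though that link is an assumption here.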

GATK output:

INFO 14:11:47,441 HelpFormatter - --------------------------------------------------------------------------------
INFO 14:11:47,443 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.4-9-g532efad, Compiled 2013/03/19 07:35:36
INFO 14:11:47,443 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 14:11:47,443 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 14:11:47,447 HelpFormatter - Program Args: -nct 8 -T BaseRecalibrator -I /mnt/work/rlb/pac661825//OUT_661825.realigned.bam -R ../data/databases//1KGP/GRCh37_female_exome_mt1kg.fasta --knownSites ../data/databases//dbSNP/dbSNP_137/00-All.vcf -o /mnt/work/rlb/pac661825//OUT_661825.grp
INFO 14:11:47,447 HelpFormatter - Date/Time: 2013/03/26 14:11:47
INFO 14:11:47,447 HelpFormatter - --------------------------------------------------------------------------------
INFO 14:11:47,447 HelpFormatter - --------------------------------------------------------------------------------
INFO 14:11:47,458 ArgumentTypeDescriptor - Dynamically determined type of ../data/databases/dbSNP/dbSNP_137/00-All.vcf to be VCF
INFO 14:11:47,500 GenomeAnalysisEngine - Strictness is SILENT
INFO 14:11:47,558 GenomeAnalysisEngine - Downsampling Settings: No downsampling
INFO 14:11:47,565 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 14:11:47,577 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.01
INFO 14:11:47,587 RMDTrackBuilder - Loading Tribble index from disk for file ../data/databases/dbSNP/dbSNP_137/00-All.vcf
INFO 14:11:47,704 MicroScheduler - Running the GATK in parallel mode with 8 total threads, 8 CPU thread(s) for each of 1 data thread(s), of 8 processors available on this machine
INFO 14:11:47,745 GenomeAnalysisEngine - Creating shard strategy for 1 BAM files
INFO 14:11:47,750 GenomeAnalysisEngine - Done creating shard strategy
INFO 14:11:47,750 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO 14:11:47,750 ProgressMeter - Location processed.reads runtime per.1M.reads completed total.runtime remaining
INFO 14:11:47,773 BaseRecalibrator - The covariates being used here:
INFO 14:11:47,773 BaseRecalibrator - ReadGroupCovariate
INFO 14:11:47,773 BaseRecalibrator - QualityScoreCovariate
INFO 14:11:47,773 BaseRecalibrator - ContextCovariate
INFO 14:11:47,774 ContextCovariate - Context sizes: base substitution model 2, indel substitution model 3
INFO 14:11:47,774 BaseRecalibrator - CycleCovariate
INFO 14:11:47,776 ReadShardBalancer$1 - Loading BAM index data for next contig
INFO 14:11:47,777 ReadShardBalancer$1 - Done loading BAM index data for next contig
INFO 14:12:18,626 ProgressMeter - 1:15956928 1.10e+06 30.0 s 28.0 s 0.5% 95.1 m 94.6 m
INFO 14:12:48,655 ProgressMeter - 1:34102053 2.70e+06 60.0 s 22.0 s 1.1% 89.0 m 88.0 m
INFO 14:13:18,685 ProgressMeter - 1:59096606 4.50e+06 90.0 s 20.0 s 1.9% 77.1 m 75.6 m
INFO 14:13:48,714 ProgressMeter - 1:103467532 5.90e+06 120.0 s 20.0 s 3.4% 58.7 m 56.7 m
INFO 14:14:18,745 ProgressMeter - 1:153234111 7.50e+06 2.5 m 20.0 s 5.0% 49.5 m 47.0 m
INFO 14:14:48,774 ProgressMeter - 1:172414433 9.30e+06 3.0 m 19.0 s 5.7% 53.1 m 50.1 m
INFO 14:15:19,054 ProgressMeter - 1:208266349 1.10e+07 3.5 m 19.0 s 6.9% 51.3 m 47.8 m
INFO 14:15:49,095 ProgressMeter - 1:247611815 1.27e+07 4.0 m 19.0 s 8.2% 49.3 m 45.2 m
INFO 14:15:56,507 GATKRunReport - Uploaded run statistics report to AWS S3

##### ERROR ------------------------------------------------------------------------------------------
##### ERROR stack trace
org.broadinstitute.sting.utils.exceptions.ReviewedStingException: START (0) > (-1) STOP -- this should never happen -- call Mauricio!
    at org.broadinstitute.sting.utils.clipping.ReadClipper.hardClipByReferenceCoordinates(ReadClipper.java:537)
    at org.broadinstitute.sting.utils.clipping.ReadClipper.hardClipByReferenceCoordinatesLeftTail(ReadClipper.java:176)
    at org.broadinstitute.sting.utils.clipping.ReadClipper.hardClipAdaptorSequence(ReadClipper.java:389)
    at org.broadinstitute.sting.utils.clipping.ReadClipper.hardClipAdaptorSequence(ReadClipper.java:392)
    at org.broadinstitute.sting.gatk.walkers.bqsr.BaseRecalibrator.map(BaseRecalibrator.java:244)
    at org.broadinstitute.sting.gatk.walkers.bqsr.BaseRecalibrator.map(BaseRecalibrator.java:131)
    at org.broadinstitute.sting.gatk.traversals.TraverseReadsNano$TraverseReadsMap.apply(TraverseReadsNano.java:230)
    at org.broadinstitute.sting.gatk.traversals.TraverseReadsNano$TraverseReadsMap.apply(TraverseReadsNano.java:218)
    at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler$ReadMapReduceJob.run(NanoScheduler.java:471)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:679)
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A GATK RUNTIME ERROR has occurred (version 2.4-9-g532efad):
##### ERROR
##### ERROR Please visit the wiki to see if this is a known problem
##### ERROR If not, please post the error, with stack trace, to the GATK forum
##### ERROR Visit our website and forum for extensive documentation and answers to
##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
##### ERROR
##### ERROR MESSAGE: START (0) > (-1) STOP -- this should never happen -- call Mauricio!
##### ERROR ------------------------------------------------------------------------------------------

Created 2013-02-20 11:39:12 | Updated | Tags: printreads error
Comments (1)

Hi all, I am having trouble understanding an error produced by the PrintReads tool. This is my command line (the recal.grp file was produced correctly by the BaseRecalibrator tool):

java -jar /Archive/Software/GATK2.1-8/GenomeAnalysisTK.jar -T PrintReads -I /Path-to-BAMfile -BQSR /Path-to-recal.grp -o /Path-to-output

The error output is:

ERROR ------------------------------------------------------------------------------------------
ERROR A USER ERROR has occurred (version 2.1-8-g5efb575):
ERROR The invalid arguments or inputs must be corrected before the GATK can proceed
ERROR Please do not post this error to the GATK forum
ERROR
ERROR See the documentation (rerun with -h) for this tool to view allowable command-line arguments.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: Walker requires a reference but none was provided.
ERROR ------------------------------------------------------------------------------------------

Since no further information is given, I cannot work out what the issue is. Thanks for the help! Giuliano
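The message indicates that PrintReads was launched without the `-R` reference argument, which the command line above does not include. A minimal sketch of the same command with a reference added follows. `/Path-to-reference.fasta` is a placeholder that does not appear in the original post (presumably it should be the FASTA used for alignment and for BaseRecalibrator), and `has_reference_arg` is a hypothetical helper used only to check an argument list before launching.

```shell
# Hypothetical helper: succeeds if the argument list contains -R.
has_reference_arg() {
  for arg in "$@"; do
    [ "$arg" = "-R" ] && return 0
  done
  return 1
}

# Same PrintReads arguments as in the post, plus a placeholder reference path.
set -- -T PrintReads \
       -R /Path-to-reference.fasta \
       -I /Path-to-BAMfile \
       -BQSR /Path-to-recal.grp \
       -o /Path-to-output

if has_reference_arg "$@"; then
  # Print the command instead of running it; remove "echo" to execute.
  echo java -jar /Archive/Software/GATK2.1-8/GenomeAnalysisTK.jar "$@"
else
  echo "missing -R <reference.fasta>" >&2
fi
```

The check is deliberately simple: it only looks for the short `-R` flag and does not validate that the file exists.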


Created 2013-01-25 14:29:52 | Updated | Tags: error runtime
Comments (1)

INFO 17:40:38,765 ProgressMeter - chrX:147886444 3.21e+07 5.5 h 10.3 m 97.9% 5.6 h 7.1 m
INFO 17:41:08,775 ProgressMeter - chrX:151653849 3.21e+07 5.5 h 10.3 m 98.0% 5.6 h 6.7 m
INFO 17:41:38,785 ProgressMeter - chrX:153812787 3.22e+07 5.5 h 10.3 m 98.1% 5.6 h 6.5 m

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.NoClassDefFoundError: net/sf/samtools/util/CloserUtil
    at net.sf.picard.util.PeekableIterator.close(PeekableIterator.java:46)
    at net.sf.picard.sam.MergingSamRecordIterator.addIfNotEmpty(MergingSamRecordIterator.java:169)
    at net.sf.picard.sam.MergingSamRecordIterator.next(MergingSamRecordIterator.java:125)
    at net.sf.picard.sam.MergingSamRecordIterator.next(MergingSamRecordIterator.java:39)
    at org.broadinstitute.sting.gatk.iterators.PrivateStringSAMCloseableIterator.next(StingSAMIteratorAdapter.java:100)
    at org.broadinstitute.sting.gatk.iterators.PrivateStringSAMCloseableIterator.next(StingSAMIteratorAdapter.java:84)
    at org.broadinstitute.sting.gatk.datasources.reads.SAMDataSource$ReleasingIterator.next(SAMDataSource.java:1091)
    at org.broadinstitute.sting.gatk.datasources.reads.SAMDataSource$ReleasingIterator.next(SAMDataSource.java:1057)
    at org.broadinstitute.sting.gatk.filters.CountingFilteringIterator.getNextRecord(CountingFilteringIterator.java:105)
    at org.broadinstitute.sting.gatk.filters.CountingFilteringIterator.next(CountingFilteringIterator.java:81)
    at org.broadinstitute.sting.gatk.filters.CountingFilteringIterator.next(CountingFilteringIterator.java:41)
    at org.broadinstitute.sting.gatk.iterators.PrivateStringSAMCloseableIterator.next(StingSAMIteratorAdapter.java:100)
    at org.broadinstitute.sting.gatk.iterators.PrivateStringSAMCloseableIterator.next(StingSAMIteratorAdapter.java:84)
    at org.broadinstitute.sting.gatk.iterators.VerifyingSamIterator.next(VerifyingSamIterator.java:24)
    at org.broadinstitute.sting.gatk.iterators.VerifyingSamIterator.next(VerifyingSamIterator.java:12)
    at org.broadinstitute.sting.utils.baq.ReadTransformingIterator.next(ReadTransformingIterator.java:36)
    at org.broadinstitute.sting.utils.baq.ReadTransformingIterator.next(ReadTransformingIterator.java:15)
    at net.sf.picard.util.PeekableIterator.advance(PeekableIterator.java:71)
    at net.sf.picard.util.PeekableIterator.next(PeekableIterator.java:57)
    at org.broadinstitute.sting.gatk.datasources.reads.ReadShard.fill(ReadShard.java:135)
    at org.broadinstitute.sting.gatk.datasources.reads.ReadShardBalancer$1.advance(ReadShardBalancer.java:153)
    at org.broadinstitute.sting.gatk.datasources.reads.ReadShardBalancer$1.next(ReadShardBalancer.java:116)
    at org.broadinstitute.sting.gatk.datasources.reads.ReadShardBalancer$1.next(ReadShardBalancer.java:75)
    at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:65)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:281)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:237)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:147)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:91)
Caused by: java.lang.ClassNotFoundException: net.sf.samtools.util.CloserUtil
    at java.net.URLClassLoader$1.run(URLClassLoader.java:199)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
    ... 29 more
Caused by: java.util.zip.ZipException: error reading zip file
    at java.util.zip.ZipFile.read(Native Method)
    at java.util.zip.ZipFile.access$1200(ZipFile.java:29)
    at java.util.zip.ZipFile$ZipFileInputStream.read(ZipFile.java:447)
    at java.util.zip.ZipFile$1.fill(ZipFile.java:230)
    at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:141)
    at java.util.jar.Manifest$FastInputStream.fill(Manifest.java:422)
    at java.util.jar.Manifest$FastInputStream.readLine(Manifest.java:358)
    at java.util.jar.Manifest$FastInputStream.readLine(Manifest.java:390)
    at java.util.jar.Attributes.read(Attributes.java:359)
    at java.util.jar.Manifest.read(Manifest.java:182)
    at java.util.jar.Manifest.<init>(Manifest.java:52)
    at java.util.jar.JarFile.getManifestFromReference(JarFile.java:167)
    at java.util.jar.JarFile.getManifest(JarFile.java:148)
    at sun.misc.URLClassPath$JarLoader$2.getManifest(URLClassPath.java:696)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:228)
    at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
    ... 34 more

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 2.3-9-ge5ebf34):
ERROR
ERROR Please visit the wiki to see if this is a known problem
ERROR If not, please post the error, with stack trace, to the GATK forum
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: net/sf/samtools/util/CloserUtil
ERROR ------------------------------------------------------------------------------------------

Any idea what the solution is?


Created 2013-01-01 21:31:25 | Updated 2013-01-07 19:17:35 | Tags: error programelementdoc
Comments (10)

Dear expert,

When I run GATK with the following command,

java -Xmx10g -Xms10g -jar ~/bin/GenomeAnalysisTK-2.3-4-g57ea19f/GenomeAnalysisTK.jar \
  -T BaseRecalibrator \
  -R Homo_sapiens.GRCh37.63/bak/Homo_sapiens.GRCh37.63.dna.chromosome.fa \
  -knownSites ~/bin/GenomeAnalysisTK-2.3-4-g57ea19f/dbsnp_137.hg19.vcf \
  -I input_4.bam \
  -o mySample_CovarTable_Recal.grp

I got this error:

INFO 22:25:22,825 RMDTrackBuilder - Creating Tribble index in memory for file ~/bin/GenomeAnalysisTK-2.3-4-g57ea19f/dbsnp_137.hg19.vcf

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.NoClassDefFoundError: com/sun/javadoc/ProgramElementDoc ... ... ...

ERROR MESSAGE: com/sun/javadoc/ProgramElementDoc

Created 2012-11-22 10:26:18 | Updated | Tags: baserecalibrator error
Comments (2)

Hi,

I got this error today running BaseRecalibrator:

ERROR stack trace

java.lang.NullPointerException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.hasQueuedPredecessors(AbstractQueuedSynchronizer.java:1453)
    at java.util.concurrent.locks.ReentrantLock$FairSync.tryAcquire(ReentrantLock.java:240)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1158)
    at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:340)
    at java.util.concurrent.PriorityBlockingQueue.take(PriorityBlockingQueue.java:244)
    at org.broadinstitute.sting.utils.nanoScheduler.Reducer.reduceAsMuchAsPossible(Reducer.java:121)
    at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler$MapReduceJob.run(NanoScheduler.java:510)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:636)

The command arguments I used are: -nct 4 -T BaseRecalibrator --intermediate_csv_file inter.csv -I realigned.bam -R Homo_sapiens.GRCh37.68.dna.chromosome.all.fasta -o recal_data.grp --plot_pdf_file recal.pdf -knownSites dbsnp_137.b37.vcf -knownSites Mills_and_1000G_gold_standard.indels.b37.vcf -knownSites 1000G_phase1.indels.b37.vcf --disable_indel_quals

This command has previously worked with other data using the same version of GATK.


Created 2012-11-13 10:13:24 | Updated | Tags: vqsr gatk error
Comments (8)

Hi all, I'm running VariantRecalibrator on a SNP set (47 exomes) and I get this error:

##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A USER ERROR has occurred (version 2.2-3-gde33222): 
##### ERROR The invalid arguments or inputs must be corrected before the GATK can proceed
##### ERROR Please do not post this error to the GATK forum
##### ERROR
##### ERROR See the documentation (rerun with -h) for this tool to view allowable command-line arguments.
##### ERROR Visit our website and forum for extensive documentation and answers to 
##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
##### ERROR
##### ERROR MESSAGE: NaN LOD value assigned. Clustering with this few variants and these annotations is unsafe. Please consider raising the number of variants used to train the negative model (via --percentBadVariants 0.05, for example) or lowering the maximum number of Gaussians to use in the model (via --maxGaussians 4, for example)
##### ERROR ------------------------------------------------------------------------------------------

this is the command line:

    java -Djava.io.tmpdir=/lustre2/scratch/  -Xmx32g -jar /lustre1/tools/bin/GenomeAnalysisTK-2.2-3.jar \
    -T VariantRecalibrator \
    -R /lustre1/genomes/hg19/fa/hg19.fa \
    -input /lustre1/workspace/Ferrari/Carrera/Analysis/UG/bpd_ug.SNP.vcf \
    -resource:hapmap,VCF,known=false,training=true,truth=true,prior=15.0 /lustre1/genomes/hg19/annotation/hapmap_3.3.hg19.sites.vcf.gz \
    -resource:omni,VCF,known=false,training=true,truth=false,prior=12.0 /lustre1/genomes/hg19/annotation/1000G_omni2.5.hg19.sites.vcf.gz \
    -resource:dbsnp,VCF,known=true,training=false,truth=false,prior=6.0 /lustre1/genomes/hg19/annotation/dbSNP-137.chr.vcf -an QD \
    -an HaplotypeScore \
    -an MQRankSum \
    -an ReadPosRankSum \
    -an FS \
    -an MQ \
    -an DP \
    -an QD \
    -an InbreedingCoeff \
    -mode SNP \
    -recalFile /lustre2/scratch/Carrera/Analysis2/snp.ug.recal.csv \
    -tranchesFile /lustre2/scratch/Carrera/Analysis2/snp.ug.tranches \
    -rscriptFile /lustre2/scratch/Carrera/Analysis2/snp.ug.plot.R \
    -U ALLOW_SEQ_DICT_INCOMPATIBILITY \
    --maxGaussians 6

I've already tried decreasing --maxGaussians to 4, and I've also added the --percentBad option (setting it to 0.12, as for INDEL), but I still get the error. I added the -debug option to see what's happening, but apparently it has been removed in GATK 2.2. Any help is appreciated. Thanks.


Created 2012-11-05 08:12:48 | Updated | Tags: unifiedgenotyper qualbydepth variantannotator error
Comments (14)

I annotated the raw indel file (produced by UnifiedGenotyper), 1000G_omni2.5.b37.sites.vcf and hapmap_3.3.b37.sites.vcf with all possible annotations, including QD (QualByDepth), using VariantAnnotator. However, I got an error when I tried to run VariantRecalibrator: it complained that QD was not found in any training variant. Is QD an important annotation for indel filtering? Can it be ignored?

P.S. I did not use the sample BAM file while annotating the training data set.

.
.
.
INFO  15:11:55,999 RMDTrackBuilder - Loading Tribble index from disk for file NCBI_dbsnp_for_GATK.vcf
INFO  15:12:21,650 TraversalEngine -  chr1:128346793        1.98e+07   30.0 s        1.5 s      4.1%        12.1 m    11.6 m
INFO  15:12:51,650 TraversalEngine -  chr9:130658800        5.26e+07   60.0 s        1.1 s     53.9%       111.2 s    51.2 s
INFO  15:13:13,618 VariantDataManager - QD:      mean = NaN      standard deviation = NaN
INFO  15:13:16,417 GATKRunReport - Uploaded run statistics report to AWS S3
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A USER ERROR has occurred (version 2.1-13-g1706365):
##### ERROR The invalid arguments or inputs must be corrected before the GATK can proceed
##### ERROR Please do not post this error to the GATK forum
##### ERROR
##### ERROR See the documentation (rerun with -h) for this tool to view allowable command-line arguments.
##### ERROR Visit our website and forum for extensive documentation and answers to
##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
##### ERROR
##### ERROR MESSAGE: Bad input: Values for QD annotation not detected for ANY training variant in the input callset. VariantAnnotator may be used to add these annotations. See http://www.broadinstitute.org/gsa/wiki/index.php/VariantAnnotator
##### ERROR ------------------------------------------------------------------------------------------
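
The log line `QD: mean = NaN` suggests that no QD values were found at the training sites, and the message itself points at the input callset rather than at the training resources. One rough way to confirm this is to count the records in the input VCF whose INFO column carries a `QD=` key. The sketch below assumes a tab-separated VCF on standard input; `count_qd_records` is a hypothetical helper name, and the inline records are made-up examples.

```shell
# Hypothetical helper: count VCF records whose INFO field (column 8) has a QD key.
count_qd_records() {
  awk -F '\t' '!/^#/ && $8 ~ /(^|;)QD=/ { n++ } END { print n + 0 }'
}

# Tiny inline example: one record with QD, one without.
printf '##fileformat=VCFv4.1\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n1\t100\t.\tA\tAT\t50\tPASS\tDP=20;QD=2.5\n1\t200\t.\tG\tGC\t40\tPASS\tDP=18\n' | count_qd_records
```

For a real file, the same function can be fed with `count_qd_records < raw_indels.vcf` (the file name is a placeholder). A count of 0 would be consistent with the error; since QD depends on read depth, it is likely that VariantAnnotator needs the sample BAM to compute it.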

Created 2012-11-04 10:16:50 | Updated 2012-11-05 02:28:44 | Tags: haplotypecaller reducereads dbsnp error reviewedstingexception
Comments (33)

Hello dear GATK Team,

when trying to run HaplotypeCaller on my exome files prepared with ReduceReads, I get the error shown below. As you can see, the newest GATK version is used. Also, UnifiedGenotyper does not produce any errors on the exact same data (90 SOLiD exomes created according to Best Practices v4).

##### ERROR ------------------------------------------------------------------------------------------
##### ERROR stack trace 
org.broadinstitute.sting.utils.exceptions.ReviewedStingException: Somehow the requested coordinate is not covered by the read. Too many deletions?
    at org.broadinstitute.sting.utils.sam.ReadUtils.getReadCoordinateForReferenceCoordinate(ReadUtils.java:447)
    at org.broadinstitute.sting.utils.sam.ReadUtils.getReadCoordinateForReferenceCoordinate(ReadUtils.java:396)
    at org.broadinstitute.sting.utils.sam.ReadUtils.getReadCoordinateForReferenceCoordinate(ReadUtils.java:392)
    at org.broadinstitute.sting.gatk.walkers.annotator.DepthOfCoverage.annotate(DepthOfCoverage.java:56)
    at org.broadinstitute.sting.gatk.walkers.annotator.interfaces.InfoFieldAnnotation.annotate(InfoFieldAnnotation.java:24)
    at org.broadinstitute.sting.gatk.walkers.annotator.VariantAnnotatorEngine.annotateContext(VariantAnnotatorEngine.java:223)
    at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:429)
    at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:104)
    at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.processActiveRegion(TraverseActiveRegions.java:249)
    at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.callWalkerMapOnActiveRegions(TraverseActiveRegions.java:204)
    at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.processActiveRegions(TraverseActiveRegions.java:179)
    at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.java:136)
    at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.java:29)
    at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:74)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:281)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:236)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:146)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:93)
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A GATK RUNTIME ERROR has occurred (version 2.2-3-gde33222):
##### ERROR
##### ERROR Please visit the wiki to see if this is a known problem
##### ERROR If not, please post the error, with stack trace, to the GATK forum
##### ERROR Visit our website and forum for extensive documentation and answers to 
##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
##### ERROR
##### ERROR MESSAGE: Somehow the requested coordinate is not covered by the read. Too many deletions?
##### ERROR ------------------------------------------------------------------------------------------

The Command line used (abbreviated):

java -Xmx30g -jar /home/common/GenomeAnalysisTK-2.2-3/GenomeAnalysisTK.jar \
  -R /home/common/hg19/ucschg19/ucsc.hg19.fasta \
  -T HaplotypeCaller \
  -I ReduceReads/XXXXX.ontarget.MarkDups.nRG.reor.Real.Recal.reduced.bam [x90]\
  --dbsnp /home/common/hg19/dbsnp_135.hg19.vcf \
  -o 93Ind_ped_reduced_HC_snps.raw.vcf \
  -ped familys.ped \
  --pedigreeValidationType SILENT \
  -stand_call_conf 20.0 \
  -stand_emit_conf 10.0

Created 2012-09-05 23:34:52 | Updated 2013-01-07 20:43:29 | Tags: phasebytransmission error
Comments (16)

Hi all,

Has anyone else gotten the following:

java.lang.NullPointerException
    at org.broadinstitute.sting.gatk.walkers.phasing.PhaseByTransmission.phaseTrioGenotypes(PhaseByTransmission.java:242)
    at org.broadinstitute.sting.gatk.walkers.phasing.PhaseByTransmission.map(PhaseByTransmission.java:306)
    at org.broadinstitute.sting.gatk.walkers.phasing.PhaseByTransmission.map(PhaseByTransmission.java:35)
    at org.broadinstitute.sting.gatk.traversals.TraverseLoci.traverse(TraverseLoci.java:78)
    at org.broadinstitute.sting.gatk.traversals.TraverseLoci.traverse(TraverseLoci.java:18)
    at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:62)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:225)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:122)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:236)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:149)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:91)

My command line was: java -jar GenomeAnalysisTK.jar -T PhaseByTransmission -V w01.sorted.vcf -o w01.phased.vcf -f "mom+dad=child" -R hg19.fa

Cheers,

Paul


Created 2012-08-26 06:56:33 | Updated 2013-01-07 20:44:36 | Tags: unifiedgenotyper tribble error
Comments (19)

Hi all,

I've been analyzing some Illumina whole-exome sequencing data these days. Yesterday I used the GATK (version 2.0) UnifiedGenotyper to call SNPs and indels with the following command:

run_gatk.sh -T UnifiedGenotyper -R GRCh37/human_g1k_v37.fasta -I GATK_recal_result.bam -glm BOTH --dbsnp reference/dbsnp_134.b37.vcf -stand_call_conf 50 -stand_emit_conf 10 -o raw2.vcf -dcov 200 --num_threads 10

After running this command, I got a VCF file that is very small (when I checked it, I found that the called SNPs and indels are all from chromosome 1). The error message is as follows:

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

org.broadinstitute.sting.utils.exceptions.ReviewedStingException: Unable to merge temporary Tribble output file.
    at org.broadinstitute.sting.gatk.executive.HierarchicalMicroScheduler.mergeExistingOutput(HierarchicalMicroScheduler.java:269)
    at org.broadinstitute.sting.gatk.executive.HierarchicalMicroScheduler.execute(HierarchicalMicroScheduler.java:105)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:269)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:236)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:146)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:93)
Caused by: org.broad.tribble.TribbleException$MalformedFeatureFile: Unable to parse header with error: /rd/tmp/org.broadinstitute.sting.gatk.io.stubs.VariantContextWriterStub8005277156701491219.tmp (Too many open files), for input source: /rd/tmp/org.broadinstitute.sting.gatk.io.stubs.VariantContextWriterStub8005277156701491219.tmp
    at org.broad.tribble.TribbleIndexedFeatureReader.readHeader(TribbleIndexedFeatureReader.java:104)
    at org.broad.tribble.TribbleIndexedFeatureReader.<init>(TribbleIndexedFeatureReader.java:58)
    at org.broad.tribble.AbstractFeatureReader.getFeatureReader(AbstractFeatureReader.java:69)
    at org.broadinstitute.sting.gatk.io.storage.VariantContextWriterStorage.mergeInto(VariantContextWriterStorage.java:182)
    at org.broadinstitute.sting.gatk.io.storage.VariantContextWriterStorage.mergeInto(VariantContextWriterStorage.java:52)
    at org.broadinstitute.sting.gatk.executive.OutputMergeTask.merge(OutputMergeTask.java:48)
    at org.broadinstitute.sting.gatk.executive.HierarchicalMicroScheduler.mergeExistingOutput(HierarchicalMicroScheduler.java:263)
    ... 6 more
Caused by: java.io.FileNotFoundException: /rd/tmp/org.broadinstitute.sting.gatk.io.stubs.VariantContextWriterStub8005277156701491219.tmp (Too many open files)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(FileInputStream.java:120)
    at org.broad.tribble.util.ParsingUtils.openInputStream(ParsingUtils.java:56)
    at org.broad.tribble.TribbleIndexedFeatureReader.readHeader(TribbleIndexedFeatureReader.java:96)
    ... 12 more

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 2.0-39-gd091f72):
ERROR
ERROR Please visit the wiki to see if this is a known problem
ERROR If not, please post the error, with stack trace, to the GATK forum
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: Unable to merge temporary Tribble output file.
ERROR ------------------------------------------------------------------------------------------

Could you please help me solve it? Thanks a lot.
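
The innermost cause in this trace is the operating system's "Too many open files" limit, reached while the temporary outputs of the multi-threaded run were being merged. A rough way to inspect that limit before a run with `--num_threads` is sketched below. The threshold of 4096 and the helper name `check_open_file_limit` are assumptions for illustration only, not GATK recommendations.

```shell
# Hypothetical helper: compare a soft open-file limit against a wanted minimum.
# Usage: check_open_file_limit <current_limit> <wanted_minimum>
check_open_file_limit() {
  current=$1
  wanted=$2
  if [ "$current" = "unlimited" ] || [ "$current" -ge "$wanted" ]; then
    echo "ok"
  else
    echo "low"
  fi
}

# `ulimit -n` reports the soft limit on open file descriptors for this shell.
status=$(check_open_file_limit "$(ulimit -n)" 4096)
echo "open-file limit: $(ulimit -n) ($status)"

# If the result is "low", the soft limit can be raised for this session
# (up to the hard limit shown by `ulimit -Hn`) before starting GATK:
#   ulimit -n 4096
```

Raising the limit and lowering `--num_threads` are the two usual options; which one is appropriate depends on the system and on whether the hard limit can be changed.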


Created 2012-08-25 06:24:57 | Updated 2013-01-07 20:45:01 | Tags: readgroup error
Comments (1)

ERROR MESSAGE: SAM/BAM file genome_110616_SN365_A_s_7_seq_GQJ-1.pe.bam is malformed: SAM file doesn't have any read groups defined in the header. The GATK no longer supports SAM files without read groups

I am very new to GATK. I was trying to run the readcount command and got the error above. What are read groups?
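
In a SAM/BAM file, a read group is a set of reads that came from the same sequencing run or lane. Each read group is declared on an `@RG` line in the header and referenced by the `RG` tag on the individual reads. A minimal sketch for checking whether a header declares any read groups is shown below; `count_read_groups` is a hypothetical helper that counts `@RG` lines on standard input, and the inline headers are made-up examples.

```shell
# Hypothetical helper: count @RG header lines read from standard input.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the exit status clean.
count_read_groups() {
  grep -c '^@RG' || true
}

# Header without read groups, then a header with one read group.
printf '@HD\tVN:1.0\tSO:coordinate\n@SQ\tSN:chr1\tLN:1000\n' | count_read_groups
printf '@HD\tVN:1.0\n@RG\tID:lane7\tSM:sample1\tPL:illumina\n' | count_read_groups
```

For a real file, the header can be piped in with `samtools view -H file.bam | count_read_groups`. If the count is 0, read groups can be added with a tool such as Picard AddOrReplaceReadGroups before running GATK.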

thank you