Hi, I am new to GATK. For days I have been trying to figure out a strange error that I haven't been able to resolve.
Process so far:
1. Run UnifiedGenotyper per chromosome, using the -L option, on ~130 samples.
2. Merge all output VCF files into one (compress and index each VCF file with tabix, then use vcf-concat to merge all chr* files).
3. Use a Perl script to sort the merged VCF file by the reference file order, i.e. chr1, 2, 3 ... M.
4. Split the Merged.sorted.vcf file into INDEL and SNV files.
5. Run VQSR on each file (SNV and INDEL).
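For reference, the sorting in step 3 can be sketched roughly as follows. This is a minimal Python illustration, not the Perl script actually used: the function name `sort_vcf_lines` and the `contig_order` argument are hypothetical, and the sketch assumes tab-separated VCF records whose contig names all appear in the reference order list. Positions are compared as integers, since a plain text sort would place 100 before 20.

```python
def sort_vcf_lines(lines, contig_order):
    """Return header lines followed by records sorted by (contig rank, position).

    `contig_order` is the list of contig names in reference order
    (hypothetical input, e.g. read from the reference .fai or .dict file).
    """
    rank = {name: i for i, name in enumerate(contig_order)}
    header = [ln for ln in lines if ln.startswith("#")]
    records = [ln for ln in lines if ln.strip() and not ln.startswith("#")]

    def key(ln):
        # CHROM and POS are the first two tab-separated VCF columns.
        chrom, pos = ln.split("\t", 2)[:2]
        return rank[chrom], int(pos)  # numeric, not lexical, comparison

    return header + sorted(records, key=key)
```

A contig missing from `contig_order` raises a `KeyError`, which is deliberate here: it surfaces a naming mismatch between the VCF and the reference instead of silently misplacing records.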
Error that I get: During ApplyRecalibration for INDELs, I get an error in chr9 stating that a record starting at coordinate A was seen after a record starting at coordinate B, where A < B. A and B are different values each time, but the error always occurs in chr9. I checked my input Merged.sorted.indel.vcf file around coordinates A and B, and the file is in order. I checked the recal file and it is also in order, so I can't figure out where the error is coming from. The strange thing is that the error is reported when GATK is creating the output file, not while it is computing or applying the recalibration.
Has anyone encountered such a situation before, or does anyone have ideas I should try to resolve the error? I don't get any errors with SNVs, only with INDELs.
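The manual order check described above can also be run over the whole file rather than just around A and B. The following is a minimal Python sketch under stated assumptions: `find_out_of_order` is a hypothetical helper, the input is assumed to be an uncompressed, tab-separated VCF, and only the order of positions within a contig is checked (not the order of the contigs themselves).

```python
def find_out_of_order(lines):
    """Yield (line_number, previous, current) for every record whose start
    position is smaller than that of the preceding record on the same contig."""
    prev_chrom, prev_pos = None, None
    for n, ln in enumerate(lines, start=1):
        if not ln.strip() or ln.startswith("#"):
            continue  # skip blank lines and header lines
        chrom, pos = ln.split("\t", 2)[:2]
        pos = int(pos)
        if chrom == prev_chrom and pos < prev_pos:
            yield n, (prev_chrom, prev_pos), (chrom, pos)
        prev_chrom, prev_pos = chrom, pos
```

Running `list(find_out_of_order(open(path)))` on the input VCF and on the recal file should return an empty list if both are sorted within each contig; `path` here stands for whichever file is being checked.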
Exact error message:
org.broadinstitute.sting.utils.exceptions.ReviewedStingException: Unable to merge temporary Tribble output file.
    at org.broadinstitute.sting.gatk.executive.HierarchicalMicroScheduler.mergeExistingOutput(HierarchicalMicroScheduler.java:259)
    at org.broadinstitute.sting.gatk.executive.HierarchicalMicroScheduler.execute(HierarchicalMicroScheduler.java:103)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:248)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:236)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:146)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:92)
Caused by: org.broad.tribble.TribbleException$MalformedFeatureFile: We saw a record with a start of chr9:33020249 after a record with a start of chr9:34987121, for input source: /data2/bsi/secondary/multisample/Merged.variant.filter.INDEL_2.vcf
    at org.broad.tribble.index.DynamicIndexCreator.addFeature(DynamicIndexCreator.java:164)
    at org.broadinstitute.sting.utils.codecs.vcf.IndexingVCFWriter.add(IndexingVCFWriter.java:118)
    at org.broadinstitute.sting.utils.codecs.vcf.StandardVCFWriter.add(StandardVCFWriter.java:163)
    at org.broadinstitute.sting.gatk.io.storage.VCFWriterStorage.mergeInto(VCFWriterStorage.java:120)
    at org.broadinstitute.sting.gatk.io.storage.VCFWriterStorage.mergeInto(VCFWriterStorage.java:26)
    at org.broadinstitute.sting.gatk.executive.OutputMergeTask.merge(OutputMergeTask.java:48)
    at org.broadinstitute.sting.gatk.executive.HierarchicalMicroScheduler.mergeExistingOutput(HierarchicalMicroScheduler.java:253)
    ... 6 more
/usr/java/latest/bin/java -Xmx6g -XX:-UseGCOverheadLimit -Xms512m \
    -jar /projects/apps/alignment/GenomeAnalysisTK/latest/GenomeAnalysisTK.jar \
    -R /data2/reference/sequence/human/ncbi/37.1/allchr.fa \
    -et NO_ET \
    -K /projects/apps/alignment/GenomeAnalysisTK/latest/Hossain.Asif_mayo.edu.key \
    -mode INDEL \
    -T ApplyRecalibration \
    -nt 4 \
    -input /data2/secondary/multisample/Merged.variant.INDEL.vcf.temp \
    -recalFile /data2/secondary/multisample/temp/Merged.variant.INDEL.recal \
    -tranchesFile /data2/secondary/multisample/temp/Merged.variant.INDEL.tranches \
    -o /data2/secondary/multisample/Merged.variant.filter.INDEL_2.vcf
GATK versions tried: 1.7 and 1.6.7
I'm trying to run CountCovariates and am getting the following error:
"java.lang.IllegalStateException: This method hasn't been implemented yet for GAIIx"
I used the following program arguments:
Program Args: -T CountCovariates -R /home/adlab/ngs_data/databases/human_g1k_v37.fasta -I D5_b37_aligned_sorted_realn_DupRm.bam -knownSites:dbsnp,VCF /home/adlab/ngs_data/databases/dbsnp_132.b37.vcf -recalFile D5_b37_aligned_sorted_realn_DupRm_countcovariates.csv -cov ReadGroupCovariate -cov QualityScoreCovariate -cov CycleCovariate -cov DinucCovariate -standard
While troubleshooting, I found that I added the read groups in bwa sampe with the platform info set to "GAIIx", as given below, and that this is the root of the error.
What exactly is wrong in my argument list or read group syntax, and what could be the cause of the error? Kindly help.
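One way to see which platform value each read group actually carries is to inspect the @RG lines of the BAM header (for example, the output of `samtools view -H`). The sketch below is a hypothetical Python helper, not part of GATK. The set `SAM_PLATFORMS` holds the PL values listed in the SAM specification of that era; that GATK expects one of these names is an assumption here, not something confirmed by the error message.

```python
# PL values listed in the SAM specification (v1.4 era). Whether GATK
# accepts exactly this set is an assumption made for illustration.
SAM_PLATFORMS = {"CAPILLARY", "LS454", "ILLUMINA", "SOLID",
                 "HELICOS", "IONTORRENT", "PACBIO"}


def read_group_platforms(header_lines):
    """Map each @RG ID to its PL value (None if the PL tag is absent)."""
    platforms = {}
    for ln in header_lines:
        if not ln.startswith("@RG"):
            continue
        fields = ln.rstrip("\n").split("\t")[1:]
        tags = dict(f.split(":", 1) for f in fields if ":" in f)
        platforms[tags.get("ID")] = tags.get("PL")
    return platforms


def nonstandard_platforms(header_lines):
    """Return the read groups whose PL is missing or not in SAM_PLATFORMS."""
    return {rg: pl for rg, pl in read_group_platforms(header_lines).items()
            if pl is None or pl.upper() not in SAM_PLATFORMS}
```

With a header containing `@RG  ID:D5  SM:D5  PL:GAIIx` (tab-separated), `nonstandard_platforms` would report `{"D5": "GAIIx"}`, pointing at the read group whose platform string is an instrument model rather than one of the listed platform names.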
Hi, I want to run GATK in Galaxy and the required version is 1.4. Can you tell me where I can download this from? I've got the source for version 1.4 from github, so I can build it if need be, but I was wondering if there's a repository for older binaries. Many thanks, Graham