VariantRecalibrator fails on my customized target re-sequencing data
Posted in Ask the GATK team | Last updated on



I am using GATK to process a dataset produced by customized target re-sequencing, but the VariantRecalibrator step fails.

The run log:

INFO 17:17:55,949 ArgumentTypeDescriptor - Dynamically determined type of ./hg19.liz.final.regions.bed to be BED
INFO 17:17:56,001 HelpFormatter - --------------------------------------------------------------------------------
INFO 17:17:56,001 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.8-1-g932cd3a, Compiled 2013/12/06 16:47:15
INFO 17:17:56,001 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 17:17:56,001 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 17:17:56,007 HelpFormatter - Program Args: -T VariantRecalibrator -R ./hg19.fa -input ./output.raw.snps.indels.vcf -L./hg19.liz.final.regions.bed -resource:hapmap,known=false,training=true,truth=true,prior=15.0 hapmap_3.3.hg19.vcf -resource:dbsnp,known=true,training=false,truth=false,prior=6.0 dbsnp_138.hg19.vcf -an QD -an MQRankSum -an ReadPosRankSum -an FS -an MQ -mode SNP -recalFile output.snp.recal -tranchesFile output.snp.tranches -rscriptFile output.snp.plots.R
INFO 17:17:56,007 HelpFormatter - Date/Time: 2014/02/18 17:17:56
INFO 17:17:56,007 HelpFormatter - --------------------------------------------------------------------------------
INFO 17:17:56,008 HelpFormatter - --------------------------------------------------------------------------------
INFO 17:17:56,024 ArgumentTypeDescriptor - Dynamically determined type of ./output.raw.snps.indels.vcf to be VCF
INFO 17:17:56,030 ArgumentTypeDescriptor - Dynamically determined type of hapmap_3.3.hg19.vcf to be VCF
INFO 17:17:56,033 ArgumentTypeDescriptor - Dynamically determined type of dbsnp_138.hg19.vcf to be VCF
INFO 17:17:56,619 GenomeAnalysisEngine - Strictness is SILENT
INFO 17:17:56,684 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
INFO 17:17:56,703 RMDTrackBuilder - Loading Tribble index from disk for file ./output.raw.snps.indels.vcf
INFO 17:17:56,728 RMDTrackBuilder - Loading Tribble index from disk for file hapmap_3.3.hg19.vcf
INFO 17:17:56,764 RMDTrackBuilder - Loading Tribble index from disk for file dbsnp_138.hg19.vcf
INFO 17:17:56,935 IntervalUtils - Processing 801109 bp from intervals
INFO 17:17:57,012 GenomeAnalysisEngine - Preparing for traversal
INFO 17:17:57,017 GenomeAnalysisEngine - Done preparing for traversal
INFO 17:17:57,017 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO 17:17:57,017 ProgressMeter - Location processed.sites runtime per.1M.sites completed total.runtime remaining
INFO 17:17:57,024 TrainingSet - Found hapmap track: Known = false Training = true Truth = true Prior = Q15.0
INFO 17:17:57,024 TrainingSet - Found dbsnp track: Known = true Training = false Truth = false Prior = Q6.0
INFO 17:17:59,530 VariantDataManager - QD: mean = 21.47 standard deviation = 9.62
INFO 17:17:59,530 VariantDataManager - MQRankSum: mean = -0.02 standard deviation = 1.36
INFO 17:17:59,531 VariantDataManager - ReadPosRankSum: mean = 0.50 standard deviation = 1.05
INFO 17:17:59,531 VariantDataManager - FS: mean = 2.59 standard deviation = 7.85
INFO 17:17:59,532 VariantDataManager - MQ: mean = 41.74 standard deviation = 1.10
INFO 17:17:59,538 VariantDataManager - Annotations are now ordered by their information content: [MQ, QD, FS, ReadPosRankSum, MQRankSum]
INFO 17:17:59,539 VariantDataManager - Training with 402 variants after standard deviation thresholding.
WARN 17:17:59,539 VariantDataManager - WARNING: Training with very few variant sites! Please check the model reporting PDF to ensure the quality of the model is reliable.
INFO 17:17:59,542 GaussianMixtureModel - Initializing model with 100 k-means iterations...
INFO 17:17:59,658 VariantRecalibratorEngine - Finished iteration 0.
INFO 17:17:59,715 VariantRecalibratorEngine - Finished iteration 5. Current change in mixture coefficients = 0.20178
INFO 17:17:59,726 VariantRecalibratorEngine - Finished iteration 10. Current change in mixture coefficients = 0.05225
INFO 17:17:59,736 VariantRecalibratorEngine - Finished iteration 15. Current change in mixture coefficients = 0.05773
INFO 17:17:59,746 VariantRecalibratorEngine - Finished iteration 20. Current change in mixture coefficients = 0.03079
INFO 17:17:59,755 VariantRecalibratorEngine - Finished iteration 25. Current change in mixture coefficients = 0.05754
INFO 17:17:59,765 VariantRecalibratorEngine - Finished iteration 30. Current change in mixture coefficients = 0.09399
INFO 17:17:59,774 VariantRecalibratorEngine - Finished iteration 35. Current change in mixture coefficients = 0.19735
INFO 17:17:59,784 VariantRecalibratorEngine - Finished iteration 40. Current change in mixture coefficients = 0.00727
INFO 17:17:59,793 VariantRecalibratorEngine - Finished iteration 45. Current change in mixture coefficients = 0.00804
INFO 17:17:59,803 VariantRecalibratorEngine - Finished iteration 50. Current change in mixture coefficients = 0.00220
INFO 17:17:59,805 VariantRecalibratorEngine - Convergence after 51 iterations!
INFO 17:17:59,815 VariantDataManager - Training with worst 0 scoring variants --> variants with LOD <= -5.0000.

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.NullPointerException
    at org.broadinstitute.sting.gatk.walkers.variantrecalibration.VariantRecalibratorEngine.generateModel(VariantRecalibratorEngine.java:83)
    at org.broadinstitute.sting.gatk.walkers.variantrecalibration.VariantRecalibrator.onTraversalDone(VariantRecalibrator.java:359)
    at org.broadinstitute.sting.gatk.walkers.variantrecalibration.VariantRecalibrator.onTraversalDone(VariantRecalibrator.java:139)
    at org.broadinstitute.sting.gatk.executive.Accumulator$StandardAccumulator.finishTraversal(Accumulator.java:129)
    at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:116)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:313)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:245)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:152)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:91)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 2.8-1-g932cd3a):
ERROR
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: Code exception (see stack trace for error itself)
ERROR ------------------------------------------------------------------------------------------
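
The log reports two training-set sizes just before the exception: 402 variants after standard deviation thresholding, and "worst 0 scoring variants". For anyone comparing runs, here is a minimal Python sketch that pulls those two numbers out of a saved log. `training_counts` is a hypothetical helper written for this post, not part of GATK, and the regular expressions assume the exact message wording shown in the log above.

```python
import re

# Hypothetical helper (not part of GATK): extract the training-set sizes that
# VariantRecalibrator prints, assuming the message wording seen in this log.
TRAINING_RE = re.compile(
    r"Training with (\d+) variants after standard deviation thresholding"
)
WORST_RE = re.compile(r"Training with worst (\d+) scoring variants")


def training_counts(log_text):
    """Return (training_variants, worst_scoring_variants).

    Each element is None when the corresponding line is absent from the log.
    """
    training = TRAINING_RE.search(log_text)
    worst = WORST_RE.search(log_text)
    return (
        int(training.group(1)) if training else None,
        int(worst.group(1)) if worst else None,
    )


# Two lines copied from the log in this post.
sample = (
    "INFO 17:17:59,539 VariantDataManager - Training with 402 variants "
    "after standard deviation thresholding.\n"
    "INFO 17:17:59,815 VariantDataManager - Training with worst 0 scoring "
    "variants --> variants with LOD <= -5.0000.\n"
)

print(training_counts(sample))  # prints (402, 0)
```

To use it on a real run, read the log file into a string and pass it to `training_counts`.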

Is GATK suitable for analyzing target re-sequencing data (SNP calling and indel calling)?

Could you please help me out? Thanks!

zhoujj

