VQSR error: NaN LOD value assigned
Posted in Ask the GATK team | Last updated on 2014-02-26 16:17:03


INFO  17:05:50,124 GenomeAnalysisEngine - Preparing for traversal 
INFO  17:05:50,144 GenomeAnalysisEngine - Done preparing for traversal 
INFO  17:05:50,144 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING] 
INFO  17:05:50,145 ProgressMeter -        Location processed.sites  runtime per.1M.sites completed total.runtime remaining 
INFO  17:05:50,166 TrainingSet - Found hapmap track:    Known = false   Training = true     Truth = true    Prior = Q15.0 
INFO  17:05:50,166 TrainingSet - Found omni track:  Known = false   Training = true     Truth = false   Prior = Q12.0 
INFO  17:05:50,167 TrainingSet - Found dbsnp track:     Known = true    Training = false    Truth = false   Prior = Q6.0 
INFO  17:06:20,149 ProgressMeter -     1:216404576        2.04e+06   30.0 s       14.0 s      7.0%         7.2 m     6.7 m 
INFO  17:06:50,151 ProgressMeter -     2:223579089        4.70e+06   60.0 s       12.0 s     15.2%         6.6 m     5.6 m 
INFO  17:07:20,159 ProgressMeter -      4:33091662        7.43e+06   90.0 s       12.0 s     23.3%         6.4 m     4.9 m 
INFO  17:07:50,161 ProgressMeter -      5:92527959        1.00e+07  120.0 s       11.0 s     31.4%         6.4 m     4.4 m 
INFO  17:08:20,162 ProgressMeter -       7:1649969        1.30e+07    2.5 m       11.0 s     39.8%         6.3 m     3.8 m 
INFO  17:08:50,168 ProgressMeter -     8:106975025        1.58e+07    3.0 m       11.0 s     48.4%         6.2 m     3.2 m 
INFO  17:09:20,169 ProgressMeter -    10:101433561        1.87e+07    3.5 m       11.0 s     57.4%         6.1 m     2.6 m 
INFO  17:09:50,170 ProgressMeter -     12:99334147        2.16e+07    4.0 m       11.0 s     66.1%         6.1 m     2.1 m 
INFO  17:10:20,171 ProgressMeter -     15:30577012        2.41e+07    4.5 m       11.0 s     75.4%         6.0 m    88.0 s 
INFO  17:10:52,409 ProgressMeter -      18:8763648        2.68e+07    5.0 m       11.0 s     83.5%         6.0 m    59.0 s 
INFO  17:11:22,410 ProgressMeter -     22:31598896        2.97e+07    5.5 m       11.0 s     92.2%         6.0 m    27.0 s 
INFO  17:11:33,135 VariantDataManager - QD:      mean = 17.48    standard deviation = 9.03 
INFO  17:11:33,516 VariantDataManager - HaplotypeScore:      mean = 3.03     standard deviation = 2.62 
INFO  17:11:33,882 VariantDataManager - MQ:      mean = 52.40    standard deviation = 2.98 
INFO  17:11:34,253 VariantDataManager - MQRankSum:   mean = 0.31     standard deviation = 1.02 
INFO  17:11:37,973 VariantDataManager - Training with 1024360 variants after standard deviation thresholding. 
INFO  17:11:37,977 GaussianMixtureModel - Initializing model with 30 k-means iterations... 
INFO  17:11:53,065 ProgressMeter - GL000202.1:10465        3.08e+07    6.0 m       11.0 s     99.8%         6.0 m     0.0 s 
INFO  17:12:09,041 VariantRecalibratorEngine - Finished iteration 0. 
INFO  17:12:23,066 ProgressMeter - GL000202.1:10465        3.08e+07    6.5 m       12.0 s     99.8%         6.5 m     0.0 s 
INFO  17:12:30,492 VariantRecalibratorEngine - Finished iteration 5.    Current change in mixture coefficients = 0.08178 
INFO  17:12:51,054 VariantRecalibratorEngine - Finished iteration 10.   Current change in mixture coefficients = 0.05869 
INFO  17:12:53,072 ProgressMeter - GL000202.1:10465        3.08e+07    7.0 m       13.0 s     99.8%         7.0 m     0.0 s 
INFO  17:13:11,207 VariantRecalibratorEngine - Finished iteration 15.   Current change in mixture coefficients = 0.15237 
INFO  17:13:23,073 ProgressMeter - GL000202.1:10465        3.08e+07    7.5 m       14.0 s     99.8%         7.5 m     0.0 s 
INFO  17:13:31,503 VariantRecalibratorEngine - Finished iteration 20.   Current change in mixture coefficients = 0.13505 
INFO  17:13:51,768 VariantRecalibratorEngine - Finished iteration 25.   Current change in mixture coefficients = 0.05729 
INFO  17:13:53,080 ProgressMeter - GL000202.1:10465        3.08e+07    8.0 m       15.0 s     99.8%         8.0 m     0.0 s 
INFO  17:14:11,372 VariantRecalibratorEngine - Finished iteration 30.   Current change in mixture coefficients = 0.02607 
INFO  17:14:23,081 ProgressMeter - GL000202.1:10465        3.08e+07    8.5 m       16.0 s     99.8%         8.5 m     0.0 s 
INFO  17:14:24,730 VariantRecalibratorEngine - Convergence after 33 iterations! 
INFO  17:14:27,037 VariantRecalibratorEngine - Evaluating full set of 3860460 variants... 
INFO  17:14:51,111 VariantDataManager - Found 0 variants overlapping bad sites training tracks. 
INFO  17:14:55,071 VariantDataManager - Additionally training with worst 1000 scoring variants --> 1000 variants with LOD <= -30.5662. 
INFO  17:14:55,071 GaussianMixtureModel - Initializing model with 30 k-means iterations... 
INFO  17:14:55,082 VariantRecalibratorEngine - Finished iteration 0. 
INFO  17:14:55,095 VariantRecalibratorEngine - Convergence after 4 iterations! 
INFO  17:14:55,096 VariantRecalibratorEngine - Evaluating full set of 3860460 variants... 
INFO  17:15:02,071 GATKRunReport - Uploaded run statistics report to AWS S3 
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A USER ERROR has occurred (version 2.7-2-g6bda569): 
##### ERROR
##### ERROR This means that one or more arguments or inputs in your command are incorrect.
##### ERROR The error message below tells you what is the problem.
##### ERROR
##### ERROR If the problem is an invalid argument, please check the online documentation guide
##### ERROR (or rerun your command with --help) to view allowable command-line arguments for this tool.
##### ERROR
##### ERROR Visit our website and forum for extensive documentation and answers to 
##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
##### ERROR
##### ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself.
##### ERROR
##### ERROR MESSAGE: NaN LOD value assigned. Clustering with this few variants and these annotations is unsafe. Please consider raising the number of variants used to train the negative model (via --numBad 3000, for example).
##### ERROR ------------------------------------------------------------------------------------------

My command is :

java -jar -Xmx4g GenomeAnalysisTK-2.7-2-g6bda569/GenomeAnalysisTK.jar \
    -T VariantRecalibrator \
    -R human_g1k_v37.fasta \
    -input NA12878_snp.vcf \
    -resource:hapmap,known=false,training=true,truth=true,prior=15.0 hapmap_3.3.b37.sites.vcf \
    -resource:omni,known=false,training=true,truth=false,prior=12.0 1000G_omni2.5.b37.sites.vcf \
    -resource:dbsnp,known=true,training=false,truth=false,prior=6.0 dbsnp_132.b37.vcf \
    -an QD -an HaplotypeScore -an MQ -an MQRankSum \
    --maxGaussians 4 \
    -mode SNP \
    -recalFile NA12878_recal.vcf \
    -tranchesFile NA12878_tranches \
    -rscriptFile NA12878.plots.R

I was not using --maxGaussians 4 originally. An earlier error suggested it, so I added it, but I still get this error message. I also think that numBad is already deprecated. I don't understand why this error occurs. I am running GATK UnifiedGenotyper on a 1000 Genomes high-coverage BAM file and then using VQSR to filter the SNPs.
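
For anyone comparing runs, here is a minimal sketch of pulling the relevant numbers out of a VariantRecalibrator log. It assumes the log wording matches the lines quoted in this post; the function name `summarize_vqsr_log` and the regular expressions are my own and are not part of GATK. For this run it shows that the positive model was trained on 1024360 variants while the negative model used only 1000, which is what the "this few variants" part of the error message refers to.

```python
import re

# Patterns written against the log wording quoted in this post (an assumption:
# other GATK versions may phrase these lines differently).
POSITIVE = re.compile(r"Training with (\d+) variants after standard deviation thresholding")
NEGATIVE = re.compile(r"Additionally training with worst (\d+) scoring variants")
ERROR = re.compile(r"##### ERROR MESSAGE: (.*)")


def summarize_vqsr_log(text):
    """Return the training-set sizes and the error message found in a log, if any."""
    pos = POSITIVE.search(text)
    neg = NEGATIVE.search(text)
    err = ERROR.search(text)
    return {
        "positive_training": int(pos.group(1)) if pos else None,
        "negative_training": int(neg.group(1)) if neg else None,
        "error": err.group(1).strip() if err else None,
    }


# Three lines copied from the log in this post (error message abridged).
sample_log = """\
INFO  17:11:37,973 VariantDataManager - Training with 1024360 variants after standard deviation thresholding.
INFO  17:14:55,071 VariantDataManager - Additionally training with worst 1000 scoring variants --> 1000 variants with LOD <= -30.5662.
##### ERROR MESSAGE: NaN LOD value assigned. Clustering with this few variants and these annotations is unsafe.
"""

summary = summarize_vqsr_log(sample_log)
print(summary)
```

To use it on a full run, read the saved log file into a string and pass it to `summarize_vqsr_log` in place of `sample_log`.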

