VQSR
Posted in Ask the GATK team | Last updated on


Comments (5)

Hi,

I am working on dog genome and trying to use VQSR on my data.

Here is the command i have used:

java -Xmx4G -jar GenomeAnalysisTK.jar -R genome.fa -T VariantRecalibrator -input GATK-snp.vcf -resource:dbsnp,known=false,training=true,truth=true,prior=6.0 canFam3_SNP.vcf -mode SNP -recalFile output.recal -tranchesFile output.tranches -rscriptFile output.plots.R -an QD -an HaplotypeScore -an MQRankSum -an ReadPosRankSum -an FS -an MQ -an Inbreed

  1. I have only dbSNP file as training set and i have set the options, known=true,training=false,truth=false,prior=6.0 in the command line as per the documentation. But that doesn't work and instead suggested to use known=false,training=true,truth=true,prior=6.0. What is the prior =6.0 here? is there any threshold for prior?

2.The above command produces empty tranches and recal file.

3.Even though the files are empty i have proceeded to ApplyRecalibration with the below command:

java -Xmx4G -jar GenomeAnalysisTK.jar -R genome.fa -T ApplyRecalibration -input GATK-snp.vcf --ts_filter_level 99.0 -tranchesFile output.tranches -recalFile output.recal -mode SNP -o recalibrated.filtered.vcf.

It gives the error:

ERROR MESSAGE: Invalid command line: No tribble type was provided on the command line and the type of the file could not be determined dynamically. Please add an explicit type tag :NAME listing the correct type from among the supported types:

ERROR Name FeatureType Documentation
ERROR BCF2 VariantContext http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_variant_bcf2_BCF2Codec.html
ERROR VCF VariantContext http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_variant_vcf_VCFCodec.html
ERROR VCF3 VariantContext http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_variant_vcf_VCF3Codec.html
ERROR

Any help to fix these?


Return to top Comment on this article in the forum