VQSR plots
Posted in Ask the GATK team | Last updated on 2014-02-26 15:35:08


Comments (14)

Hi, I ran VQSR on the VCF file generated by UnifiedGenotyper, and 63412 out of 86840 records were marked PASS. The file contains both SNPs and indels, because I ran UnifiedGenotyper with -glm BOTH. I have two questions.

1) The number of PASS SNPs differs depending on how I count them: first on the original UG output, and then after separating SNPs and indels into two files with an awk script.

grep -v "#" sample1_recalibrated_snps_PASS.vcf | grep -c "PASS"
63412
grep -v "#" sample1_merged_recalibrated_snps_raw_indels.vcf | grep -c "LowQual"
18725

Statistics for the separate SNP file (SNPs and indels were split with an awk script):

Everything else matches; the only problem is that the number of PASS SNPs differs. Any idea why?

grep -v "^#" sample1_snp.vcf | grep -c "PASS"
63402
grep -v "^#" sample1_snp.vcf | grep -c "LowQual"
18725
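
As a cross-check, the same counts could be taken from the FILTER column alone instead of grepping the whole line, since grep -c "PASS" also counts any record where PASS appears elsewhere on the line (for example inside INFO). This is only a minimal sketch, assuming a standard tab-delimited VCF with FILTER in column 7. The function names count_filter and count_filter_snps are made up for illustration, and "SNP" is assumed here to mean a single-base REF with single-base ALT alleles.

```shell
# Count records whose FILTER column (7th tab-separated field) equals a given value.
# Header lines (starting with '#') are skipped.
# Usage: count_filter PASS sample1_snp.vcf
count_filter() {
    # $1 = FILTER value (e.g. PASS or LowQual), $2 = path to the VCF
    awk -F'\t' -v f="$1" '!/^#/ && $7 == f { n++ } END { print n + 0 }' "$2"
}

# Same count, restricted to SNPs.
# Assumption: a SNP is a record whose REF and every ALT allele are one base long.
# Usage: count_filter_snps PASS sample1_recalibrated_snps_PASS.vcf
count_filter_snps() {
    awk -F'\t' -v f="$1" '
        /^#/    { next }            # skip header lines
        $7 != f { next }            # keep only the requested FILTER value
        {
            is_snp = (length($4) == 1)
            m = split($5, alts, ",")    # ALT may list several alleles
            for (i = 1; i <= m; i++)
                if (length(alts[i]) != 1) is_snp = 0
            if (is_snp) n++
        }
        END { print n + 0 }' "$2"
}
```

If the counts from these two functions agree between the merged file and the separated SNP file, the difference of 10 would come from how grep matches the line rather than from the records themselves.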

2) I ran VQSR on the SNPs generated by UnifiedGenotyper, and I have a question about the VQSR tranche plot for SNPs. In my case the tranche plot does not show any false positive calls (see the attached plot). Should I interpret this as meaning there are no false positives? That seems surprising.

When I tried to run VQSR on the indels (in the same file), it did not work. I had only 884 indels, which, according to the VQSR documentation and questions asked by other people, is too few.

