I have processed 10 whole-exome samples using the GATK Best Practices workflow (GATK v2.4-3-g2a7af43). I am now evaluating my variant call set (generated with HaplotypeCaller) against the OMNI 2.5 SNP array (comparison set) and dbSNP 137.
Here are two rows from the Ti/Tv Variant Evaluator table:
CompRod  EvalRod  Novelty  Sample  nTi    nTv    tiTvRatio  nTiInComp  nTvInComp  TiTvRatioStandard
OMNI     MyCalls  all      all     79945  30322  2.64       993588     274219     3.62
dbsnp    MyCalls  all      all     79945  30322  2.64       30214009   15253850   1.98
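As a sanity check on the table, both ratio columns appear to be plain transition/transversion count ratios (tiTvRatio = nTi / nTv, TiTvRatioStandard = nTiInComp / nTvInComp). The short Python sketch below recomputes them from the counts above; the function name `titv_ratio` and the dictionary layout are my own for illustration and are not part of GATK.

```python
def titv_ratio(n_ti: int, n_tv: int) -> float:
    """Return the transition/transversion ratio for the given counts."""
    if n_tv == 0:
        raise ValueError("transversion count must be non-zero")
    return n_ti / n_tv


# Counts copied from the two table rows above (hypothetical labels).
counts = {
    "MyCalls (eval)": (79945, 30322),
    "OMNI (comp)": (993588, 274219),
    "dbsnp (comp)": (30214009, 15253850),
}

for label, (n_ti, n_tv) in counts.items():
    # Rounded to two decimals, matching the precision shown in the table.
    print(f"{label}: Ti/Tv = {titv_ratio(n_ti, n_tv):.2f}")
```

The recomputed values (2.64, 3.62 and 1.98) agree with the table, so the reported ratios follow directly from the counts rather than from any additional normalization.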
According to the literature, the Ti/Tv ratio should be approximately 2.1 for whole-genome sequencing and 2.8 for whole-exome sequencing. Since I am getting a Ti/Tv of 2.64 for my exome calls, does this indicate false positives in the data? Also, what could explain such a high TiTvRatioStandard (3.62) for the OMNI comparison set, given that it is whole-genome data?