I'm struggling with some statistics given in the VCF file: the rank sum tests. I started googling around, but that turned out not to be helpful for understanding them (in my case). I really have no idea how to interpret the VCF statistic values coming from the rank sum tests. I have no clue whether a negative, positive, or near-zero value is good or bad. Therefore I'm asking for some help here. Maybe someone knows a good tutorial page or can give me a hint to better understand the values of MQRankSum, ReadPosRankSum and BaseQRankSum. I have the same problem with the FisherStrand statistic. Many, many thanks in advance.
In some of my calls, BaseQRankSum is reported in the INFO field, whereas in others it isn't. I'm wondering what makes the UnifiedGenotyper report this for some SNPs but not for others. Also, the documentation says this test cannot be done on homozygous sites: "bases of the alternate allele). Note that the base quality rank sum test can not be calculated for homozygous sites."
But this value is present at sites where the GT field is 1/1, which I understand to be homozygous calls, e.g.:
BaseQRankSum not called: chr1 5503270 . G A 1023.62 . AC=2;AF=1.00;AN=2;DP=29;Dels=0.00;FS=0.000;HRun=0;HaplotypeScore=0.0000;MQ=56.06;MQ0=0;QD=35.30;SB=-375.13 GT:AD:DP:GQ:PL 1/1:0,29:29:87.24:1057,87,0
BaseQRankSum called: chr1 6577241 . G A 1359.86 . AC=2;AF=1.00;AN=2;BaseQRankSum=2.312;DP=43;Dels=0.00;FS=0.000;HRun=1;HaplotypeScore=0.0000;MQ=59.13;MQ0=0;MQRankSum=1.692;QD=31.62;ReadPosRankSum=0.787;SB=-641.06 GT:AD:DP:GQ:PL 1/1:3,40:43:34.37:1393,34,0
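To compare the two records side by side, here is a minimal Python sketch that pulls out the per-allele depth (AD) and checks which rank sum annotations are present. The helper name `parse_record` and the assumption of single-sample, whitespace-separated lines are mine, not anything from GATK. It only shows that the record without BaseQRankSum has no reference-supporting reads (AD=0,29), while the one with it has a few (AD=3,40). Whether that is what decides if the annotation is emitted is my guess, not something I have confirmed.

```python
# The two records from above, copied verbatim (whitespace-separated).
RECORDS = [
    "chr1 5503270 . G A 1023.62 . AC=2;AF=1.00;AN=2;DP=29;Dels=0.00;FS=0.000;"
    "HRun=0;HaplotypeScore=0.0000;MQ=56.06;MQ0=0;QD=35.30;SB=-375.13 "
    "GT:AD:DP:GQ:PL 1/1:0,29:29:87.24:1057,87,0",
    "chr1 6577241 . G A 1359.86 . AC=2;AF=1.00;AN=2;BaseQRankSum=2.312;DP=43;"
    "Dels=0.00;FS=0.000;HRun=1;HaplotypeScore=0.0000;MQ=59.13;MQ0=0;"
    "MQRankSum=1.692;QD=31.62;ReadPosRankSum=0.787;SB=-641.06 "
    "GT:AD:DP:GQ:PL 1/1:3,40:43:34.37:1393,34,0",
]

RANK_SUM_KEYS = ("BaseQRankSum", "MQRankSum", "ReadPosRankSum")


def parse_record(line):
    """Hypothetical helper: summarize one single-sample VCF data line."""
    fields = line.split()
    chrom, pos = fields[0], int(fields[1])
    info, fmt, sample = fields[7], fields[8], fields[9]

    # INFO is a ';'-separated list of KEY=VALUE pairs (flags have no '=').
    info_map = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        info_map[key] = value

    # FORMAT names the ':'-separated sample fields, in order.
    sample_map = dict(zip(fmt.split(":"), sample.split(":")))
    ref_depth, alt_depth = (int(x) for x in sample_map["AD"].split(","))

    return {
        "site": f"{chrom}:{pos}",
        "genotype": sample_map["GT"],
        "ref_depth": ref_depth,
        "alt_depth": alt_depth,
        "rank_sums": {k: float(info_map[k]) for k in RANK_SUM_KEYS if k in info_map},
    }


summaries = [parse_record(r) for r in RECORDS]
for s in summaries:
    print(s["site"], s["genotype"], s["ref_depth"], s["alt_depth"], s["rank_sums"])
```

Both sites are genotyped 1/1, so the genotype alone does not seem to explain the difference.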
Can someone help interpret this?
Many thanks, Ken