Hello, We are using some custom-made and predefined HaloPlex kits, and I was wondering how the best practices for variant detection should be adapted. One of the biggest challenges we face is that we cannot run VQSR because too few variants are detected. So we have to use a hard-filtering step instead, but here again the nature of the reads, all produced by restriction-enzyme digestion, makes some filters inappropriate. ReadPosRankSumTest is one example, since the read positions are not randomly distributed. Does the community have any experience with this kind of data, and how should the hard filtering be done? Thanks for your help. yvan
Dear developers, I have searched the forum for similar questions but couldn't find any. In several cases I get homozygous calls at positions where ~50% of the reads (or fewer) support the mutation. Here is an example: a position validated by Sanger sequencing as heterozygous, with high coverage (~400 reads in total). These are the results from UnifiedGenotyper:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT L1
chr4 998101 . C T 7296.77 . AC=2;AF=1.00;AN=2;BaseQRankSum=16.344;DP=411;Dels=0.00;FS=0.000;HaplotypeScore=11.8258;MLEAC=2;MLEAF=1.00;MQ=59.94;MQ0=0;MQRankSum=-1.436;QD=17.75;ReadPosRankSum=-0.062 GT:AD:DP:GQ:PL 1/1:203,208:411:99:7325,581,0
Can you help me with this case? I am really puzzled.
I have run it with v2.2, 2.4 and 2.5 and always got the same genotype call (the excerpt here is from 2.5, which I downloaded just to check that this was not caused by a bug that had already been fixed). It is not a downsampling issue, since I have high-coverage samples (HaloPlex) and used a higher dcov than the default.
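To make the discrepancy in the record above concrete, here is a minimal Python sketch that reads the sample column, computes the alternate-allele fraction from AD, and compares it with the genotype favoured by the PL values. The helper names (`parse_sample`, `allele_balance`, `looks_discordant`) and the 0.3 to 0.7 "heterozygous-like" window are assumptions made for illustration only; they are not part of GATK or of any recommended filter.

```python
# Sketch only: helper names and the 0.3-0.7 window are illustrative assumptions.

def parse_sample(format_field, sample_field):
    """Pair the FORMAT keys with the values of one sample column."""
    return dict(zip(format_field.split(":"), sample_field.split(":")))


def allele_balance(ad):
    """Fraction of reads supporting the ALT allele, from an AD value like '203,208'."""
    ref, alt = (int(x) for x in ad.split(",")[:2])
    total = ref + alt
    return alt / total if total else float("nan")


def looks_discordant(gt, balance, low=0.3, high=0.7):
    """Flag a homozygous-alt call whose allele balance looks heterozygous."""
    return gt == "1/1" and low <= balance <= high


# The record from the post, split on whitespace (INFO contains no spaces).
record = (
    "chr4 998101 . C T 7296.77 . "
    "AC=2;AF=1.00;AN=2;BaseQRankSum=16.344;DP=411;Dels=0.00;FS=0.000;"
    "HaplotypeScore=11.8258;MLEAC=2;MLEAF=1.00;MQ=59.94;MQ0=0;"
    "MQRankSum=-1.436;QD=17.75;ReadPosRankSum=-0.062 "
    "GT:AD:DP:GQ:PL 1/1:203,208:411:99:7325,581,0"
)
fields = record.split()
sample = parse_sample(fields[8], fields[9])

balance = allele_balance(sample["AD"])
pls = [int(x) for x in sample["PL"].split(",")]

# For a biallelic site the PL order is 0/0, 0/1, 1/1; the lowest PL is the most likely.
genotypes = ["0/0", "0/1", "1/1"]
most_likely = genotypes[pls.index(min(pls))]
discordant = looks_discordant(sample["GT"], balance)

print(f"ALT fraction: {balance:.3f}")
print(f"PL values: {pls}, most likely genotype: {most_likely}")
print(f"Hom-alt call with het-like balance: {discordant}")
```

On this record the read counts (203 reference, 208 alternate) give an ALT fraction of about 0.51, while the PL values still favour 1/1, which is exactly the conflict described in the post.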