ReduceReads removes reads that should not be removed
Posted in Ask the GATK team

I am having trouble making sense of the behaviour of ReduceReads in an example where I know that the call is real (it was Sanger validated). In the non-reduced pileup the variant is pretty solid:

11 47364249 G 12 aA..A.A,.A.^~.

The compressed pileup looks like:

11 47364249 G 2 .A

which I guess is OK, but the depth associated with the A call is 1 instead of the expected 5 (as evidenced when I run samtools view). Accordingly, after running the UnifiedGenotyper (UG), I get 7 reads supporting the reference and 1 supporting the alternative:

0/1:7,1:8:15:15,0,203

But really it should be 7/5. I am losing 4 precious reads, and that turns this variant into a missing call.
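To make the expected 7/5 split concrete, here is a minimal sketch that tallies reference and alternate bases from a pileup base string. It assumes the standard samtools pileup encoding (`.`/`,` for reference matches, `^` followed by a mapping-quality character at a read start, `$` at a read end, `+N`/`-N` for indels); the function name `tally_pileup` is hypothetical and is not part of GATK or samtools.

```python
import re
from collections import Counter


def tally_pileup(bases: str) -> Counter:
    """Count reference matches and alternate bases in a pileup base string.

    Hypothetical helper: assumes standard samtools pileup encoding.
    """
    counts = Counter()
    i = 0
    while i < len(bases):
        c = bases[i]
        if c == "^":
            # Read-start marker; the next character encodes mapping quality.
            i += 2
            continue
        if c == "$":
            # Read-end marker carries no base.
            i += 1
            continue
        if c in "+-":
            # Indel: sign, length digits, then that many inserted/deleted bases.
            m = re.match(r"\d+", bases[i + 1:])
            if m is None:
                i += 1
                continue
            i += 1 + len(m.group()) + int(m.group())
            continue
        if c in ".,":
            counts["ref"] += 1
        elif c == "*":
            counts["del"] += 1
        else:
            # Mismatch on either strand; fold case so 'a' and 'A' count together.
            counts[c.upper()] += 1
        i += 1
    return counts


full = tally_pileup("aA..A.A,.A.^~.")
reduced = tally_pileup(".A")
print(full["ref"], full["A"])        # 7 5
print(reduced["ref"], reduced["A"])  # 1 1
```

The non-reduced pileup gives 7 reference and 5 alternate bases at depth 12, which is the split I expect the genotyper to see. Note that this only counts characters in the pileup string; it does not read whatever per-position counts the reduced BAM stores for its synthetic reads.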

I really can't make sense of it. I tweaked all the options I could find to keep as many reads as possible when reducing:

-minqual 0 -minmap 0 --dont_hardclip_low_qual_tails -noclip_ad

I am using GenomeAnalysisTK-2.7-4, and I can upload the slice of the BAM file to illustrate the problem. I have also attached my debugging code (as a txt file) in case someone wants to check whether I am missing a key option.

Is that a bug? Or am I missing something obvious?
