GATK / UnifiedGenotyper -dcov parameter values
Posted in Ask the team | Last updated on 2012-11-19 20:07:33



I ran the same sample through a GATK-based pipeline twice and got different variant calls, and I am trying to understand why. My samples come from a MiSeq run with a capture kit, so downsampling could be one explanation: the variant is called in one run but not in the other, even though it is present at 32% when I look at the .bam files.

As I understand it, the UnifiedGenotyper randomly downsamples my dataset to 250, so I experimented with the -dcov parameter:

  • Same sample run twice: the 1st run reports the variant; the 2nd run doesn't.
  • -dcov raised to 1000: neither run reports the variant.
  • -dcov raised to 10,000: the 1st run reports the variant again; the 2nd run doesn't.
  • -dt set to NONE: both runs call the variant.

But setting -dt to NONE could be computationally expensive for a large sample set. Is there an identifiable reason why this is happening?
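
To show what I mean by random downsampling, here is a toy Python sketch. It is not GATK's actual downsampling code, only an illustration of the idea. The function name, the pileup depth of 5000 and the seeds are all made up; only the 32% allele fraction and the 250 / 1000 caps come from my runs. It keeps a random subset of reads at one site and reports the alt-allele fraction that remains.

```python
import random

def downsample_alt_fraction(depth, alt_fraction, dcov, seed):
    """Keep at most `dcov` randomly chosen reads from a toy pileup and
    return the alt-allele fraction among the reads that were kept.

    Hypothetical helper for illustration only; this is NOT how GATK
    implements -dcov internally.
    """
    n_alt = round(depth * alt_fraction)
    pileup = ["alt"] * n_alt + ["ref"] * (depth - n_alt)
    if dcov is None or dcov >= depth:
        kept = pileup  # no downsampling: every read is used
    else:
        # a different seed stands in for a different run of the pipeline
        kept = random.Random(seed).sample(pileup, dcov)
    return kept.count("alt") / len(kept)

# Assumed depth of 5000 reads at the site (not a measured value).
for dcov in (250, 1000, None):
    fractions = [downsample_alt_fraction(5000, 0.32, dcov, seed) for seed in (1, 2)]
    print(dcov, fractions)
```

With a cap in place, the two seeds generally keep different reads and so see slightly different allele fractions. With no cap (the analogue of -dt NONE), every run sees exactly the same reads. This matches my observation that only -dt NONE gave consistent calls, although by itself it does not explain why a 32% variant would be dropped entirely.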

Curious..!

