UG/HC interval list running start
Posted in Ask the GATK team | Last updated on


Comments (1)

GATK Team:

This comment came up after some discussion with a colleague of mine:

"If you split the human genome into 100 pieces, we have to create overlapping regions so that GATK won't miss variants, but this creates a complicated situation where you may have to merge variants at the same locus."

Is it true that I would have to pad the intervals and explicitly resolve variants that get called at the same locus in two neighboring pieces? If I use -L target_intervals to specify non-overlapping intervals, does GATK get a "running start" (say, 50 bp upstream, to pick up variant context) before emitting variants, as samtools mpileup/bcftools claims to do? Or does GATK jump in directly at the start of each specified interval, and so possibly fail to call variants within a short stretch at the beginning of that interval?
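
To make the scenario concrete, here is a rough Python sketch of the workaround my colleague is describing: pad each piece, call on the padded piece, then keep only the calls that fall inside the original (unpadded) piece. All names here (pad_interval, keep_in_core, merge_calls), the 50 bp pad, and the toy variants are hypothetical and only illustrate the bookkeeping; this says nothing about what GATK actually does internally.

```python
from typing import List, Tuple

Interval = Tuple[str, int, int]      # (contig, start, end), 1-based inclusive
Variant = Tuple[str, int, str, str]  # (contig, pos, ref, alt)


def pad_interval(iv: Interval, pad: int, contig_len: int) -> Interval:
    """Extend an interval by `pad` bases on each side, clipped to the contig."""
    contig, start, end = iv
    return (contig, max(1, start - pad), min(contig_len, end + pad))


def keep_in_core(variants: List[Variant], core: Interval) -> List[Variant]:
    """Drop calls that fall in the padding, i.e. outside the unpadded piece."""
    contig, start, end = core
    return [v for v in variants if v[0] == contig and start <= v[1] <= end]


def merge_calls(per_piece: List[Tuple[Interval, List[Variant]]]) -> List[Variant]:
    """Combine per-piece calls, keeping each distinct variant once."""
    seen = set()
    merged = []
    for core, variants in per_piece:
        for v in keep_in_core(variants, core):
            if v not in seen:  # safety net; cores do not overlap
                seen.add(v)
                merged.append(v)
    # Sort by contig name (plain string order) and then position.
    return sorted(merged, key=lambda v: (v[0], v[1]))


# Two adjacent, non-overlapping pieces of a toy 2000 bp contig.
cores = [("chr1", 1, 1000), ("chr1", 1001, 2000)]
padded = [pad_interval(iv, 50, 2000) for iv in cores]

# Made-up calls from running a caller on each padded piece.
calls_1 = [("chr1", 990, "A", "G"), ("chr1", 1020, "C", "T")]
calls_2 = [("chr1", 990, "A", "G"), ("chr1", 1020, "C", "T"),
           ("chr1", 1500, "G", "A")]

merged = merge_calls([(cores[0], calls_1), (cores[1], calls_2)])
```

Because the unpadded pieces do not overlap, every locus belongs to exactly one piece, so trimming each call set back to its own piece already removes the duplicates. What I want to know is whether any of this is necessary with -L in the first place.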

Your clarification would be much appreciated.

