Tagged with #versions



Short answer: NO.

Medium answer: no, at least not if you want to run a low-risk pipeline.

Long answer: see below for details.


The rationale

There are several reasons why you might want to mix tools from different versions of GATK in the same analysis. Maybe you're using the latest version of GATK and one of the tools has a show-stopping bug, so you'd like to use an older, pre-bug version of that tool but still use the latest version of all the other tools. Or maybe you've been using an older version of GATK and you'd like to use a new tool, but keep running the rest in the version you've already used to process hundreds of samples.

The problem: compatibility is not guaranteed

In many cases, when we modify one tool in the GATK, we need to make adjustments to other tools that interact either directly or indirectly with the data consumed or produced by the upgraded tool. If you mix and match tools from different versions of GATK, you risk running into compatibility issues. For example, HaplotypeCaller (HC) expects a BAM compressed by ReduceReads (RR) to have its data annotated in a certain way, namely the way the RR from the same GATK version does it. If the information is formatted differently from what the HC expects, it can blow up -- or worse, do the wrong thing but not tell you there's a problem.

But what if the tools/tasks are in unrelated workflows?

Would it really be so bad to use CountReads from GATK version 2.7 for a quick QC check that's not actually part of my pipeline, which uses version 2.5? Well, maaaaybe not, but we still think it's a source of error, and we do our damnedest to eliminate those.

The conclusion

You shouldn't use tools from different versions within the same workflow; that's for sure. We don't think it's worth the risk. If there's a show-stopping bug, let us know and we promise to fix it as soon as (humanly) possible. For the rest, either accept that you're stuck with the version you started your study with (we may be able to help with workarounds for known issues), or upgrade your entire workflow and start your analysis from scratch. Depending on how far along you are, one of those options will be less painful to you; go with that.
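If you want a cheap safeguard against accidental mixing, here is a minimal sketch in Python. It assumes you keep your own record of which GATK version ran each step; the `pipeline` dictionary and the `find_version_conflicts` helper are hypothetical names made up for illustration, not part of GATK, and the version numbers are just the ones used as examples above.

```python
from collections import defaultdict


def find_version_conflicts(step_versions):
    """Group pipeline steps by GATK version.

    Returns an empty dict when every step uses the same version,
    otherwise a dict mapping each version to the sorted list of
    steps that were run with it.
    """
    by_version = defaultdict(list)
    for step, version in step_versions.items():
        by_version[version].append(step)
    if len(by_version) <= 1:
        return {}
    return {version: sorted(steps) for version, steps in by_version.items()}


# Hypothetical record of which GATK version ran each step.
pipeline = {
    "ReduceReads": "2.5",
    "HaplotypeCaller": "2.5",
    "CountReads": "2.7",
}

conflicts = find_version_conflicts(pipeline)
if conflicts:
    print("Mixed GATK versions detected:")
    for version, steps in sorted(conflicts.items()):
        print(f"  GATK {version}: {', '.join(steps)}")
```

A check like this only tells you that versions were mixed; it cannot tell you whether a particular combination is safe.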

The plea bargain, and a warning

If, despite our dire warnings, you're still going to mix and match tool versions, fine, we can't stop you. But be really careful, and check the release notes of every single version involved. And keep in mind that when things go wrong, we will deny you support if we think you've been reckless.


Our partners at Appistry are doing another free live webinar tomorrow (Nov 7), this time focusing on the differences between versions of GATK. The ultimate point is, of course, to convince any stragglers to upgrade to the latest and greatest major version series (GATK 3.x for those of you following at home) but unlike us, they will actually take the time to explain in detail why this is a winning proposition.

If you have any questions about the differences between versions, be sure to register today and tune in tomorrow, Thursday 7 November at 12 pm EDT. Anyone can join (not just Appistry customers) so don't miss this great learning opportunity. Every question will be answered, either during the live event or by email afterward if they run out of time.
