Latest posts

Workshop presentations -- 2015 BroadE 3/19 Mar 19

The slides from the 2015 BroadE GATK Best Practices workshop presentations are accessible at this Dropbox link.

The video recordings of the workshop talks will be online in a few weeks. We'll post links to the videos (along with copies of the corresponding slides) in the Presentations section of the Guide. They will also be available on the Broad's YouTube and iTunesU channels.

Consolidating GATK + Picard tools support Mar 9

Here's some good news for anyone who has been using both GATK and the Picard tools in their work -- which means all of you, since you all follow our Best Practices to a tee, right?

As you may know, both toolkits are developed here at the Broad Institute, and are deployed together in the Broad's analysis pipelines. The fact that they have been developed, released and supported separately so far is more an accident of history and internal organization than anything else (and we know it's inconvenient to y'all).

The good news is that we're taking steps to consolidate these efforts, which we believe will benefit everyone. In that spirit, we have been working closely with the Picard tools development team, and we're now ready to take the first step of consolidating support for the tools. From now on, you will be able to ask us questions about the Picard tools, and report bugs, in the GATK forum. And developers will be happy to hear that we are also committed to supporting HTSJDK through the GitHub repo’s Issues tracker.

In the near future, we will also start hosting downloads and documentation for the Picard tools on the GATK website. And before you ask, yes, the Picard tools will continue to be completely open-source and available to all free of charge.

To recap, we have brought the GATK and Picard teams together, and we are working on bringing together in the same place all the methods and tools to perform genome analysis. Our goal is to make a world where you can run our complete Best Practices pipeline end-to-end with a single Broad toolkit. We think it’ll make your life easier, because it sure is making ours easier.

Registration open for GATK workshop (March 19-20) Feb 23

Registration is now LIVE for our upcoming BroadE Workshop: Best Practices for Variant Calling with the GATK.

WHEN: Thursday, March 19 & Friday, March 20, 2015

10:00 AM - 5:00 PM (Lecture, March 19)
2:00 PM - 5:00 PM (Optional Tutorial, March 20)

WHERE: Broad Institute

Auditorium (lecture)/Yellowstone (Tutorial)
415 Main Street
Cambridge, Massachusetts 02142

Registration Schedule:

Registration closes February 27 at 5:00 PM.
Notification of acceptance or wait list status sent by March 4.



Snowpocalypse, Part Eleventy Feb 11

In case you were wondering why responses on the forum have been slow... We've been dealing with this crap.

Fig. 1: GATK support technician reassigned to fire hydrant clearing duty, monitoring the forum while waiting enthusiastically for the next round.

Remember kids, keeping fire hydrants clear of snow saves lives. Also, the plow truck is both ally and enemy in this fight. Who do you think keeps piling snow on the poor defenseless hydrant at the end of the street? Hate the plow truck. Love the plow truck. I'm so confused.

GATK in the UK : April 20-24 workshops Feb 11

We're going to be doing two back-to-back workshops in Edinburgh and Cambridge (the original, accept no substitutes) later this spring, on April 20-21 and 23-24 respectively. The program for both will be our typical one-day Best Practices lecture marathon, followed by a half-day of lectures on supplemental topics (QC, non-humans, etc.) and a half-day of hands-on sessions for beginners to get their hands dirty with some real data.

These workshops are organized locally by the inviting institutions, University of Edinburgh and University of Cambridge, so please see the respective links for registration information.

Cheers to our hosts and we hope to see lots of you there!

Bonus points to whoever gets the title reference -- and sings it in the correct tune.

Surviving the Snowpocalypse Jan 27

So you may have heard the US Northeast is getting a little bit of snow. Here's what it looked like this morning at GATK Support HQ, during a relative lull that allowed me to do a first round of clearing:

Not too bad so far, but it looks like it's going to get worse before it gets better. Round two is going to suck.

Anyway, the Broad is shut down for the duration of the state of emergency, and we are all at home waiting out the snowpocalypse. The GATK forum will be mostly unattended while we hunker down and sip hot cocoa with marshmallows. Assuming the power stays on and we're able to dig ourselves out of the snow when it's all over, normal service should resume by end of day Wednesday.

Happy New Year 2015 and welcome back to the GATK forum Jan 5

It's a shiny New Year and the forum, like the rest of Broad, is back to active status, so bring it on! It might take us a day or two to mop up the questions that came in during the break so we appreciate your patience as always (although thanks to superuser @pdexheimer there's a bunch that are already resolved, yay).

In the next few days we'll hopefully have some hot new announcements for you, so please keep an eye on this space.

Happy holidays and see you next year! Dec 23

Despite the conspicuous lack of snow in the Boston area, my calendar insists that it's time for our annual winter/holiday break, which sees (almost) the entire Broad shut down from December 24th to January 2nd (inclusive). So, starting tonight (5 pm EST), the forum will be unattended until we resume normal service on Monday, January 5th of the new year, 2015.

It's been a busy end of year and we didn't get to quite the end of our service queue, so I apologize to those of you who may still have pending questions or requests. The number of people using our tools -- and the diversity of applications they're being used for (hello bizarre non-model organisms) -- just keep increasing, which is a source of great joy for us but also of significant challenges. We've got some changes coming that should help us meet that ever-growing demand more efficiently in future. In the meantime though, we're really grateful for the patience and trust that you have all been showing us as we figure out our way through this fairly new territory.

We've also got some big changes coming in terms of the GATK itself, so please stay tuned for an announcement (or series of announcements) early in the new year. It's hard not to spill the beans so I'm not even going to drop any hints, but I can tell you it's going to be really exciting ;)

With that, I'll leave you to your holiday preparations, because you couldn't possibly still be working, right? Why are you even reading this? Shoo, go away! Go make a snowman, play with your kids, call your folks to say hi, or something like that. We'll see you next year!

P.S. If you live in a part of the world where it's not a holiday season, um, sorry? But hey, the "go play with your kids/ call your folks to say hi" exhortation is applicable at any time of year, really.

Best Practices VQSR parameters updated Dec 16

The Best Practices recommendations for Variant Quality Score Recalibration have been slightly updated to use the new(ish) StrandOddsRatio (SOR) annotation (only available in GATK version 3.3-0 and above), which complements FisherStrand (FS) as an indicator of strand bias.

While we were at it we also reconciled some inconsistencies between the tutorial and the FAQ document. As a reminder, if you ever find differences between parameters given in the VQSR docs, let us know, but FYI, the FAQ is the ultimate source of truth=true. Note also that the command line example given in the VariantRecalibrator tool doc tends to be out of date because it can only be updated with the next release (due to a limitation of the tool doc generation system) and, well, we often forget to do it in time. So it should never be used as a reference for Best Practice parameter values, as indicated in the caveat right underneath it, which no one ever reads.

Speaking of caveats, there's no such thing as too much repetition of the fact that whole genomes and exomes have subtle differences that require some tweaks to your command lines. In the case of VQSR, that means dropping Coverage (DP) from your VQSR command lines if you're working with exomes.

Finally, keep in mind that the values we recommend for tranches are really just examples; if there's one setting you should freely experiment with, that's the one. You can specify as many tranche cuts as you want to get really fine resolution.
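To make the last two points concrete (dropping DP for exomes, and adding as many tranche cuts as you like), here is a minimal Python sketch that assembles just the `-an` and `-tranche` portion of a VariantRecalibrator command line. The helper name `vqsr_args`, the annotation list and the tranche values are assumptions made up for illustration, not Best Practices values; the FAQ remains the reference for the actual recommendations. The rest of the command (reference, inputs, resources, outputs) is deliberately left out.

```python
# Illustrative annotation list only (an assumption for this sketch);
# see the FAQ for the currently recommended annotations.
EXAMPLE_ANNOTATIONS = ["QD", "FS", "SOR", "DP"]


def vqsr_args(is_exome, tranches, annotations=EXAMPLE_ANNOTATIONS):
    """Build the -an / -tranche part of a VariantRecalibrator command line.

    Hypothetical helper: it only formats arguments, it does not run GATK.
    """
    args = []
    for name in annotations:
        # Drop Coverage (DP) when working with exomes.
        if is_exome and name == "DP":
            continue
        args += ["-an", name]
    # One -tranche argument per requested cut; add as many as you like.
    for cut in tranches:
        args += ["-tranche", str(cut)]
    return args


if __name__ == "__main__":
    # Example tranche values are placeholders, not recommendations.
    print(" ".join(vqsr_args(is_exome=True, tranches=[100.0, 99.9, 99.0, 90.0])))
```

Calling the helper with `is_exome=False` keeps DP in the list, which matches the whole-genome case described above.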

Let's play GATK Survivor: what tools will get voted off the island? Dec 6

Warning: the following content may shock or distress our more sensitive users, as we discuss the cold-blooded elimination of some tools from the GATK.

Alright, now that I've got your attention (hopefully -- if not, what does it take?), here's the deal. We have got to a point where the GATK is a widely, even massively used toolkit (thanks to you, dear users). And it's pretty darn robust -- it's what the Broad's Genomics Platform uses in production to churn out exomes like there's no tomorrow. But it has technical limitations that are 1) a frequent source of pain on your end and 2) increasingly hampering development of new methods on our end.

The good news is that we have a plan for addressing (read: blasting away) these limitations. But part of this plan will involve streamlining GATK by getting rid of tools that are not useful or are inferior to alternative tools from other packages that we're not trying to compete with (e.g. Picard tools).

Some tools that are safe from elimination: all the tools used in the Best Practices, and a couple of utilities that we use a lot ourselves. But everything else is up for review -- and that's where you come in: we need your input to decide what to keep, what to throw away, and what to consider rewriting from scratch (yep, this is an option).

This link will take you to a SurveyMonkey page that lists the tools currently on the chopping block:

Act now to save your favorite non-BP tools! Or help us get rid of the crud. Whichever way you want to look at it, we appreciate your feedback!
