Please be advised that the forum will be unattended for the duration of the Thanksgiving holiday, 26-29 Nov. During that time, we'll only answer questions if we need an excuse to escape from in-laws or screaming babies (which is worse is a matter of some debate).
Regular service will resume Monday 30 Nov.
Thanksgiving message below the fold.
The last GATK 3.x release of the year 2015 has arrived!
The major feature in GATK 3.5 is the eagerly awaited MuTect2 (beta version), which brings somatic SNP and Indel calling to GATK. This is just the beginning of GATK’s scope expansion into the somatic variant domain, so expect some exciting news about copy number variation in the next few weeks! Meanwhile, more on MuTect2 awesomeness below.
In addition, we’ve got all sorts of variant context annotation-related treats for you in the 3.5 goodie bag -- both new annotations and new capabilities for existing annotations, listed below.
In the variant manipulation space, we enhanced or fixed functionality in several tools including LeftAlignAndTrimVariants, FastaAlternateReferenceMaker and VariantEval modules. And in the variant calling/genotyping space, we’ve made some performance improvements across the board to HaplotypeCaller and GenotypeGVCFs (mostly by cutting out crud and making the code more efficient), including a few improvements specifically for haploids. Read the detailed release notes for more on these changes. Note that GenotypeGVCFs will now emit no-calls at sites where RGQ=0, in acknowledgment of the fact that those sites are essentially uncallable.
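To make the RGQ=0 behavior concrete, here is a minimal Python sketch of the rule applied to one sample column of a VCF record. It is only an illustration, not GATK's actual code: the function name `apply_rgq_no_call` is hypothetical, and it assumes the FORMAT keys and sample values have already been split into lists of strings.

```python
def apply_rgq_no_call(format_keys, sample_values):
    """Return the sample values with GT replaced by a no-call when RGQ is 0.

    Illustrative only: a hypothetical helper, not part of GATK.
    format_keys   -- e.g. ["GT", "DP", "RGQ"] (the VCF FORMAT column, split on ':')
    sample_values -- e.g. ["0/0", "12", "0"]  (one sample column, split on ':')
    """
    fields = dict(zip(format_keys, sample_values))

    if fields.get("RGQ") == "0" and "GT" in fields:
        gt = fields["GT"]
        # Preserve the phasing separator and the ploidy of the original call,
        # so a haploid "0" becomes "." and a diploid "0/0" becomes "./.".
        sep = "|" if "|" in gt else "/"
        ploidy = len(gt.replace("|", "/").split("/"))
        fields["GT"] = sep.join(["."] * ploidy)

    # Return values in the original FORMAT order.
    return [fields[key] for key in format_keys]


# Diploid site with RGQ=0: the genotype becomes a no-call.
print(apply_rgq_no_call(["GT", "DP", "RGQ"], ["0/0", "12", "0"]))
# Site with non-zero RGQ: left untouched.
print(apply_rgq_no_call(["GT", "DP", "RGQ"], ["0/0", "30", "45"]))
```

The point is simply that a reference genotype with zero confidence carries no information, so downstream filters should treat it the same as any other no-call.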
We’ve got good news for you if you’re the type who worries about disk space (whether by temperament or by necessity): we finally have CRAM support -- and some recommendations for keeping the output of BQSR down to reasonable file sizes, detailed below.
Finally, be sure to check out the detailed release notes for the usual variety show of minor features (including a new Queue job runner that enables local parallelism), bug fixes and deprecation notices (a few tools have been removed from the codebase, in the spirit of slimming down ahead of the holiday season).
GATK 3.5 was released on November 25, 2015. Itemized changes are listed below. For more details, see the user-friendly version highlights.
The slide decks presented on Day 1 of the BroadE GATK workshop on November 18 are now available at this Google Drive link:
The workshop handout document (agenda and talk abstracts) is available at this link:
We are scheduled to do 2 hands-on modules at the BroadE GATK workshop at the Broad Institute this Thursday, Nov. 19.
The tutorial materials for each module include the following:
Attendees will receive a printout of the worksheets for the module for which they have registered. We will not provide printouts of the appendix documents.
If you are registered to attend, you must download the materials and follow the instructions before the workshop starts; otherwise you will not be able to follow along and your workshop experience will be unsatisfying. We certainly don't want that to happen, so be sure to do your homework as described below the fold. Be sure to correctly identify which workshop you are registered for!
As announced recently, we are doing a workshop at the Broad Institute in Cambridge this coming November 18-19 (description below the fold).
Registration for the workshop is now live here.
Registration will close Oct. 30 (this Friday) so don't wait to sign up!
Yay, it's workshop planning season again!
Another local BroadE workshop at the Broad Institute for our Boston Metro Area peeps (and anyone who cares to travel to us on short notice). This will be the last workshop of 2015 and the first to cover the upcoming GATK version 3.5 -- which we really have to release soon now! As usual, the topic will be GATK Best Practices, with all the latest updates, some fresh material including two somatic variant discovery talks, and two completely revamped hands-on tutorial sessions. Registration will open on Monday, October 26 and stay open for one week. We'll post a link to the registration site when it goes live.
More below the fold: our preliminary workshop lineup for the 2016 Winter & Spring season.
Geraldine Van der Auwera presented this talk as part of the Broad Institute's Medical and Population Genetics (MPG) Primers series.
This talk provides a high-level overview of the workflow for performing variant discovery on high-throughput sequencing data, as described in the GATK Best Practices and implemented in the Broad's production pipelines.
The following points are emphasized in this presentation:
The presentation slide deck is available at this link.
Great poster session this morning at ASHG; Sheila and I got a lot of good questions about the Best Practices, and the GVCF workflow in particular. Our punchline: "This is how ExAC got done". It's super effective!
Preview of the poster after the fold. You can get the full-sized PDF here.
Are you excited? We sure are. Especially in the secondary sense defined by dictionary.reference.com as "stimulated to activity; brisk:". We're presenting a poster on Thursday and a 90-minute workshop on Friday, but neither is ready yet. Good thing the weather this weekend is crappy; if we were missing out on proper New England fall foliage / leaf-peeping weather we'd be pretty cheesed off.
But we ain't afraid of no deadline -- we'll be ready. We developed a completely new workshop tutorial for the occasion, and we're going to have a big room full of people rocking GVCFs. It's going to be epic. The tutorial data bundle, sans worksheet (because that's the part that's not quite ready yet), is already available here (not a direct link to the data because we want you to read the part about the homework). It does have an appendix document with installation instructions and some context info about the tutorial objectives, which you must read through (and act on) before the workshop, if you're attending.
If you're at ASHG but you can't make it to the workshop, you can still come see our poster, which covers the same topic (the GVCF workflow part of the Best Practices), but flatter and less interactive. Although Sheila and I will be there to answer questions one on one, so in that sense it will be more interactive. Just with less keyboard action. So, Thursday 9 Oct between 12 and 1 pm at the Bioinformatics and Genomic Technology session in the Exhibit Hall, Level 1; Convention Center, poster #1664/T. Be there. We'll talk.
We'll also be around at the Broad Institute Genomic Services booth in the Exhibit Hall (booth #1720, right around the corner from Qiagen). Not sure yet when we'll be there, but send me a private message if you'd like to chat and we can figure out a time.
See you there!