Our very own Mauricio Carneiro (@Carneiro on the forum) recently gave an interview in which he explained the origins of the GATK and the motivations of the development team. You can watch it here: http://video.cmdagency.com/bio
We have a problem: we have a truckload of material sitting around waiting to be published, but no time to actually write the papers. So we're looking for someone who will help us convert this computational biology goldmine into cold hard Nature Biotech/Methods papers.
This is a great opportunity for an early-career, postdoc-level scientist who has experience publishing papers, has demonstrated writing ability, and is not afraid of wrangling complex technical material.
Make no mistake, we're not looking for a ghostwriter; this will involve intellectual contribution worth the authorship in high-profile publications. But the basic material is ready and waiting.
Here is the complete job description; feel free to ask questions in the comments or by private message to me (Geraldine):
This bug is not exactly new (it was fixed in GATK 3.0), but it has come to our attention that many people are unaware of it, so we want to spread the word since it may have affected people's results.
Affected versions: 2.x versions up to 2.8 (not sure when it started)
Affected tool: SelectVariants
Trigger conditions: Extracting a subset of samples with SelectVariants while using multi-threading (
Effects: Genotype-level fields (such as AD) swapped among samples
This bug no longer affects any tools in versions 3.0 and above, but callsets generated with earlier versions may need to be checked for consistency of genotype-level annotations. Our sincere apologies if you have been affected by this bug, and our thanks to the users who reported experiencing this issue.
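For anyone who wants to run such a consistency check, here is a minimal sketch rather than an official GATK utility. It assumes you still have the original multi-sample VCF that was given to SelectVariants, that both files are uncompressed text, and that the subset kept the same REF and ALT alleles at each site (sites whose alleles differ are simply skipped). The function names `read_genotype_field` and `find_swapped` are hypothetical helpers made up for this illustration.

```python
def read_genotype_field(vcf_lines, field="AD"):
    """Map (CHROM, POS, REF, ALT) -> {sample name: value of `field`}."""
    samples = []
    table = {}
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information headers
        cols = line.split("\t")
        if line.startswith("#CHROM"):
            samples = cols[9:]  # sample names follow the FORMAT column
            continue
        if len(cols) < 10:
            continue  # record without genotype columns
        keys = cols[8].split(":")
        if field not in keys:
            continue  # this record does not carry the requested field
        idx = keys.index(field)
        per_sample = {}
        for name, call in zip(samples, cols[9:]):
            parts = call.split(":")
            # Trailing fields may be omitted in a VCF; treat them as missing.
            per_sample[name] = parts[idx] if idx < len(parts) else "."
        table[(cols[0], cols[1], cols[3], cols[4])] = per_sample
    return table


def find_swapped(original_lines, subset_lines, field="AD"):
    """List (site, sample, original value, subset value) for every mismatch."""
    original = read_genotype_field(original_lines, field)
    subset = read_genotype_field(subset_lines, field)
    mismatches = []
    for site, calls in subset.items():
        if site not in original:
            continue  # site absent or alleles changed; cannot compare
        for sample, value in calls.items():
            expected = original[site].get(sample)
            if expected is not None and expected != value:
                mismatches.append((site, sample, expected, value))
    return mismatches


# Tiny in-memory demonstration with made-up records.
HEADER = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT"
ORIGINAL_VCF = [
    "##fileformat=VCFv4.1",
    HEADER + "\tA\tB\tC",
    "1\t100\t.\tA\tG\t50\tPASS\t.\tGT:AD\t0/1:10,5\t0/0:20,0\t1/1:0,18",
]
SUBSET_VCF = [
    "##fileformat=VCFv4.1",
    HEADER + "\tA\tC",
    # AD values of samples A and C are swapped here, mimicking the bug.
    "1\t100\t.\tA\tG\t50\tPASS\t.\tGT:AD\t0/1:0,18\t1/1:10,5",
]

for site, sample, expected, found in find_swapped(ORIGINAL_VCF, SUBSET_VCF):
    print(site, sample, "original:", expected, "subset:", found)
```

With real data, the same functions accept open file handles in place of the in-memory lists. An empty result only means that no mismatch was found for the chosen field at the sites that could be compared; it may be worth repeating the check for other genotype-level fields present in your files.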
The slides from today's webinar are available as of now in the GSA team Dropbox at this link, and will be on the documentation website shortly.
Our partners at Appistry are putting on another webinar next week, and this one's going to be pretty special in our view -- because we're going to be doing pretty much all the talking!
Titled "Speed, Cohorts, and RNAseq: An Insider Look into GATK 3" (see that link for the full program), this webinar will be all about the GATK 3 features, of course. And lest you think this is just another marketing pitch (no offense, marketing people), rest assured that we will be diving into the gory technical details of what happens under the hood. This is a great opportunity to get the inside scoop on how the new features (RNAseq, GVCF pipeline, etc.) work -- all the stuff that's fit to print, but that we haven't had time to write down in the docs yet. So don't miss it if that's the sort of thing that floats your boat! And if you do miss it, be sure to check out the recording afterward.
As usual the webinar is completely free and open to everyone (not just Appistry customers or prospective for-profit users). All you need to do is register now and tune in on Thursday 4/10.
Talk to you then!
I'm very happy to introduce Sheila Chandran, our newest GSA team member, who will be helping me with GATK outreach, support and documentation. You can expect to see Sheila start answering questions on the GATK forum within a week or two!
Thanks to Sheila's help, I'll be able to expand our support model to the Broad Cancer Tools (MuTect and related). Moving forward, we'll produce documentation for MuTect and other tools produced by the Cancer Group in collaboration with the developers, in order to bring you the same level of documentation coverage and support that we currently have for GATK.
Rest assured however that we won't stop working on improving the GATK documentation. In fact, Sheila's first project with us will be to document how the HaplotypeCaller works in detail -- something I know many of you have been hoping to see for a while now!
As a final note, I'd like to mention that this is one of many positive outcomes from our collaboration with our commercial licensing partner, Appistry, so we'd like to express our thanks to them for helping us help you, our user community.
This may seem crazy considering we released the big 3.0 version not two weeks ago, but yes, we have a new version for you already! It's a bit of a special case because this release is all about the hardware-based optimizations we had previously announced. What we hadn't announced yet was that this is the fruit of a new collaboration with a team at Intel (which you can read more about here), so we were waiting for everyone to be ready for the big reveal.
We're very excited to announce that we have started collaborating with a team from Intel (yep, that Intel) to optimize key parts of the GATK code to make it run faster. The first fruits of this collaboration --a set of hardware-based optimizations for the PairHMM algorithm in HaplotypeCaller-- are available as of today in version 3.1 of the GATK. Please see the release notes and version highlights in the Version History section of the Guide for details.
Of course this is only the beginning, and we're looking forward to delivering more performance improvements for various other GATK tools moving forward as part of this collaboration.
What's really cool is that this collaboration extends beyond our little GATK team; the Intel Bio Team is also going to be working with other groups at the Broad Institute to make their software run faster as well, all with the goal of accelerating scientific research and discovery.
For more details and background information, see the Bio-IT World story here: http://www.bio-itworld.com/2014/3/20/broad-intel-announce-speed-improvements-gatk-powered-by-intel-optimizations.html
Better late than never, here is the now-traditional "Highlights" document for GATK version 3.0, which was released two weeks ago. It will be a very short one since we've already gone over the new features in detail in separate articles --but it's worth having a recap of everything in one place. So here goes.
GATK 3.1 was released on March 18, 2014. Highlights are listed below. Read the detailed version history overview here: http://www.broadinstitute.org/gatk/guide/version-history