Latest posts
 

Registration open for GATK workshop (March 19-20) Feb 23

Registration is now LIVE for our upcoming BroadE Workshop: Best Practices for Variant Calling with the GATK.

WHEN: Thursday, March 19 & Friday, March 20, 2015

10:00 AM - 5:00 PM (Lecture, March 19)
2:00 PM - 5:00 PM (Optional Tutorial, March 20)

WHERE: Broad Institute

Auditorium (Lecture) / Yellowstone (Tutorial)
415 Main Street
Cambridge, Massachusetts 02142

Registration Schedule:

Registration closes February 27 at 5:00 PM.
Notification of acceptance or wait list status sent by March 4.

REGISTER HERE



Snowpocalypse, Part Eleventy Feb 11

In case you were wondering why responses on the forum have been slow... We've been dealing with this crap.

Fig. 1: GATK support technician reassigned to fire hydrant clearing duty, monitoring the forum while waiting enthusiastically for the next round.

Remember kids, keeping fire hydrants clear of snow saves lives. Also, the plow truck is both ally and enemy in this fight. Who do you think keeps piling snow on the poor defenseless hydrant at the end of the street? Hate the plow truck. Love the plow truck. I'm so confused.


GATK in the UK : April 20-24 workshops Feb 11

We're going to be doing two back-to-back workshops in Edinburgh and Cambridge (the original, accept no substitutes) later this spring, on April 20-21 and 23-24 respectively. The program for both will be our typical one-day Best Practices lecture marathon, followed by a half-day of lectures on supplemental topics (QC, non-humans, etc.) and a half-day hands-on session for beginners to get their hands dirty with some real data.

These workshops are organized locally by the inviting institutions, University of Edinburgh and University of Cambridge, so please see the respective links for registration information.

Cheers to our hosts and we hope to see lots of you there!

Bonus points to whoever gets the title reference -- and sings it in the correct tune.


Surviving the Snowpocalypse Jan 27

So you may have heard the US Northeast is getting a little bit of snow. Here's what it looked like this morning at GATK Support HQ, during a relative lull that allowed me to do a first round of clearing:

Not too bad so far, but it looks like it's going to get worse before it gets better. Round two is going to suck.

Anyway, the Broad is shut down for the duration of the state of emergency, and we are all at home waiting out the snowpocalypse. The GATK forum will be mostly unattended while we hunker down and sip hot cocoa with marshmallows. Assuming the power stays on and we're able to dig ourselves out of the snow when it's all over, normal service should resume by end of day Wednesday.


Happy New Year 2015 and welcome back to the GATK forum Jan 5

It's a shiny New Year and the forum, like the rest of Broad, is back to active status, so bring it on! It might take us a day or two to mop up the questions that came in during the break so we appreciate your patience as always (although thanks to superuser @pdexheimer there's a bunch that are already resolved, yay).

In the next few days we'll hopefully have some hot new announcements for you, so please keep an eye on this space.


Happy holidays and see you next year! Dec 23

Despite the conspicuous lack of snow in the Boston area, my calendar insists that it's time for our annual winter/holiday break, which sees (almost) the entire Broad shut down from December 24th to January 2nd (inclusive). So, starting tonight (5 pm EST), the forum will be unattended until we resume normal service on Monday, January 5th, 2015.

It's been a busy end of year and we didn't quite get to the end of our service queue, so I apologize to those of you who may still have pending questions or requests. The number of people using our tools -- and the diversity of applications they're being used for (hello bizarre non-model organisms) -- just keep increasing, which is a source of great joy for us but also of significant challenges. We've got some changes coming that should help us meet that ever-growing demand more efficiently in the future. In the meantime though, we're really grateful for the patience and trust that you have all been showing us as we figure out our way through this fairly new territory.

We've also got some big changes coming in terms of the GATK itself, so please stay tuned for an announcement (or series of announcements) early in the new year. It's hard not to spill the beans so I'm not even going to drop any hints, but I can tell you it's going to be really exciting ;)

With that, I'll leave you to your holiday preparations, because you couldn't possibly still be working, right? Why are you even reading this? Shoo, go away! Go make a snowman, play with your kids, call your folks to say hi, or something like that. We'll see you next year!

P.S. If you live in a part of the world where it's not a holiday season, um, sorry? But hey, the "go play with your kids/ call your folks to say hi" exhortation is applicable at any time of year, really.


Best Practices VQSR parameters updated Dec 16

The Best Practices recommendations for Variant Quality Score Recalibration have been slightly updated to use the new(ish) StrandOddsRatio (SOR) annotation, which is only available in GATK version 3.3-0 and above and complements FisherStrand (FS) as an indicator of strand bias.

While we were at it, we also reconciled some inconsistencies between the tutorial and the FAQ document. As a reminder, if you ever find differences between parameters given in the VQSR docs, let us know, but FYI, the FAQ is the ultimate source of truth=true. Note also that the command line example given in the VariantRecalibrator tool doc tends to be out of date because it can only be updated with the next release (due to a limitation of the tool doc generation system) and, well, we often forget to do it in time -- so it should never be used as a reference for Best Practice parameter values, as indicated in the caveat right underneath it, which no one ever reads.

Speaking of caveats, there's no such thing as too much repetition of the fact that whole genomes and exomes have subtle differences that require some tweaks to your command lines. In the case of VQSR, that means dropping Coverage (DP) from your VQSR command lines if you're working with exomes.
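
To make the exome tweak concrete, here's a minimal shell sketch that builds the -an arguments for a VariantRecalibrator command line and leaves out DP when the data type is an exome. The function name, the genome/exome labels and the exact annotation list are assumptions made for illustration only; the FAQ remains the reference for which annotations to actually use.

```shell
#!/usr/bin/env bash
# Sketch only: build the -an arguments for a VariantRecalibrator run.
# ASSUMPTION: the annotation list below is illustrative, not the official
# recommendation; check the FAQ for the current set.

vqsr_annotations() {
    local data_type="$1"   # "genome" or "exome" (hypothetical labels)
    local annotations=(QD MQ MQRankSum ReadPosRankSum FS SOR)

    # Coverage (DP) is only included for whole genomes; exomes skip it.
    if [ "$data_type" = "genome" ]; then
        annotations+=(DP)
    fi

    # Turn each annotation name into a "-an NAME" pair.
    local args=()
    local a
    for a in "${annotations[@]}"; do
        args+=("-an" "$a")
    done
    printf '%s\n' "${args[*]}"
}

vqsr_annotations genome
vqsr_annotations exome
```

The output of the function can then be spliced into the rest of your VariantRecalibrator command, so the genome and exome versions of a pipeline differ in exactly one place.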

Finally, keep in mind that the values we recommend for tranches are really just examples; if there's one setting you should freely experiment with, that's the one. You can specify as many tranche cuts as you want to get really fine resolution.
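
If you do want to experiment with tranche cuts, a small helper like the one sketched below can turn a list of sensitivity values into repeated -tranche arguments. The function name and the example values are hypothetical; they're placeholders to show the mechanics, not recommended settings.

```shell
#!/usr/bin/env bash
# Sketch only: expand a list of tranche cut values into -tranche arguments.
# ASSUMPTION: the values passed in below are arbitrary examples.

tranche_args() {
    local args=()
    local t
    # One "-tranche VALUE" pair per requested cut.
    for t in "$@"; do
        args+=("-tranche" "$t")
    done
    printf '%s\n' "${args[*]}"
}

# Example: five cuts, finer near the top of the range.
tranche_args 100.0 99.9 99.5 99.0 90.0
```

Adding or removing cuts is then just a matter of editing the list of values, which makes it easy to try a finer resolution without retyping the whole command.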


Let's play GATK Survivor: what tools will get voted off the island? Dec 6

Warning: the following content may shock or distress our more sensitive users, as we discuss the cold-blooded elimination of some tools from the GATK.

Alright, now that I've got your attention (hopefully -- if not, what does it take?), here's the deal. We have got to a point where the GATK is a widely, even massively used toolkit (thanks to you, dear users). And it's pretty darn robust -- it's what the Broad's Genomic Platform uses in production to churn out exomes like there's no tomorrow. But it has technical limitations that are 1) a frequent source of pain on your end and 2) increasingly hampering development of new methods on our end.

The good news is that we have a plan for addressing (read: blasting away) these limitations. But part of this plan will involve streamlining GATK by getting rid of tools that are not useful or are inferior to alternative tools from other packages that we're not trying to compete with (e.g. Picard tools).

Some tools that are safe from elimination: all the tools used in the Best Practices, and a couple of utilities that we use a lot ourselves. But everything else is up for review -- and that's where you come in: we need your input to decide what to keep, what to throw away, and what to consider rewriting from scratch (yep, this is an option).

This link will take you to a SurveyMonkey page that lists the tools currently on the chopping block:

https://www.surveymonkey.com/s/RN8DR3D

Act now to save your favorite non-BP tools! Or help us get rid of the crud. Whichever way you want to look at it, we appreciate your feedback!


New job opening in cancer analysis (part-time) Nov 24

We're advertising this job on behalf of our colleagues in the cancer analysis team. See the overview below the fold. If you're interested, please message me (@Geraldine_VdAuwera) or apply on the Broad's careers page (search for job requisition number 1591).

Please note that this job requires on-site presence (no remote work) and Broad cannot offer visa sponsorship for this opportunity.



Speed up HaplotypeCaller on IBM POWER8 systems Nov 17

We all know that HaplotypeCaller analyses can take a long time. IBM is now providing a native implementation of the PairHMM algorithm that leverages the new hardware available in their POWER8 systems. This optimization currently works on the following systems: Ubuntu 14 and RHEL 7 with POWER8.

To take advantage of this optimization, you need to do the following:

Here is an example for running on a P8 system with Ubuntu:

java -Xmx32g -Djava.library.path=/path/to/PairHMM_P8_Ubuntu -jar $GATK_PATH/GenomeAnalysisTK.jar \
-T HaplotypeCaller \
-R $REFERENCE -I $INPUT_BAM --dbsnp $SNP_VCF \
-stand_emit_conf 10 -stand_call_conf 50 \
--pair_hmm_implementation VECTOR_LOGLESS_CACHING \
-o $OUTPUT_VCF

You can expect a speedup in the range of 1x to 1.7x, depending on the hardware, OS and test cases.
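
To put that range in concrete terms, here's a quick shell sketch that converts a baseline runtime and a speedup factor into an estimated new runtime (new runtime = baseline / speedup). The function name and the 10-hour baseline are made up for illustration; your own numbers will depend on your data and hardware.

```shell
#!/usr/bin/env bash
# Sketch only: estimate runtime after applying a speedup factor.
# ASSUMPTION: the 10-hour baseline below is an arbitrary example.

runtime_after_speedup() {
    local baseline_hours="$1"
    local speedup="$2"
    # New runtime = baseline runtime / speedup factor, to two decimals.
    awk -v h="$baseline_hours" -v s="$speedup" 'BEGIN { printf "%.2f\n", h / s }'
}

runtime_after_speedup 10 1.0   # low end of the range: no gain
runtime_after_speedup 10 1.7   # high end of the range
```

In other words, a run that takes 10 hours today would land somewhere between roughly 5.9 and 10 hours with the native library.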

If you have any questions or issues (aside from downloading the file), please contact Yinhue Cheng at IBM (ycheng@us.ibm.com).




