GATK 2.8 was released on December 6, 2013. Highlights are listed below. Read the detailed version history overview here: http://www.broadinstitute.org/gatk/guide/version-history

Note that this release is smaller than previous ones. We are working hard on some new tools and frameworks that we hope to make available to everyone in our next release.


Unified Genotyper

  • Fixed bug where indels in very long reads were sometimes ignored by the caller.

Haplotype Caller

  • Improved the indexing scheme for gVCF outputs using the reference calculation model.
  • The reference calculation model now works with reduced reads.
  • Fixed bug where an error was being generated at certain homozygous reference sites because the whole assembly graph was getting pruned away.
  • Fixed bug where homozygous reference records that aren't gVCF blocks were being treated incorrectly.

Variant Recalibrator

  • Disabled tranche plots in INDEL mode.
  • Various VQSR optimizations in both runtime and accuracy. Specifically: for very large whole genome datasets with over 2M variants overlapping the training data, the training set used to build the model is now randomly downsampled; annotations are ordered by the difference in means between known and novel variants instead of by their standard deviation; the training set quality score threshold has been removed; the negative model now uses 2 Gaussians by default; and the numBad argument has been removed, with the cutoffs now chosen by the model itself based on the LOD scores.
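
Two of the VQSR changes above can be sketched in a few lines. This is only a rough illustration, not the VariantRecalibrator code: the class and method names are hypothetical, the fixed seed is an assumption made for reproducibility, and sorting by the largest absolute difference first is an assumption, since the note does not state the direction.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical sketch; none of these names come from the GATK source.
class VqsrSketch {
    // Randomly keep at most maxSize training variants (seeded for reproducibility).
    static <T> List<T> downsample(List<T> training, int maxSize, long seed) {
        List<T> copy = new ArrayList<>(training);
        if (copy.size() <= maxSize) {
            return copy;
        }
        Collections.shuffle(copy, new Random(seed));
        return new ArrayList<>(copy.subList(0, maxSize));
    }

    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    // Order annotation names by |mean(known) - mean(novel)|, largest difference first.
    // Assumes both maps hold the same keys and non-empty arrays.
    static List<String> orderByMeanDifference(Map<String, double[]> known,
                                              Map<String, double[]> novel) {
        List<String> names = new ArrayList<>(known.keySet());
        names.sort(Comparator.comparingDouble(
                (String n) -> -Math.abs(mean(known.get(n)) - mean(novel.get(n)))));
        return names;
    }

    public static void main(String[] args) {
        List<Integer> training = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            training.add(i);
        }
        System.out.println(downsample(training, 4, 42L).size()); // prints 4
    }
}
```

In the release itself the downsampling applies only when more than 2M variants overlap the training data; the small sizes here are just for demonstration.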

Reduce Reads

  • Fixed bug where mapping quality was being treated as a byte instead of an int, which caused high MQs to be treated as negative.
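
To illustrate the kind of failure described above (a minimal sketch, not the actual ReduceReads code): Java's `byte` is signed and covers -128 to 127, so a mapping quality above 127 wraps to a negative number when narrowed to a byte. The class and method names below are hypothetical.

```java
// Hypothetical illustration; not taken from the GATK source.
class MappingQualityDemo {
    // Narrowing to a signed byte: values above 127 wrap around to negatives.
    static int asByte(int mappingQuality) {
        return (byte) mappingQuality;
    }

    // Keeping the value as an int preserves it.
    static int asInt(int mappingQuality) {
        return mappingQuality;
    }

    public static void main(String[] args) {
        int mq = 255; // in SAM, 255 means the mapping quality is not available
        System.out.println("as byte: " + asByte(mq)); // prints "as byte: -1"
        System.out.println("as int:  " + asInt(mq));  // prints "as int:  255"
    }
}
```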

Diagnose Targets

  • Added calculation for GC content.
  • Added an option to filter the bases based on their quality scores.
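
The release note does not say how DiagnoseTargets computes GC content or applies the quality filter, so the following is only a generic sketch of the two ideas combined: count G and C among the called bases, skipping any base whose quality falls below a threshold. The class name, method name, and the `minBaseQuality` parameter are hypothetical.

```java
// Hypothetical sketch; not the DiagnoseTargets implementation.
class GcContentDemo {
    // Fraction of G/C among A/C/G/T bases with quality >= minBaseQuality.
    // Returns 0.0 when no base passes the filter.
    static double gcFraction(String bases, int[] quals, int minBaseQuality) {
        if (bases.length() != quals.length) {
            throw new IllegalArgumentException("bases and quals must have equal length");
        }
        int gc = 0;
        int total = 0;
        for (int i = 0; i < bases.length(); i++) {
            if (quals[i] < minBaseQuality) {
                continue; // base filtered out by quality
            }
            char b = Character.toUpperCase(bases.charAt(i));
            if (b == 'G' || b == 'C') {
                gc++;
                total++;
            } else if (b == 'A' || b == 'T') {
                total++;
            }
            // N and other symbols are not counted
        }
        return total == 0 ? 0.0 : (double) gc / total;
    }

    public static void main(String[] args) {
        int[] quals = {30, 30, 30, 30};
        System.out.println(gcFraction("GCAT", quals, 20)); // prints 0.5
    }
}
```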

Combine Variants

  • Fixed bug where, due to implicit conversion, annotation values were parsed as Doubles when they should have been parsed as Integers; submitted by Michael McCowan.
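
The note does not say where the implicit conversion occurred. As one well-known way this class of bug arises in Java (an assumption about the general mechanism, not a description of the CombineVariants fix), the conditional operator promotes an `Integer` operand to `double` when the other operand is a `Double`. The names below are hypothetical.

```java
// Hypothetical illustration; not taken from the GATK source.
class ImplicitConversionDemo {
    // With Integer and Double operands, the conditional operator applies binary
    // numeric promotion, so the result is a Double even for integral input.
    static Object parsePromoted(String s) {
        boolean isInt = s.matches("-?\\d+");
        return isInt ? Integer.valueOf(s) : Double.valueOf(s);
    }

    // Separate return statements avoid the promotion and keep the Integer.
    static Object parseExact(String s) {
        if (s.matches("-?\\d+")) {
            return Integer.valueOf(s);
        }
        return Double.valueOf(s);
    }

    public static void main(String[] args) {
        System.out.println(parsePromoted("5").getClass().getSimpleName()); // prints Double
        System.out.println(parseExact("5").getClass().getSimpleName());    // prints Integer
    }
}
```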

Select Variants

  • Changed the behavior for PL/AD fields when the tool encounters a record that has lost one or more alternate alleles: instead of being stripped out, these fields now get fixed.
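
As a rough sketch of what "fixing" these fields can involve for diploid genotypes: AD has one entry per allele, and PL has one entry per genotype j/k, stored at index k(k+1)/2 + j as defined by the VCF specification. Subsetting means picking out the entries for the retained alleles. The class and method names are hypothetical, and shifting the PLs so that the smallest is 0 is an assumption, not something the release note states.

```java
import java.util.Arrays;

// Hypothetical sketch; not the SelectVariants implementation.
class SubsetAllelesDemo {
    // VCF ordering of diploid genotype j/k (j <= k) within the PL array.
    static int plIndex(int j, int k) {
        return k * (k + 1) / 2 + j;
    }

    // keep: ascending indices of retained alleles in the original record (0 = REF).
    static int[] subsetAD(int[] ad, int[] keep) {
        int[] out = new int[keep.length];
        for (int i = 0; i < keep.length; i++) {
            out[i] = ad[keep[i]];
        }
        return out;
    }

    static int[] subsetPL(int[] pl, int[] keep) {
        int n = keep.length;
        int[] out = new int[n * (n + 1) / 2];
        int min = Integer.MAX_VALUE;
        for (int k = 0; k < n; k++) {
            for (int j = 0; j <= k; j++) {
                int value = pl[plIndex(keep[j], keep[k])];
                out[plIndex(j, k)] = value;
                min = Math.min(min, value);
            }
        }
        // Assumption: shift so the most likely remaining genotype has PL 0.
        for (int i = 0; i < out.length; i++) {
            out[i] -= min;
        }
        return out;
    }

    public static void main(String[] args) {
        int[] ad = {10, 3, 7};               // REF, ALT1, ALT2
        int[] pl = {50, 30, 60, 10, 0, 40};  // 0/0, 0/1, 1/1, 0/2, 1/2, 2/2
        int[] keep = {0, 2};                 // ALT1 has been lost
        System.out.println(Arrays.toString(subsetAD(ad, keep))); // prints [10, 7]
        System.out.println(Arrays.toString(subsetPL(pl, keep))); // prints [40, 0, 30]
    }
}
```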

Miscellaneous

  • SplitSamFile now produces an index with the BAM.
  • Length metric updates to QualifyMissingIntervals.
  • Provide close methods to clean up resources used while creating AlignmentContexts from BAM file regions; submitted by Brad Chapman.
  • Picard jar updated to version 1.104.1628.
  • Tribble jar updated to version 1.104.1628.
  • Variant jar updated to version 1.104.1628.


LouisB


Two things. 1. This post needs a release notes tag; it doesn't show up in the list of release notes. 2. I think there might be a few changes missing from the release notes. I don't believe the Queue update from Scala 2.9 -> 2.10 was included in any of the 2.7 releases, but it is in 2.8. It should be mentioned somewhere since it has the potential to break existing scripts.

Fri 6 Dec 2013

Geraldine_VdAuwera


Thanks for pointing this out, Louis. I've added the tag so it should show up now (when the website cache resets). Regarding your second point, historically we haven't included changes to Queue in the GATK release notes, but we agreed today in group meeting that it would be a good idea to start doing so going forward.

Fri 6 Dec 2013




