Release notes for GATK version 2.1

Mon 20 Aug 2012

Base Quality Score Recalibration

  • Multi-threaded support in the BaseRecalibrator tool has been temporarily suspended for performance reasons; we hope to have this fixed for the next release.
  • Implemented support for SOLiD no-call strategies other than throwing an exception.
  • Fixed smoothing in the BQSR bins.
  • Fixed the R plotting script to be compatible with newer versions of R and the ggplot2 library.

Unified Genotyper

  • Renamed the per-sample ML allelic fractions and counts so that they don't have the same name as the per-site INFO fields, and clarified the description in the VCF header.
  • UG now makes use of base insertion and base deletion quality scores if they exist in the reads (output from BaseRecalibrator).
  • Renamed the -maxAlleles argument to -maxAltAlleles so that the name describes it more accurately.
  • In pooled mode, if haplotypes cannot be created from the given alleles when genotyping indels (e.g. because the site is too close to a contig boundary), then do not try to genotype.
  • Added improvements to indel calling in pooled mode: we compute per-read likelihoods in the reference sample to determine whether a read is informative or not.

Haplotype Caller

  • Added LowQual filter to the output when appropriate.
  • Added some support for calling on Reduced Reads. Note that this is still experimental and may not always work well.
  • Now does a better job of capturing low frequency branches that are inside high frequency haplotypes.
  • Updated VQSR to work with the MNP and symbolic variants that are coming out of the HaplotypeCaller.
  • Made fixes to the likelihood based LD calculation for deciding when to combine consecutive events.
  • Fixed bug where non-standard bases from the reference would cause errors.
  • Better separation of arguments that are relevant to the Unified Genotyper but not the Haplotype Caller.

Reduce Reads

  • Fixed bug where reads were soft-clipped beyond the limits of the contig and the tool was failing with a NoSuchElement exception.
  • Fixed a divide-by-zero bug when the downsampler goes over regions where reads are all filtered out.
  • Fixed a bug where downsampled reads were not being excluded from the read window, causing them to trail back and get caught by the sliding window exception.

Variant Eval

  • Fixed support in the AlleleCount stratification when using the MLEAC (it is now capped by the AN).
  • Fixed incorrect allele counting in IndelSummary evaluation.
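
The MLEAC cap noted above amounts to a simple clamp. The following is a minimal Python sketch of that idea, not GATK's actual code; the function name `capped_allele_count` is hypothetical and chosen only for illustration.

```python
def capped_allele_count(mleac: int, an: int) -> int:
    """Return the allele count to stratify on, never exceeding AN.

    mleac: value of the MLEAC annotation (maximum-likelihood alternate allele count).
    an:    value of the AN annotation (total number of called alleles).
    Hypothetical helper; it only illustrates the "capped by the AN" rule.
    """
    if mleac < 0 or an < 0:
        raise ValueError("allele counts must be non-negative")
    return min(mleac, an)
```

For example, `capped_allele_count(5, 4)` returns 4, while `capped_allele_count(3, 10)` leaves the MLEAC unchanged at 3.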

Combine Variants

  • Now outputs the first non-MISSING QUAL, instead of the maximum.
  • Now supports multi-threaded running (with the -nt argument).
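
To make the QUAL change above concrete, here is a small Python sketch contrasting "first non-missing" with "maximum". It is an illustration under assumptions, not the CombineVariants implementation: the function name `first_non_missing_qual` is hypothetical, and QUAL values are assumed to arrive as raw VCF strings in the order the input records are merged.

```python
from typing import Iterable, Optional

MISSING = "."  # the VCF format uses "." to mark a missing value


def first_non_missing_qual(quals: Iterable[str]) -> Optional[float]:
    """Return the first QUAL that is not missing, or None if all are missing.

    Hypothetical helper: `quals` holds the QUAL column of each record being
    merged, in merge order.
    """
    for qual in quals:
        if qual != MISSING:
            return float(qual)
    return None


# Records merged in this order carry QUAL ".", "30.5" and "99".
quals = [".", "30.5", "99"]
chosen = first_non_missing_qual(quals)                     # first usable value
largest = max(float(q) for q in quals if q != MISSING)     # the older "maximum" rule
```

With these inputs the first non-missing value is 30.5, whereas the maximum would have been 99.0, so the two rules can give different output QUALs for the same site.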

Select Variants

  • Fixed behavior of the --regenotype argument to do proper selecting (without losing any of the alternate alleles).
  • No longer adds the DP INFO annotation if DP wasn't used in the input VCF.
  • If MLEAC or MLEAF is present in the original VCF and the number of samples decreases, remove those annotations from the output VC (since they are no longer accurate).
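
The MLEAC/MLEAF rule above can be sketched as a filter over the INFO annotations. This is a hedged Python illustration rather than SelectVariants code; the function name `prune_stale_mle_annotations` and the representation of INFO as a plain dictionary are assumptions made for the example.

```python
STALE_WHEN_SUBSETTING = ("MLEAC", "MLEAF")


def prune_stale_mle_annotations(info: dict, n_input_samples: int,
                                n_output_samples: int) -> dict:
    """Return a copy of `info` without MLEAC/MLEAF if samples were dropped.

    Hypothetical helper: the annotations were computed over all input
    samples, so they are removed once the output has fewer samples.
    """
    if n_output_samples < n_input_samples:
        return {key: value for key, value in info.items()
                if key not in STALE_WHEN_SUBSETTING}
    return dict(info)
```

For instance, selecting 2 of 10 samples from a record with `{"DP": 120, "MLEAC": 3, "MLEAF": 0.15}` would leave only `{"DP": 120}`, while keeping all 10 samples leaves the annotations untouched.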

Miscellaneous

  • Updated and improved the BadCigar read filter.
  • GATK now generates a proper error when a gzipped FASTA is passed in.
  • Various improvements throughout the BCF2-related code.
  • Removed various parallelism bottlenecks in the GATK.
  • Added support for the X and = CIGAR operators to the GATK.
  • Catch NumberFormatExceptions when parsing the VCF POS field.
  • Fixed bug in FastaAlternateReferenceMaker when input VCF has overlapping deletions.
  • Fixed AlignmentUtils bug for handling Ns in the CIGAR string.
  • We now allow lower-case bases in the REF/ALT alleles of a VCF and convert them to upper case.
  • Added support for handling complex events in ValidateVariants.
  • Picard jar remains at version 1.67.1197.
  • Tribble jar remains at version 110.
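
For readers unfamiliar with the X and = CIGAR operators mentioned in the list above: in the SAM specification, = marks a sequence match and X a sequence mismatch, and both consume read and reference bases just as M does. The Python sketch below shows how they count toward alignment lengths. It is an illustration only, not GATK code, and the function names are hypothetical.

```python
import re

# Per the SAM specification: operators that advance along the reference
# and operators that advance along the read (query) sequence.
CONSUMES_REFERENCE = set("MDN=X")
CONSUMES_QUERY = set("MIS=X")

_CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar: str) -> list:
    """Split a CIGAR string into (length, operator) pairs, rejecting junk."""
    elements = []
    position = 0
    for match in _CIGAR_ELEMENT.finditer(cigar):
        if match.start() != position:  # unrecognised characters were skipped
            raise ValueError(f"malformed CIGAR: {cigar!r}")
        elements.append((int(match.group(1)), match.group(2)))
        position = match.end()
    if not elements or position != len(cigar):
        raise ValueError(f"malformed CIGAR: {cigar!r}")
    return elements


def reference_length(cigar: str) -> int:
    """Number of reference bases spanned by the alignment."""
    return sum(n for n, op in parse_cigar(cigar) if op in CONSUMES_REFERENCE)


def read_length(cigar: str) -> int:
    """Number of read bases described by the CIGAR."""
    return sum(n for n, op in parse_cigar(cigar) if op in CONSUMES_QUERY)
```

Under these rules `5=1X4=` spans the same 10 reference bases as `10M`; the only difference is that the match and mismatch positions are spelled out explicitly.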

