Base Quality Score Recalibration

  • Multi-threaded support in the BaseRecalibrator tool has been temporarily suspended for performance reasons; we hope to have this fixed for the next release.
  • Implemented support for SOLiD no call strategies other than throwing an exception.
  • Fixed smoothing in the BQSR bins.
  • Fixed the plotting R script to be compatible with newer versions of R and the ggplot2 library.

Unified Genotyper

  • Renamed the per-sample ML allelic fractions and counts so that they don't have the same name as the per-site INFO fields, and clarified the description in the VCF header.
  • UG now makes use of base insertion and base deletion quality scores if they exist in the reads (output from BaseRecalibrator).
  • Renamed the -maxAlleles argument to -maxAltAlleles so that the name more accurately describes what it controls.
  • In pooled mode, if haplotypes cannot be created from the given alleles when genotyping indels (e.g. because the site is too close to a contig boundary), UG no longer tries to genotype.
  • Added improvements to indel calling in pooled mode: we compute per-read likelihoods in the reference sample to determine whether a read is informative or not.

Haplotype Caller

  • Added LowQual filter to the output when appropriate.
  • Added some support for calling on Reduced Reads. Note that this is still experimental and may not always work well.
  • Now does a better job of capturing low frequency branches that are inside high frequency haplotypes.
  • Updated VQSR to work with the MNP and symbolic variants that are coming out of the HaplotypeCaller.
  • Made fixes to the likelihood based LD calculation for deciding when to combine consecutive events.
  • Fixed bug where non-standard bases from the reference would cause errors.
  • Better separation of arguments that are relevant to the Unified Genotyper but not the Haplotype Caller.

Reduce Reads

  • Fixed bug where reads were soft-clipped beyond the limits of the contig and the tool was failing with a NoSuchElement exception.
  • Fixed a divide-by-zero bug that occurred when the downsampler passed over regions where all reads are filtered out.
  • Fixed a bug where downsampled reads were not being excluded from the read window, causing them to trail back and get caught by the sliding window exception.

Variant Eval

  • Fixed support in the AlleleCount stratification when using the MLEAC (it is now capped by the AN).
  • Fixed incorrect allele counting in IndelSummary evaluation.
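
The AlleleCount cap mentioned above amounts to taking the smaller of the two values. A minimal Python sketch, assuming integer MLEAC and AN values; the function name is hypothetical and this is not the VariantEval source:

```python
def allele_count_stratum(mleac: int, an: int) -> int:
    """Cap the MLEAC by AN so the stratum never exceeds the number of called alleles."""
    # Assumption for illustration: both values are plain non-negative integers.
    return min(mleac, an)


print(allele_count_stratum(7, 4))  # 4 (capped by AN)
print(allele_count_stratum(2, 4))  # 2 (unchanged)
```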

Combine Variants

  • Now outputs the first non-MISSING QUAL, instead of the maximum.
  • Now supports multi-threaded running (with the -nt argument).
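
As a rough illustration of the QUAL change, the sketch below contrasts the two merge rules in Python. It is not GATK code: the function names and the use of `None` to stand for a MISSING QUAL are assumptions made for this example.

```python
from typing import Iterable, Optional


def first_non_missing_qual(quals: Iterable[Optional[float]]) -> Optional[float]:
    """New rule: return the first QUAL that is not missing, in input order."""
    for qual in quals:
        if qual is not None:
            return qual
    return None  # every input record had a MISSING QUAL


def max_qual(quals: Iterable[Optional[float]]) -> Optional[float]:
    """Previous rule, shown for comparison: the largest non-missing QUAL."""
    present = [q for q in quals if q is not None]
    return max(present) if present else None


# Hypothetical QUAL values from three input records at the same site.
quals = [None, 31.5, 88.0]
print(first_non_missing_qual(quals))  # 31.5
print(max_qual(quals))                # 88.0
```

Under the first rule, the order in which the input files are supplied decides which QUAL is reported.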

Select Variants

  • Fixed behavior of the --regenotype argument to do proper selecting (without losing any of the alternate alleles).
  • No longer adds the DP INFO annotation if DP was not present in the input VCF.
  • If MLEAC or MLEAF is present in the original VCF and the number of samples decreases, remove those annotations from the output VC (since they are no longer accurate).
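
The MLEAC/MLEAF rule above can be sketched as a small filter over the INFO fields. This is an illustrative Python sketch, assuming the INFO fields are held in a dictionary; the function and parameter names are hypothetical and do not come from GATK.

```python
from typing import Dict

# Annotations that describe the full original sample set.
STALE_KEYS = ("MLEAC", "MLEAF")


def prune_stale_annotations(
    info: Dict[str, object], n_original: int, n_selected: int
) -> Dict[str, object]:
    """Return a copy of INFO, dropping MLEAC/MLEAF if samples were removed."""
    if n_selected >= n_original:
        return dict(info)  # same sample set, so the values still apply
    return {key: value for key, value in info.items() if key not in STALE_KEYS}


info = {"AC": 3, "MLEAC": 3, "MLEAF": 0.5}
print(prune_stale_annotations(info, n_original=3, n_selected=2))  # {'AC': 3}
print(prune_stale_annotations(info, n_original=3, n_selected=3))  # unchanged copy
```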

Miscellaneous

  • Updated and improved the BadCigar read filter.
  • GATK now generates a proper error when a gzipped FASTA is passed in.
  • Various improvements throughout the BCF2-related code.
  • Removed various parallelism bottlenecks in the GATK.
  • Added support for the X and = CIGAR operators to the GATK.
  • Catch NumberFormatExceptions when parsing the VCF POS field.
  • Fixed bug in FastaAlternateReferenceMaker when input VCF has overlapping deletions.
  • Fixed AlignmentUtils bug for handling Ns in the CIGAR string.
  • We now allow lower-case bases in the REF/ALT alleles of a VCF and convert them to upper case.
  • Added support for handling complex events in ValidateVariants.
  • Picard jar remains at version 1.67.1197.
  • Tribble jar remains at version 110.
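
For readers unfamiliar with the X and = CIGAR operators mentioned in the list above: in the SAM format they denote a sequence mismatch and a sequence match respectively, and both consume reference bases just as M does. The Python sketch below computes the reference span of an alignment with that in mind. It is a standalone illustration, not GATK's AlignmentUtils; the function name is hypothetical.

```python
import re

# One CIGAR element: a length followed by an operator from the SAM format.
CIGAR_TOKEN = re.compile(r"(\d+)([MIDNSHP=X])")

# Operators that consume reference bases: M, D, N, and the newer = and X.
REF_CONSUMING = set("MDN=X")


def reference_span(cigar: str) -> int:
    """Return how many reference bases an alignment with this CIGAR covers."""
    tokens = CIGAR_TOKEN.findall(cigar)
    # Reject strings that contain anything other than valid elements.
    if "".join(length + op for length, op in tokens) != cigar:
        raise ValueError(f"malformed CIGAR: {cigar!r}")
    return sum(int(length) for length, op in tokens if op in REF_CONSUMING)


print(reference_span("10M"))           # 10
print(reference_span("5=1X4="))        # 10
print(reference_span("3S5=2I1X4N2="))  # 12
```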
