Tagged with #reducereads


What is a synthetic read?

When you run ReduceReads, the algorithm finds regions of low variation in the genome and compresses them together. To represent each compressed region, we use a synthetic read that carries all the information downstream tools need to perform likelihood calculations over the reduced data.

They are called synthetic because they are not read by a sequencer; they are generated automatically by the GATK and can be extremely long. In a synthetic read, each base represents the consensus base for that genomic location, and its consensus quality score is stored at the equivalent offset in the quality score string.

Consensus Bases

ReduceReads has several filtering parameters for consensus regions. The consensus is built from base qualities, mapping qualities and other parameters that can be adjusted on the command line. All filters are described in the technical documentation of ReduceReads.

Consensus Quality Scores

The consensus quality score of a consensus base is essentially the mean of the quality scores of all bases that passed all the filters and represent an observation of that base. It is represented in the quality score field of the SAM format.

    Q_{consensus} = \frac{1}{n} \sum_{i=1}^{n} q_i

where n is the number of bases that contributed to the consensus base and q_i is the corresponding quality score of each base.
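As a rough illustration of the averaging described above, the following sketch computes the mean of the quality scores of the bases that support one consensus base. The class and method names are hypothetical, and rounding to the nearest integer is an assumption; this is not the ReduceReads implementation.

```java
final class ConsensusQualitySketch {

    /**
     * Mean of the quality scores q_1..q_n of the contributing bases.
     * Rounding to the nearest integer is an assumption made for this sketch.
     */
    static byte meanQuality(byte[] quals) {
        if (quals.length == 0) {
            throw new IllegalArgumentException("at least one contributing base is required");
        }
        long sum = 0;
        for (byte q : quals) {
            sum += q;
        }
        return (byte) Math.round((double) sum / quals.length);
    }

    public static void main(String[] args) {
        // Four bases that passed the filters and agree with the consensus base.
        byte[] quals = {30, 32, 28, 30};
        System.out.println(meanQuality(quals));
    }
}
```

Only bases that passed the filters and match the consensus observation would be passed in; the filtering itself is outside the scope of this sketch.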

Insertion quality scores and Deletion quality scores (generated by BQSR) will undergo the same process and will be represented the same way.

Mapping Quality

The mapping quality of a synthetic read is a value representative of the mapping qualities of all the reads that contributed to it. This is an average (the root mean square) of the mapping qualities of all reads that contributed to the bases of the synthetic read. It is represented in the mapping quality score field of the SAM format.

    MQ = \sqrt{\frac{1}{n} \sum_{i=1}^{n} x_i^2}

where n is the number of reads and x_i is the mapping quality of each read.
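A minimal sketch of a root-mean-square calculation over mapping qualities, assuming the plain definition of RMS (square each value, average, take the square root). The class and method names are hypothetical, and how the GATK rounds the result into the SAM mapping quality field is not covered here.

```java
final class MappingQualitySketch {

    /** Root mean square of the mapping qualities x_1..x_n of the contributing reads. */
    static double rootMeanSquare(int[] mapqs) {
        if (mapqs.length == 0) {
            throw new IllegalArgumentException("at least one contributing read is required");
        }
        double sumOfSquares = 0.0;
        for (int x : mapqs) {
            sumOfSquares += (double) x * x;
        }
        return Math.sqrt(sumOfSquares / mapqs.length);
    }

    public static void main(String[] args) {
        // Three reads, one of them poorly mapped.
        int[] mapqs = {60, 60, 20};
        System.out.println(rootMeanSquare(mapqs));
    }
}
```

Compared with a plain arithmetic mean, the RMS weights high mapping qualities more heavily, so a few poorly mapped reads pull the value down less.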

Original Alignments

A synthetic read may come with up to two extra tags representing its original alignment information. Because of the many filters in ReduceReads, reads are hard-clipped to the area of interest. These hard clips are always represented in the CIGAR string with the H element and the length of the clipping in genomic coordinates. Sometimes hard clipping makes it impossible to recover the original alignment start or end of a read. In those cases, the read will contain extra tags with integer values representing its original alignment start or end.

Here are the two integer tags:

  • OP -- original alignment start
  • OE -- original alignment end

For all other reads, where this can still be obtained through the cigar string (i.e. using getAlignmentStart() or getUnclippedStart()), these tags are not created.
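The lookup rule described above can be sketched as follows: use the OP or OE tag when it is present, otherwise fall back to the coordinate recovered from the read itself. This is a simplified stand-in, not the GATKSAMRecord code. The Map of tags, the class name and the method names are all hypothetical, and the sketch assumes the tag value is the coordinate itself, as the text states.

```java
import java.util.HashMap;
import java.util.Map;

final class OriginalAlignmentSketch {

    /** Original start: the OP tag if the read carries one, else the start recovered from the CIGAR. */
    static int originalStart(Map<String, Integer> tags, int startFromCigar) {
        Integer op = tags.get("OP");
        return (op != null) ? op : startFromCigar;
    }

    /** Original end: the OE tag if the read carries one, else the end recovered from the CIGAR. */
    static int originalEnd(Map<String, Integer> tags, int endFromCigar) {
        Integer oe = tags.get("OE");
        return (oe != null) ? oe : endFromCigar;
    }

    public static void main(String[] args) {
        // Hypothetical read whose original start could not be recovered from the CIGAR.
        Map<String, Integer> tags = new HashMap<>();
        tags.put("OP", 1000);
        System.out.println(originalStart(tags, 1050)); // tag wins
        System.out.println(originalEnd(tags, 1200));   // no OE tag, fallback is used
    }
}
```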

The RR Tag

The RR tag holds the observed depth (after filters) of every base that contributed to a reduced read. That is, it counts all bases that passed the mapping and base quality filters and had the same observation as the one in the reduced read.

The RR tag carries an array of bytes. For increased compression, it works like this: the first number represents the depth of the first base in the reduced read, and all subsequent numbers represent the offset depth from the first base. Therefore, to calculate the depth of base "i" using the RR array, one must use:

RR[0] + RR[i]

but only for i > 0; the depth of the first base is simply RR[0]. Here is the code we use to return the depth of the i'th base:

return (i==0) ? firstCount : (byte) Math.min(firstCount + offsetCount, Byte.MAX_VALUE);
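To make the offset encoding concrete, here is a self-contained sketch that wraps the one-liner above in a method and decodes a small example array. The class name and the example values are hypothetical; `firstCount` and `offsetCount` are taken to be RR[0] and RR[i], as the surrounding text describes.

```java
final class ReducedCountSketch {

    /**
     * Depth of the i-th base of a reduced read.
     * RR[0] is the depth of the first base; every later entry is an offset from it.
     * The result is capped at Byte.MAX_VALUE (127), as in the quoted one-liner.
     */
    static byte reducedCount(byte[] rr, int i) {
        final byte firstCount = rr[0];
        if (i == 0) {
            return firstCount;
        }
        final byte offsetCount = rr[i];
        return (byte) Math.min(firstCount + offsetCount, Byte.MAX_VALUE);
    }

    public static void main(String[] args) {
        // First base has depth 10; the offsets 0, +2 and -3 give depths 10, 12 and 7.
        byte[] rr = {10, 0, 2, -3};
        for (int i = 0; i < rr.length; i++) {
            System.out.println("base " + i + ": depth " + reducedCount(rr, i));
        }
    }
}
```

Because the counts are stored as bytes, a depth above 127 cannot be represented, and the cap keeps the sum from overflowing into a negative value.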


Using Synthetic Reads with GATK tools

The GATK is 100% compatible with synthetic reads. You can use reduced BAM files in combination with non-reduced BAM files in any GATK analysis tool, and they will work seamlessly.

Programming in the GATK

If you are programming using the GATK framework, the GATKSAMRecord class carries all the necessary functionality to use synthetic reads transparently with methods like:

  • public final byte getReducedCount(final int i)
  • public int getOriginalAlignmentStart()
  • public int getOriginalAlignmentEnd()
  • public boolean isReducedRead()

We have identified a major bug in ReduceReads -- GATK versions 2.0 and 2.1. The effect of the bug is that variant regions with more than 100 reads and fewer than 250 reads get downsampled to 0 reads.

This has now been fixed in the most recent release.

To check if you are using a buggy version, run the following:

    samtools view -H $BAM

The output will include a line like the following:

    @PG ID:GATK ReduceReads VN:XXX

If XXX is 2.0 or 2.1, any results obtained with your current version are suspect, and you will need to upgrade to the most recent version then rerun your processing.
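If you have many BAM files to check, the rule above can be scripted. The sketch below parses one header line of the form shown and reports whether the version is affected. The class and method names are hypothetical, and it assumes that version strings may carry a build suffix such as `2.1-8`; adjust the matching if your headers look different.

```java
final class ReduceReadsVersionCheck {

    /** Returns the VN value of a ReduceReads @PG header line, or null if the line is not one. */
    static String reduceReadsVersion(String headerLine) {
        if (!headerLine.startsWith("@PG") || !headerLine.contains("ReduceReads")) {
            return null;
        }
        for (String field : headerLine.split("\\s+")) {
            if (field.startsWith("VN:")) {
                return field.substring(3);
            }
        }
        return null;
    }

    /** True for 2.0 and 2.1, with or without a build suffix (an assumption about the format). */
    static boolean isAffected(String version) {
        return version.equals("2.0") || version.startsWith("2.0-")
            || version.equals("2.1") || version.startsWith("2.1-");
    }

    public static void main(String[] args) {
        String line = "@PG\tID:GATK ReduceReads\tVN:2.1-8";
        String version = reduceReadsVersion(line);
        System.out.println(version + " affected: " + isAffected(version));
    }
}
```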

Our most sincere apologies for the inconvenience.

GATK release 2.2 was released on October 31, 2012. Highlights are listed below. Read the detailed version history overview here: http://www.broadinstitute.org/gatk/guide/version-history

Base Quality Score Recalibration

  • Improved the algorithm around homopolymer runs to use a "delocalized context".
  • Massive performance improvements that allow these tools to run efficiently (and correctly) in multi-threaded mode.
  • Fixed bug where the tool failed for reads that begin with insertions.
  • Fixed bug in the scatter-gather functionality.
  • Added new argument to enable emission of the .pdf output file (see --plot_pdf_file).

Unified Genotyper

  • Massive runtime performance improvement for multi-allelic sites; -maxAltAlleles now defaults to 6.
  • The genotyper no longer emits the Strand Bias (SB) annotation by default. Use the --computeSLOD argument to enable it.
  • Added the ability to automatically down-sample out low grade contamination from the input bam files using the --contamination_fraction_to_filter argument; by default the value is set at 0.05 (5%).
  • Fixed annotations (AD, FS, DP) that were miscalculated when run on a Reduce Reads processed bam.
  • Fixed bug for the general ploidy model that occasionally caused it to choose the wrong allele when there are multiple possible alleles to choose from.
  • Fixed bug where the inbreeding coefficient was computed at monomorphic sites.
  • Fixed edge case bug where we could abort prematurely in the special case of multiple polymorphic alleles and samples with drastically different coverage.
  • Fixed bug in the general ploidy model where it wasn't counting errors in insertions correctly.
  • The FisherStrand annotation is now computed both with and without filtering low-qual bases (we compute both p-values and take the maximum one - i.e. least significant).
  • Fixed annotations (particularly AD) for indel calls; previous versions didn't bin reads into the reference or alternate sets correctly.
  • Generalized ploidy model now handles reference calls correctly.

Haplotype Caller

  • Massive runtime performance improvement for multi-allelic sites; -maxAltAlleles now defaults to 6.
  • Massive runtime performance improvement to the HMM code which underlies the likelihood model of the HaplotypeCaller.
  • Added the ability to automatically down-sample out low grade contamination from the input bam files using the --contamination_fraction_to_filter argument; by default the value is set at 0.05 (5%).
  • Now requires at least 10 samples to merge variants into complex events.

Variant Annotator

  • Fixed annotations for indel calls; previous versions either didn't compute the annotations at all or did so incorrectly for many of them.

Reduce Reads

  • Fixed several bugs where certain reads were either dropped (fully or partially) or registered as occurring at the wrong genomic location.
  • Fixed bugs where in rare cases N bases were chosen as consensus over legitimate A,C,G, or T bases.
  • Significant runtime performance optimizations; the average runtime for a single exome file is now just over 2 hours.

Variant Filtration

  • Fixed a bug where DP couldn't be filtered from the FORMAT field, only from the INFO field.

Variant Eval

  • AlleleCount stratification now supports records with ploidy other than 2.

Combine Variants

  • Fixed bug where the AD field was not handled properly. We now strip the AD field out whenever the alleles change in the combined file.
  • Now outputs the first non-missing QUAL, not the maximum.

Select Variants

  • Fixed bug where the AD field was not handled properly. We now strip the AD field out whenever the alleles change in the combined file.
  • Removed the -number argument because it gave biased results.

Validate Variants

  • Added option to selectively choose particular strict validation options.
  • Fixed bug where mixed genotypes (e.g. ./1) would incorrectly fail.
  • Improved the error message around unused ALT alleles.

Somatic Indel Detector

  • Fixed several bugs, including missing AD/DP header lines and putting annotations in correct order (Ref/Alt).

Miscellaneous

  • New CPU "nano" parallelization option (-nct) added GATK-wide (see docs for more details about this cool new feature that allows parallelization even for Read Walkers).
  • Fixed raw HapMap file conversion bug in VariantsToVCF.
  • Added GATK-wide command line argument (-maxRuntime) to control the maximum runtime allowed for the GATK.
  • Fixed bug in GenotypeAndValidate where it couldn't handle both SNPs and indels.
  • Fixed bug where VariantsToTable did not handle lists and nested arrays correctly.
  • Fixed bug in BCF2 writer for case where all genotypes are missing.
  • Fixed bug in DiagnoseTargets when intervals with zero coverage were present.
  • Fixed bug in Phase By Transmission when there are no likelihoods present.
  • Fixed bug in fasta .fai generation.
  • Updated and improved version of the BadCigar read filter.
  • Picard jar remains at version 1.67.1197.
  • Tribble jar remains at version 110.

Does the relationship between AD and DP still hold in VCFs produced from ReduceReads BAMs? That is, is the sum of AD <= DP, or can other scenarios now occur?

Also is AD summarized to 1,0 or 0,1 for homozygous REF and ALT? Thanks.

Hi Team,

I have been running GATK2 ReduceReads on a large (100 GB) BAM file. At the very beginning it runs very smoothly and predicts a week to finish the task, but after a few hours it gets totally stuck. We first thought that it could be a garbage collection (or Java memory allocation) issue, but the logs show that the garbage collection works well.

The command is (similar behavior for smaller Xms and Xmx values):

    java -Xmx30g -Xms30g -XX:+PrintGCTimeStamps -XX:+UseParallelOldGC -XX:+PrintGCDetails -Xloggc:gc.log -verbose:gc -jar $path $ref -T ReduceReads -I input.bam -o output.bam

The first few lines of the log file are

INFO 01:12:21,541 TraversalEngine - chr1:1094599 5.89e+05 9.9 m 16.8 m 0.0% 19.4 d 19.4 d
INFO 01:13:21,628 TraversalEngine - chr1:2112411 9.44e+05 10.9 m 11.6 m 0.1% 11.2 d 11.2 d
INFO 01:14:22,065 TraversalEngine - chr1:3051535 1.29e+06 11.9 m 9.3 m 0.1% 8.5 d 8.5 d
INFO 01:15:22,297 TraversalEngine - chr1:4084547 1.59e+06 12.9 m 8.1 m 0.1% 6.9 d 6.9 d
INFO 01:16:24,130 TraversalEngine - chr1:4719991 1.82e+06 13.9 m 7.7 m 0.2% 6.4 d 6.4 d

but after a short while it gets totally stuck. By position 121485073 of chromosome 1 there is almost no progress at all, and the estimated finish time goes over 11 weeks and is still increasing.

Any idea what the reason for this could be, and how we can solve the problem? The same command runs successfully on small (less than 5 GB) BAM files, though.

Thanks in advance. --Sina

Hello dear GATK Team,

When trying to run HaplotypeCaller on my exome files prepared with ReduceReads, I get the error stated below. As you can see, the newest GATK version is used. Also, UnifiedGenotyper does not produce any errors on the exact same data (90 SOLiD exomes created according to Best Practices v4).

##### ERROR ------------------------------------------------------------------------------------------
##### ERROR stack trace 
org.broadinstitute.sting.utils.exceptions.ReviewedStingException: Somehow the requested coordinate is not covered by the read. Too many deletions?
    at org.broadinstitute.sting.utils.sam.ReadUtils.getReadCoordinateForReferenceCoordinate(ReadUtils.java:447)
    at org.broadinstitute.sting.utils.sam.ReadUtils.getReadCoordinateForReferenceCoordinate(ReadUtils.java:396)
    at org.broadinstitute.sting.utils.sam.ReadUtils.getReadCoordinateForReferenceCoordinate(ReadUtils.java:392)
    at org.broadinstitute.sting.gatk.walkers.annotator.DepthOfCoverage.annotate(DepthOfCoverage.java:56)
    at org.broadinstitute.sting.gatk.walkers.annotator.interfaces.InfoFieldAnnotation.annotate(InfoFieldAnnotation.java:24)
    at org.broadinstitute.sting.gatk.walkers.annotator.VariantAnnotatorEngine.annotateContext(VariantAnnotatorEngine.java:223)
    at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:429)
    at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:104)
    at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.processActiveRegion(TraverseActiveRegions.java:249)
    at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.callWalkerMapOnActiveRegions(TraverseActiveRegions.java:204)
    at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.processActiveRegions(TraverseActiveRegions.java:179)
    at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.java:136)
    at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.java:29)
    at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:74)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:281)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:236)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:146)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:93)
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A GATK RUNTIME ERROR has occurred (version 2.2-3-gde33222):
##### ERROR
##### ERROR Please visit the wiki to see if this is a known problem
##### ERROR If not, please post the error, with stack trace, to the GATK forum
##### ERROR Visit our website and forum for extensive documentation and answers to 
##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
##### ERROR
##### ERROR MESSAGE: Somehow the requested coordinate is not covered by the read. Too many deletions?
##### ERROR ------------------------------------------------------------------------------------------

The Command line used (abbreviated):

java -Xmx30g -jar /home/common/GenomeAnalysisTK-2.2-3/GenomeAnalysisTK.jar \
  -R /home/common/hg19/ucschg19/ucsc.hg19.fasta \
  -T HaplotypeCaller \
  -I ReduceReads/XXXXX.ontarget.MarkDups.nRG.reor.Real.Recal.reduced.bam [x90]\
  --dbsnp /home/common/hg19/dbsnp_135.hg19.vcf \
  -o 93Ind_ped_reduced_HC_snps.raw.vcf \
  -ped familys.ped \
  --pedigreeValidationType SILENT \
  -stand_call_conf 20.0 \
  -stand_emit_conf 10.0

Hi,

when I run ReduceReads I get the following exception just when it's supposed to finish:

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.util.NoSuchElementException
    at java.util.LinkedList$ListItr.next(Unknown Source)
    at org.broadinstitute.sting.gatk.walkers.compression.reducereads.SlidingWindow.updateHeaderCounts(SlidingWindow.java:697)
    at org.broadinstitute.sting.gatk.walkers.compression.reducereads.SlidingWindow.addRead(SlidingWindow.java:128)
    at org.broadinstitute.sting.gatk.walkers.compression.reducereads.SingleSampleCompressor.addAlignment(SingleSampleCompressor.java:73)
    at org.broadinstitute.sting.gatk.walkers.compression.reducereads.MultiSampleCompressor.addAlignment(MultiSampleCompressor.java:70)
    at org.broadinstitute.sting.gatk.walkers.compression.reducereads.ReduceReadsStash.compress(ReduceReadsStash.java:67)
    at org.broadinstitute.sting.gatk.walkers.compression.reducereads.ReduceReads.reduce(ReduceReads.java:347)
    at org.broadinstitute.sting.gatk.walkers.compression.reducereads.ReduceReads.reduce(ReduceReads.java:86)
    at org.broadinstitute.sting.gatk.traversals.TraverseReads.traverse(TraverseReads.java:107)
    at org.broadinstitute.sting.gatk.traversals.TraverseReads.traverse(TraverseReads.java:52)
    at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:71)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:269)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:236)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:146)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:93)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 2.0-21-ga40b695):
ERROR
ERROR Please visit the wiki to see if this is a known problem
ERROR If not, please post the error, with stack trace, to the GATK forum
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: Code exception (see stack trace for error itself)
ERROR ------------------------------------------------------------------------------------------

I run it with the standard arguments:

    java -jar GenomAnalysisTK.jar \
      --logging_level ERROR \
      -R hg19.fa \
      -T ReduceReads \
      -I in.bam \
      -o reduced.in.bam

Any suggestions?

Thanks, Thomas

Hi,

I'm trying to use GATK release 2.0 with my nine exome-seq samples. Following the steps in the Best Practices, I generated per-sample, ready-to-process .bam files and then used -T ReduceReads to generate .reduced.bam files for the next step (-T UnifiedGenotyper). When I use these .reduced.bam files as UG input, I receive this error message: "##### ERROR MESSAGE: Somehow the requested coordinate is not covered by the read. Too many deletions?" If I take my original .bam files as input, things work smoothly. Do you have any idea what causes the problem?

Thanks a lot, Samira

Here are the command lines I use:

java -Xmx4g -jar $GATKv4 \
    -R $GATK_BUNDLE/ucsc.hg19.fasta \
    -T ReduceReads \
    -L $capture_library.bed \
    -I $i.recal_s.bam \
    -o $i.reduced.bam

 java -jar $GATKv4 \
    -T HaplotypeCaller \
    -R $GATK_BUNDLE/ucsc.hg19.fasta \
    -I InputReducedBams.list \
    -L $capture_library.bed \
    --dbsnp GATK_BUNDLE/dbsnp_135.hg19.vcf \
    -o raw.snp.indel.UnifiedGenotyper.rsv.vcf

Hi, I'm running GATK version 2.1-8 with reads mapped to mm10. ReduceReads fails somewhere on chr2 with above message. From previous posts I understood that this bug has appeared already? Could you please help me to fix it? Thank you,

Ania

Hi, I'm just wondering if it is a good idea to run my pipeline again with ReduceReads. I skipped it originally as I only have four (mouse) samples but having re-read the documentation with the additional filters, I am now considering if it might add value. Any thoughts appreciated.

Hi there, I've tried to run ReduceReads for the first time and I got this error:

    $ java -Xmx8g -jar /lustre1/tools/bin/GenomeAnalysisTK-2.3-6.jar -T ReduceReads -R /lustre1/genomes/hg19/fa/hg19.fa -I filein.bam -o fileout.bam […]

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.NullPointerException
    at org.broadinstitute.sting.gatk.walkers.compression.reducereads.SingleSampleCompressor.closeVariantRegions(SingleSampleCompressor.java:83)
    at org.broadinstitute.sting.gatk.walkers.compression.reducereads.MultiSampleCompressor.closeVariantRegionsInAllSamples(MultiSampleCompressor.java:94)
    at org.broadinstitute.sting.gatk.walkers.compression.reducereads.MultiSampleCompressor.addAlignment(MultiSampleCompressor.java:76)
    at org.broadinstitute.sting.gatk.walkers.compression.reducereads.ReduceReadsStash.compress(ReduceReadsStash.java:67)
    at org.broadinstitute.sting.gatk.walkers.compression.reducereads.ReduceReads.reduce(ReduceReads.java:387)
    at org.broadinstitute.sting.gatk.walkers.compression.reducereads.ReduceReads.reduce(ReduceReads.java:87)
    at org.broadinstitute.sting.gatk.traversals.TraverseReadsNano$TraverseReadsReduce.apply(TraverseReadsNano.java:226)
    at org.broadinstitute.sting.gatk.traversals.TraverseReadsNano$TraverseReadsReduce.apply(TraverseReadsNano.java:215)
    at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:254)
    at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:219)
    at org.broadinstitute.sting.gatk.traversals.TraverseReadsNano.traverse(TraverseReadsNano.java:91)
    at org.broadinstitute.sting.gatk.traversals.TraverseReadsNano.traverse(TraverseReadsNano.java:55)
    at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:83)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:281)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:237)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:147)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:91)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 2.3-6-gebbba25):
ERROR
ERROR Please visit the wiki to see if this is a known problem
ERROR If not, please post the error, with stack trace, to the GATK forum
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: Code exception (see stack trace for error itself)
ERROR ------------------------------------------------------------------------------------------

Is there something wrong with running RR with multiple samples?

d

I have read in your recent slides for "Data Compression with Reduce Reads" that "Tumor and Normal samples (or any set of samples) get co-reduced, meaning that every variable region triggered by one sample will be forced in every sample."

I have data from 4 variant strains of an organism, my samples in RG info, and 4 individuals for each strain, my libraries in RG info. Currently I have a bam file for each of the 16 different libraries.

I have quite high coverage, so I would like to run ReduceReads, but I want to preserve information across all of my samples wherever a site is non-consensus in even one of them. No SNP information is available for this organism and I don't want to lose any important data. Should I merge all BAM files for all samples before proceeding with ReduceReads with downsampling turned off? Or should I just leave out ReduceReads?

Thanks Anna

Hello everyone, I have a question about ReduceReads when using scatter/gather. In the argument details of ReduceReads you write for the parameter -nocmp_names: "... If you scatter/gather there is no guarantee that read name uniqueness will be maintained -- in this case we recommend not compressing."

Do you mean that if I use scatter/gather, I should run ReduceReads with the -nocmp_names option so that the read names will not be compressed, or do you mean that I should not use ReduceReads at all when scatter/gathering?

I assume the first is meant, I just wanted to make sure. Thank you for your time and effort. Eva

Hi all,

I am analyzing some whole-genome sequencing data. After preprocessing with Queue, I got a large BAM file at the sample level (~200 GB/sample), and I wanted to use the ReduceReads module to reduce the BAM file size. I am running the following command:

    /usr/java/latest/bin/java -Xmx16g -jar /path_to_GenomeAnalysisTK-2.3-9/GenomeAnalysisTK.jar -R /path_to_human_g1k_v37.fasta -T ReduceReads -I /path_to_Queue/project.sample.clean.dedup.recal.bam -o sample.reduced.bam --generate_md5

After 8 hours, the estimated time goes to 6.9 days.

INFO 20:02:25,508 ProgressMeter - 1:120660726 5.63e+07 6.5 h 7.0 m 3.9% 7.0 d 6.7 d
INFO 20:03:25,509 ProgressMeter - 1:120660726 5.63e+07 6.5 h 7.0 m 3.9% 7.0 d 6.7 d
INFO 20:04:25,510 ProgressMeter - 1:120660726 5.63e+07 6.6 h 7.0 m 3.9% 7.0 d 6.8 d
INFO 20:05:25,511 ProgressMeter - 1:120660726 5.63e+07 6.6 h 7.0 m 3.9% 7.0 d 6.8 d
INFO 20:06:25,512 ProgressMeter - 1:120677835 5.63e+07 6.6 h 7.0 m 3.9% 7.1 d 6.8 d
INFO 20:07:25,528 ProgressMeter - 1:120677835 5.63e+07 6.6 h 7.0 m 3.9% 7.1 d 6.8 d
INFO 20:08:25,529 ProgressMeter - 1:120677835 5.63e+07 6.6 h 7.1 m 3.9% 7.1 d 6.8 d
INFO 20:09:25,530 ProgressMeter - 1:120677835 5.63e+07 6.6 h 7.1 m 3.9% 7.1 d 6.8 d
INFO 20:10:25,531 ProgressMeter - 1:120677835 5.63e+07 6.7 h 7.1 m 3.9% 7.1 d 6.9 d
INFO 20:11:25,532 ProgressMeter - 1:120677835 5.63e+07 6.7 h 7.1 m 3.9% 7.2 d 6.9 d
INFO 20:12:25,533 ProgressMeter - 1:120677835 5.63e+07 6.7 h 7.1 m 3.9% 7.2 d 6.9 d
INFO 20:13:25,534 ProgressMeter - 1:120677835 5.63e+07 6.7 h 7.2 m 3.9% 7.2 d 6.9 d
INFO 20:14:25,535 ProgressMeter - 1:120677835 5.63e+07 6.7 h 7.2 m 3.9% 7.2 d 6.9 d

The tool version is GenomeAnalysisTK-2.3-9

Is there anything wrong with my command? How could I speed up this procedure? Thanks a lot.

I'm working with ReduceReads and would like to use it in some kind of parallel mode. The presentation mentions that a 50x way run may drastically reduce run time but I'm not sure how to invoke this. I tried -nt and it complained. Should I be giving it multiple intervals and merging? If so, how does it deal with edge variants?

Thanks.

Hello all, thanks for the great work.

I have run into an issue with ReduceReads and I was hoping you could offer some insight. I'm getting the following stack trace issue (attached file). I looked around the forums, and others who were getting stack trace issues using ReduceReads were told code fixes would remedy the issue, so I thought I would check with you. I also ran samtools flagstat.

ReduceReads is very slow for MT reads. After it gets by the MT, it runs much faster (See output below)

Any ideas why and what to do to speed it up?

John

INFO 23:32:37,536 HelpFormatter - ---------------------------------------------------------------------------------
INFO 23:32:37,545 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.2-16-g9f648cb, Compiled 2012/12/04 03:46:58
INFO 23:32:37,545 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 23:32:37,545 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 23:32:37,551 HelpFormatter - Program Args: -R /unprotected/projects/genetics_program/resources/gatk_bundle/hg19/ucsc.hg19.fasta -T ReduceReads -I bam/LP6005113-DNA_E01.recal.bam -o LP6005113-DNA_E01.reduced.bam
INFO 23:32:37,551 HelpFormatter - Date/Time: 2013/01/28 23:32:37
INFO 23:32:37,551 HelpFormatter - ---------------------------------------------------------------------------------
INFO 23:32:37,551 HelpFormatter - ---------------------------------------------------------------------------------
INFO 23:32:37,605 GenomeAnalysisEngine - Strictness is SILENT
INFO 23:32:37,984 GenomeAnalysisEngine - Downsampling Settings: No downsampling
INFO 23:32:37,992 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 23:32:38,073 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.08
INFO 23:32:38,113 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO 23:32:38,114 ProgressMeter - Location processed.reads runtime per.1M.reads completed total.runtime remaining
INFO 23:33:14,010 ProgressMeter - chrM:1992 4.00e+04 35.9 s 15.0 m 0.0% 93.5 w 93.5 w
INFO 23:35:18,038 ProgressMeter - chrM:2879 6.00e+04 2.7 m 44.4 m 0.0% 288.2 w 288.2 w
INFO 23:37:06,320 ProgressMeter - chrM:3259 7.00e+04 4.5 m 63.9 m 0.0% 427.0 w 427.0 w
INFO 23:39:03,457 ProgressMeter - chrM:3662 8.00e+04 6.4 m 80.3 m 0.0% 546.0 w 546.0 w
INFO 23:41:16,174 ProgressMeter - chrM:4087 9.00e+04 8.6 m 95.9 m 0.0% 657.7 w 657.7 w
INFO 23:43:46,243 ProgressMeter - chrM:4550 1.00e+05 11.1 m 111.3 m 0.0% 761.8 w 761.8 w
INFO 23:46:47,501 ProgressMeter - chrM:4973 1.10e+05 14.2 m 2.1 h 0.0% 886.1 w 886.1 w
INFO 23:49:57,085 ProgressMeter - chrM:5379 1.20e+05 17.3 m 2.4 h 0.0% 1002.1 w 1002.1 w
INFO 23:52:52,173 ProgressMeter - chrM:5823 1.30e+05 20.2 m 2.6 h 0.0% 1081.7 w 1081.7 w
INFO 23:54:28,697 ProgressMeter - chrM:7492 1.70e+05 21.8 m 2.1 h 0.0% 907.5 w 907.5 w
INFO 23:55:45,484 ProgressMeter - chrM:7883 1.80e+05 23.1 m 2.1 h 0.0% 913.0 w 913.0 w
INFO 23:57:16,597 ProgressMeter - chrM:8305 1.90e+05 24.6 m 2.2 h 0.0% 923.5 w 923.5 w
INFO 23:59:07,109 ProgressMeter - chrM:8731 2.00e+05 26.5 m 2.2 h 0.0% 944.1 w 944.1 w
INFO 00:01:16,623 ProgressMeter - chrM:9124 2.10e+05 28.6 m 2.3 h 0.0% 977.1 w 977.1 w
INFO 00:04:12,150 ProgressMeter - chrM:9526 2.20e+05 31.6 m 2.4 h 0.0% 1031.4 w 1031.4 w
INFO 00:06:51,054 ProgressMeter - chrM:9896 2.30e+05 34.2 m 2.5 h 0.0% 1076.2 w 1076.2 w
INFO 00:09:31,477 ProgressMeter - chrM:10244 2.40e+05 36.9 m 2.6 h 0.0% 1120.9 w 1120.9 w
INFO 00:12:57,847 ProgressMeter - chrM:10626 2.50e+05 40.3 m 2.7 h 0.0% 1181.3 w 1181.3 w
INFO 00:16:48,872 ProgressMeter - chrM:11139 2.60e+05 44.2 m 2.8 h 0.0% 1234.5 w 1234.5 w
INFO 00:20:54,282 ProgressMeter - chrM:11634 2.70e+05 48.3 m 3.0 h 0.0% 1291.4 w 1291.4 w
INFO 00:25:23,381 ProgressMeter - chrM:12098 2.80e+05 52.8 m 3.1 h 0.0% 1357.2 w 1357.2 w
INFO 00:30:01,695 ProgressMeter - chrM:12464 2.90e+05 57.4 m 3.3 h 0.0% 1433.2 w 1433.2 w
INFO 00:34:41,008 ProgressMeter - chrM:12805 3.00e+05 62.0 m 3.4 h 0.0% 1508.2 w 1508.2 w
INFO 00:39:41,462 ProgressMeter - chrM:13307 3.10e+05 67.1 m 3.6 h 0.0% 1568.4 w 1568.4 w
INFO 00:45:21,827 ProgressMeter - chrM:13764 3.20e+05 72.7 m 3.8 h 0.0% 1644.6 w 1644.6 w
INFO 00:51:15,645 ProgressMeter - chrM:14173 3.30e+05 78.6 m 4.0 h 0.0% 1726.7 w 1726.7 w
INFO 00:57:38,039 ProgressMeter - chrM:14639 3.40e+05 85.0 m 4.2 h 0.0% 1807.2 w 1807.2 w
INFO 01:06:06,413 ProgressMeter - chrM:15067 3.50e+05 93.5 m 4.5 h 0.0% 1930.9 w 1930.9 w
INFO 01:15:07,742 ProgressMeter - chrM:15463 3.60e+05 102.5 m 4.7 h 0.0% 2063.0 w 2063.0 w
INFO 01:23:17,067 ProgressMeter - chrM:15827 3.70e+05 110.6 m 5.0 h 0.0% 2176.0 w 2176.0 w
INFO 01:31:08,225 ProgressMeter - chrM:16237 3.80e+05 118.5 m 5.2 h 0.0% 2271.6 w 2271.5 w
INFO 01:32:08,631 ProgressMeter - chr1:3000534 1.17e+06 119.5 m 102.5 m 0.1% 12.3 w 12.3 w
INFO 01:33:09,058 ProgressMeter - chr1:5169965 1.91e+06 2.0 h 63.2 m 0.2% 7.2 w 7.2 w
INFO 01:34:09,530 ProgressMeter - chr1:7090404 2.65e+06 2.0 h 45.9 m 0.2% 5.3 w 5.3 w
INFO 01:35:10,334 ProgressMeter - chr1:8806475 3.32e+06 2.0 h 37.0 m 0.3% 4.3 w 4.3 w
INFO 01:36:10,654 ProgressMeter - chr1:10887467 4.08e+06 2.1 h 30.3 m 0.3% 3.5 w 3.5 w
INFO 01:37:10,892 ProgressMeter - chr1:12756332 4.77e+06 2.1 h 26.1 m 0.4% 3.0 w 3.0 w
INFO 01:38:11,087 ProgressMeter - chr1:14746000 5.29e+06 2.1 h 23.8 m 0.5% 18.5 d 18.4 d
INFO 01:39:11,327 ProgressMeter - chr1:16699493 6.02e+06 2.1 h 21.0 m 0.5% 16.5 d 16.4 d
INFO 01:40:11,606 ProgressMeter - chr1:18706430 6.86e+06 2.1 h 18.6 m 0.6% 14.8 d 14.8 d

I've tried using the reduced BAMs as input to Crest (after some preprocessing), but it hangs on chr7. Has anyone else used reduced BAMs in other programs? Is this output meant to be used only in GATK?

Thanks!

New gatk version... trying out ReduceReads again.

6 of 8 exomes I tried were processed by ReduceReads just fine, but two throw the exception "Removed too many insertions, header is now negative!" (at different genomic locations).

I did not find any mention of this error in the GATK forums. Is this a known problem?

Command line: java -Xmx6g -jar GenomeAnalysisTK.jar -R human_g1k_v37.fasta -T ReduceReads -o test.rr.bam -I rr-too-many-insertions.bam

java -v: java version "1.6.0_27" Java(TM) SE Runtime Environment (build 1.6.0_27-b07) Java HotSpot(TM) 64-Bit Server VM (build 20.2-b06, mixed mode)

Run log:

INFO  16:03:26,898 HelpFormatter - -------------------------------------------------------------------------------- 
INFO  16:03:27,382 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.3-0-g9593e74, Compiled 2012/12/17 16:58:19 
INFO  16:03:27,383 HelpFormatter - Copyright (c) 2010 The Broad Institute 
INFO  16:03:27,383 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk 
INFO  16:03:27,388 HelpFormatter - Program Args: -R human_g1k_v37.fasta -T ReduceReads -o test.rr.bam -I rr-too-many-insertions.bam 
INFO  16:03:27,388 HelpFormatter - Date/Time: 2012/12/18 16:03:26 
INFO  16:03:27,388 HelpFormatter - -------------------------------------------------------------------------------- 
INFO  16:03:27,388 HelpFormatter - -------------------------------------------------------------------------------- 
INFO  16:03:27,471 GenomeAnalysisEngine - Strictness is SILENT 
INFO  16:03:27,577 GenomeAnalysisEngine - Downsampling Settings: No downsampling 
INFO  16:03:27,585 SAMDataSource$SAMReaders - Initializing SAMRecords in serial 
INFO  16:03:27,620 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.03 
INFO  16:03:27,656 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING] 
INFO  16:03:27,657 ProgressMeter -        Location processed.reads  runtime per.1M.reads completed total.runtime remaining 
INFO  16:03:27,714 ReadShardBalancer$1 - Loading BAM index data for next contig 
INFO  16:03:27,717 ReadShardBalancer$1 - Done loading BAM index data for next contig 
INFO  16:03:27,739 ReadShardBalancer$1 - Loading BAM index data for next contig 
INFO  16:03:28,739 GATKRunReport - Uploaded run statistics report to AWS S3 
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR stack trace 
org.broadinstitute.sting.utils.exceptions.ReviewedStingException: Removed too many insertions, header is now negative!
    at org.broadinstitute.sting.gatk.walkers.compression.reducereads.HeaderElement.removeInsertionToTheRight(HeaderElement.java:151)
    at org.broadinstitute.sting.gatk.walkers.compression.reducereads.SlidingWindow.updateHeaderCounts(SlidingWindow.java:881)
    at org.broadinstitute.sting.gatk.walkers.compression.reducereads.SlidingWindow.removeFromHeader(SlidingWindow.java:816)
    at org.broadinstitute.sting.gatk.walkers.compression.reducereads.SlidingWindow.compressVariantRegion(SlidingWindow.java:604)
    at org.broadinstitute.sting.gatk.walkers.compression.reducereads.SlidingWindow.closeVariantRegion(SlidingWindow.java:623)
    at org.broadinstitute.sting.gatk.walkers.compression.reducereads.SlidingWindow.closeVariantRegions(SlidingWindow.java:643)
    at org.broadinstitute.sting.gatk.walkers.compression.reducereads.SingleSampleCompressor.closeVariantRegions(SingleSampleCompressor.java:83)
    at org.broadinstitute.sting.gatk.walkers.compression.reducereads.MultiSampleCompressor.closeVariantRegionsInAllSamples(MultiSampleCompressor.java:94)
    at org.broadinstitute.sting.gatk.walkers.compression.reducereads.MultiSampleCompressor.addAlignment(MultiSampleCompressor.java:76)
    at org.broadinstitute.sting.gatk.walkers.compression.reducereads.ReduceReadsStash.compress(ReduceReadsStash.java:67)
    at org.broadinstitute.sting.gatk.walkers.compression.reducereads.ReduceReads.reduce(ReduceReads.java:387)
    at org.broadinstitute.sting.gatk.walkers.compression.reducereads.ReduceReads.reduce(ReduceReads.java:87)
    at org.broadinstitute.sting.gatk.traversals.TraverseReadsNano$TraverseReadsReduce.apply(TraverseReadsNano.java:226)
    at org.broadinstitute.sting.gatk.traversals.TraverseReadsNano$TraverseReadsReduce.apply(TraverseReadsNano.java:215)
    at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:254)
    at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:219)
    at org.broadinstitute.sting.gatk.traversals.TraverseReadsNano.traverse(TraverseReadsNano.java:91)
    at org.broadinstitute.sting.gatk.traversals.TraverseReadsNano.traverse(TraverseReadsNano.java:55)
    at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:83)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:281)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:237)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:147)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:94)
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A GATK RUNTIME ERROR has occurred (version 2.3-0-g9593e74):
##### ERROR
##### ERROR Please visit the wiki to see if this is a known problem
##### ERROR If not, please post the error, with stack trace, to the GATK forum
##### ERROR Visit our website and forum for extensive documentation and answers to 
##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
##### ERROR
##### ERROR MESSAGE: Removed too many insertions, header is now negative!
##### ERROR ------------------------------------------------------------------------------------------

(There is no progress listed here because this log is from after I bisected to find a narrow region where the problem is occurring.)
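The bisection mentioned above can be sketched as a binary search over a genomic interval. This is only an illustration, not the poster's actual procedure: `fails` is a hypothetical callback that would run ReduceReads restricted to `contig:start-end` and report whether the exception occurred, and the sketch assumes a single locus is responsible, so that any sub-interval containing it also fails.

```python
from typing import Callable, Tuple


def bisect_failing_region(
    start: int,
    end: int,
    fails: Callable[[int, int], bool],  # hypothetical: True if the run on [start, end] crashes
    min_size: int = 1000,
) -> Tuple[int, int]:
    """Narrow [start, end] (1-based, inclusive) to a small interval that still fails."""
    if not fails(start, end):
        raise ValueError("the full interval does not reproduce the failure")
    while end - start + 1 > min_size:
        mid = (start + end) // 2
        if fails(start, mid):
            end = mid            # the failure is reproducible in the left half
        elif fails(mid + 1, end):
            start = mid + 1      # the failure is reproducible in the right half
        else:
            # Neither half fails alone, e.g. the failure needs reads that
            # span the midpoint; keep the current interval and stop.
            break
    return start, end
```

Each call to `fails` costs one tool run, so the number of runs grows with the logarithm of the interval length rather than with the length itself.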

I am using UnifiedGenotyper to call SNPs in certain regions from custom capture data. I previously had the pipeline working, but now I am trying with files that have been reduced using ReduceReads, and also changed to a newer version. I have many bam files, but I also get the error when I try with just two. See below for my script and the error message.

Many thanks.

java -Xmx20g -jar GenomeAnalysisTK.jar -T UnifiedGenotyper \
    -R human_g1k_v37.fasta \
    -B:dbsnp,vcf dbsnp_132.b37.vcf \
    -L baitgroupfile.picard \
    -I file1.reduced.bam \
    -I file2.reduced.bam \
    -o out.vcf \
    -stand_call_conf 50.0 \
    -stand_emit_conf 10.0 \
    -G Standard \
    -metrics out.metrics

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

net.sf.samtools.SAMFormatException: Unrecognized tag type: B
    at net.sf.samtools.BinaryTagCodec.readValue(BinaryTagCodec.java:270)
    at net.sf.samtools.BinaryTagCodec.readTags(BinaryTagCodec.java:220)
    at net.sf.samtools.BAMRecord.decodeAttributes(BAMRecord.java:302)
    at net.sf.samtools.BAMRecord.getAttribute(BAMRecord.java:282)
    at net.sf.samtools.SAMRecord.getAttribute(SAMRecord.java:830)
    at net.sf.picard.sam.MergingSamRecordIterator.next(MergingSamRecordIterator.java:132)
    at net.sf.picard.sam.MergingSamRecordIterator.next(MergingSamRecordIterator.java:39)
    at org.broadinstitute.sting.gatk.iterators.PrivateStringSAMCloseableIterator.next(StingSAMIteratorAdapter.java:100)
    at org.broadinstitute.sting.gatk.iterators.PrivateStringSAMCloseableIterator.next(StingSAMIteratorAdapter.java:84)
    at org.broadinstitute.sting.gatk.datasources.simpleDataSources.SAMDataSource$ReleasingIterator.next(SAMDataSource.java:803)
    at org.broadinstitute.sting.gatk.datasources.simpleDataSources.SAMDataSource$ReleasingIterator.next(SAMDataSource.java:769)
    at org.broadinstitute.sting.gatk.iterators.ReadFormattingIterator.next(ReadFormattingIterator.java:77)
    at org.broadinstitute.sting.gatk.iterators.ReadFormattingIterator.next(ReadFormattingIterator.java:19)
    at org.broadinstitute.sting.gatk.filters.CountingFilteringIterator.getNextRecord(CountingFilteringIterator.java:106)
    at org.broadinstitute.sting.gatk.filters.CountingFilteringIterator.<init>(CountingFilteringIterator.java:59)
    at org.broadinstitute.sting.gatk.datasources.simpleDataSources.SAMDataSource.applyDecoratingIterators(SAMDataSource.java:598)
    at org.broadinstitute.sting.gatk.datasources.simpleDataSources.SAMDataSource.getIterator(SAMDataSource.java:517)
    at org.broadinstitute.sting.gatk.datasources.simpleDataSources.SAMDataSource.seek(SAMDataSource.java:462)
    at org.broadinstitute.sting.gatk.executive.MicroScheduler.getReadIterator(MicroScheduler.java:150)
    at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:56)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:209)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:109)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:239)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:87)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 1.0.4905):
ERROR
ERROR Please visit to wiki to see if this is a known problem
ERROR If not, please post the error, with stack trace, to the GATK forum
ERROR Visit our wiki for extensive documentation http://www.broadinstitute.org/gsa/wiki
ERROR Visit our forum to view answers to commonly asked questions http://getsatisfaction.com/gsa
ERROR
ERROR MESSAGE: Unrecognized tag type: B
ERROR ------------------------------------------------------------------------------------------

Hello,

I'm trying to call variants using UnifiedGenotyper on ca. 450 reduced BAMs in 100000 bp chunks. It works fine for some of the chunks, but for others I get the following error message:

ERROR MESSAGE: SAM/BAM nameOfBam.bam is malformed: Unexpected compressed block length: 1

Can anyone explain why there is a problem with a specific BAM file when I call on, for example, chunk chr20:25400000-25500000, but not when I call on chunk chr20:10000000-10100000?

Thank you, Tota
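For readers unfamiliar with chunked calling, fixed-size intervals like the ones quoted above can be generated as follows. This is a minimal sketch, not the poster's script: the function name is hypothetical, the chunks are assumed to be non-overlapping and 1-based inclusive, and the contig length has to be supplied by the caller (for example from the reference's sequence dictionary).

```python
from typing import Iterator


def chunk_intervals(contig: str, length: int, size: int = 100_000) -> Iterator[str]:
    """Yield non-overlapping 1-based inclusive intervals covering one contig."""
    if length < 1 or size < 1:
        raise ValueError("length and size must be positive")
    for start in range(1, length + 1, size):
        # The last chunk is clipped to the end of the contig.
        end = min(start + size - 1, length)
        yield f"{contig}:{start}-{end}"


# Example with a made-up contig length of 250,000 bp.
for interval in chunk_intervals("chr20", 250_000):
    print(interval)
```

Each string is in the `contig:start-end` form used in the post, so one job can be launched per interval. When only some chunks fail on the same BAM, rerunning just those intervals is a cheap way to check whether the failure follows a particular region of the file.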

We are trying to see whether ReduceReads will help with the overwhelming file sizes in the SNP calling we are doing on whole-genome BAM files. We have been using a protocol similar to the one described in the Best Practices document: "Best: multi-sample realignment with known sites and recalibration". My question is: what is the best point in the pipeline to run ReduceReads?

Hi all, I am trying to use the new ReduceReads feature and I get an error every time. Can anyone tell me what the problem is? BTW, I am working on a yeast genome, not human, if that matters.

INFO 14:21:07,687 HelpFormatter - ---------------------------------------------------------------------------------
INFO 14:21:07,688 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.0-36-gf5c1c1a, Compiled 2012/08/08 20:17:07
INFO 14:21:07,688 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 14:21:07,688 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 14:21:07,689 HelpFormatter - Program Args: -R /home/mps/references/SK1_v2/fasta/SK1_v2.fixed.fa -T ReduceReads -I output.marked.realigned.fixed.recal.bam -o output.marked.realigned.fixed.recal.reduced.bam -l INFO
INFO 14:21:07,689 HelpFormatter - Date/Time: 2012/08/09 14:21:07
INFO 14:21:07,689 HelpFormatter - ---------------------------------------------------------------------------------
INFO 14:21:07,690 HelpFormatter - ---------------------------------------------------------------------------------
INFO 14:21:07,759 GenomeAnalysisEngine - Strictness is SILENT
INFO 14:21:07,791 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 14:21:07,804 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.01
INFO 14:21:08,076 TraversalEngine - [INITIALIZATION COMPLETE; TRAVERSAL STARTING]
INFO 14:21:08,076 TraversalEngine - Location processed.reads runtime per.1M.reads completed total.runtime remaining
INFO 14:21:38,548 TraversalEngine - SK1.chr01:63354 3.90e+04 30.5 s 13.0 m 0.5% 98.2 m 97.7 m
INFO 14:22:08,706 TraversalEngine - SK1.chr01:79167 5.20e+04 60.6 s 19.4 m 0.6% 2.6 h 2.6 h
INFO 14:22:38,976 TraversalEngine - SK1.chr01:98653 6.90e+04 90.9 s 22.0 m 0.8% 3.1 h 3.1 h
INFO 14:23:10,903 TraversalEngine - SK1.chr01:114413 8.20e+04 2.0 m 25.0 m 0.9% 3.7 h 3.6 h
INFO 14:23:43,523 TraversalEngine - SK1.chr01:125477 9.20e+04 2.6 m 28.2 m 1.0% 4.2 h 4.2 h
INFO 14:24:15,215 TraversalEngine - SK1.chr01:145667 1.09e+05 3.1 m 28.6 m 1.2% 4.4 h 4.3 h
INFO 14:24:45,785 TraversalEngine - SK1.chr01:163339 1.23e+05 3.6 m 29.5 m 1.3% 4.5 h 4.5 h
INFO 14:25:17,660 TraversalEngine - SK1.chr01:179555 1.46e+05 4.2 m 28.5 m 1.5% 4.7 h 4.7 h
INFO 14:25:49,088 TraversalEngine - SK1.chr01:213605 1.71e+05 4.7 m 27.4 m 1.7% 4.5 h 4.4 h
INFO 14:25:51,716 GATKRunReport - Uploaded run statistics report to AWS S3

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.ArithmeticException: / by zero
    at org.broadinstitute.sting.gatk.walkers.compression.reducereads.SlidingWindow.downsampleVariantRegion(SlidingWindow.java:539)
    at org.broadinstitute.sting.gatk.walkers.compression.reducereads.SlidingWindow.closeVariantRegion(SlidingWindow.java:498)
    at org.broadinstitute.sting.gatk.walkers.compression.reducereads.SlidingWindow.closeVariantRegions(SlidingWindow.java:520)
    at org.broadinstitute.sting.gatk.walkers.compression.reducereads.SlidingWindow.close(SlidingWindow.java:562)
    at org.broadinstitute.sting.gatk.walkers.compression.reducereads.SingleSampleCompressor.addAlignment(SingleSampleCompressor.java:64)
    at org.broadinstitute.sting.gatk.walkers.compression.reducereads.MultiSampleCompressor.addAlignment(MultiSampleCompressor.java:70)
    at org.broadinstitute.sting.gatk.walkers.compression.reducereads.ReduceReadsStash.compress(ReduceReadsStash.java:67)
    at org.broadinstitute.sting.gatk.walkers.compression.reducereads.ReduceReads.reduce(ReduceReads.java:344)
    at org.broadinstitute.sting.gatk.walkers.compression.reducereads.ReduceReads.reduce(ReduceReads.java:83)
    at org.broadinstitute.sting.gatk.traversals.TraverseReads.traverse(TraverseReads.java:107)
    at org.broadinstitute.sting.gatk.traversals.TraverseReads.traverse(TraverseReads.java:52)
    at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:71)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:269)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:236)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:146)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:93)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 2.0-36-gf5c1c1a):
ERROR
ERROR Please visit the wiki to see if this is a known problem
ERROR If not, please post the error, with stack trace, to the GATK forum
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: / by zero
ERROR ------------------------------------------------------------------------------------------