Tagged with #mutect
0 documentation articles | 3 announcements | 41 forum discussions



Created 2015-11-25 07:37:00 | Updated 2015-11-25 14:21:18 | Tags: haplotypecaller release mutect version-highlights topstory mutect2
Comments (15)

The last GATK 3.x release of the year 2015 has arrived!

The major feature in GATK 3.5 is the eagerly awaited MuTect2 (beta version), which brings somatic SNP and Indel calling to GATK. This is just the beginning of GATK’s scope expansion into the somatic variant domain, so expect some exciting news about copy number variation in the next few weeks! Meanwhile, more on MuTect2 awesomeness below.

In addition, we’ve got all sorts of variant context annotation-related treats for you in the 3.5 goodie bag -- both new annotations and new capabilities for existing annotations, listed below.

In the variant manipulation space, we enhanced or fixed functionality in several tools including LeftAlignAndTrimVariants, FastaAlternateReferenceMaker and VariantEval modules. And in the variant calling/genotyping space, we’ve made some performance improvements across the board to HaplotypeCaller and GenotypeGVCFs (mostly by cutting out crud and making the code more efficient) including a few improvements specifically for haploids. Read the detailed release notes for more on these changes. Note that GenotypeGVCFs will now emit no-calls at sites where RGQ=0 in acknowledgment of the fact that those sites are essentially uncallable.

We’ve got good news for you if you’re the type who worries about disk space (whether by temperament or by necessity): we finally have CRAM support -- and some recommendations for keeping the output of BQSR down to reasonable file sizes, detailed below.

Finally, be sure to check out the detailed release notes for the usual variety show of minor features (including a new Queue job runner that enables local parallelism), bug fixes and deprecation notices (a few tools have been removed from the codebase, in the spirit of slimming down ahead of the holiday season).


Introducing MuTect2 (beta): calling somatic SNPs and Indels natively in GATK

MuTect2 is the next-generation somatic SNP and indel caller that combines the DREAM challenge-winning somatic genotyping engine of the original MuTect with the assembly-based machinery of HaplotypeCaller.

The original MuTect (Cibulskis et al., 2013) was built on top of the GATK engine by the Cancer Genome Analysis group at the Broad Institute, and was distributed as a separate package. By all accounts it did a great job calling somatic SNPs, and was part of the winning entries for multiple DREAM challenges (including some submitted by groups outside the Broad). However it was not able to call indels; and the less said about the indel caller that accompanied it (first named SomaticIndelDetector then Indelocator) the better.

This new incarnation of MuTect leverages much of the HaplotypeCaller’s internal machinery (including the all-important graph assembly bit) to call both SNPs and indels together. Yet it retains key parts of the original MuTect’s internal genotyping engine that allow it to model somatic variation appropriately. This is a major differentiation point compared to HaplotypeCaller, which has expectations about ploidy and allele frequencies that make it unsuitable for calling somatic variants.

As a convenience add-on to MuTect2, we also integrated the cross-sample contamination estimation tool ContEst into GATK 3.5. Note that while the previous public version of this tool relied on genotyping chip data for its operation, this version of the tool has been upgraded to enable on-the-fly genotyping for the case where genotyping data is not available. Documentation of this feature will be provided in the near future. Both MuTect2 and ContEst are now featured in the Tool Documentation section of the Guide. Stay tuned for pipeline-level documentation on performing somatic variant discovery, to be added to the Best Practices docs in the near future.

Please note that this release of MuTect2 is a beta version intended for research purposes only and should not be applied in production/clinical work. MuTect2 has not yet undergone the same degree of scrutiny and validation as the original MuTect since it is so new. Early validation results suggest that MuTect2 has a tendency to generate more false positives as compared to the original MuTect; for example, it seems to overcall somatic mutations at low allele frequencies, so for now we recommend applying post-processing filters, e.g. by hard-filtering calls with low minor allele frequencies. Rest assured that data is being generated and the tools are being improved as we speak. We’re also looking forward to feedback from you, the user community, to help us make it better faster.
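
As an illustration of the kind of post-processing filter mentioned above, here is a minimal Python sketch that drops calls whose alternate allele fraction falls below a cutoff. Everything in it is an assumption for illustration: the record layout (`ref_count`, `alt_count`), the function names and the 0.05 cutoff are hypothetical, and it does not read or write VCF files.

```python
def allele_fraction(ref_count, alt_count):
    """Fraction of reads supporting the alternate allele (0.0 if no reads)."""
    depth = ref_count + alt_count
    return alt_count / depth if depth else 0.0


def hard_filter(calls, min_af=0.05):
    """Keep only calls whose alt allele fraction is at least min_af.

    `calls` is a list of dicts with hypothetical keys ref_count/alt_count;
    the 0.05 default is an arbitrary placeholder, not a recommended value.
    """
    return [
        call for call in calls
        if allele_fraction(call["ref_count"], call["alt_count"]) >= min_af
    ]


calls = [
    {"id": "low_af_site", "ref_count": 98, "alt_count": 2},
    {"id": "clear_site", "ref_count": 60, "alt_count": 40},
]
kept = hard_filter(calls)
```

An appropriate cutoff would depend on coverage and tumor purity, so it is something to tune on your own data.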

Finally, note also that MuTect2 is distributed under the same restricted license as the original MuTect; for-profit users are required to seek a license to use it (please email softwarelicensing@broadinstitute.org). To be clear, while MuTect2 is released as part of GATK, the commercial licensing has not been consolidated under a single license. Therefore, current holders of a GATK license will still need to contact our licensing office if they wish to use MuTect2.


Annotate this: new and improved variant context annotations

Whew that was a long wall of text on MuTect2, wasn’t it. Let’s talk about something else now. Annotations! Not functional annotations, mind you -- we’re not talking about e.g. predicting synonymous vs. non-synonymous mutations here. I mean variant context annotations, i.e. all those statistics calculated during the variant calling process which we mostly use to estimate how confident we are that the variants are real vs. artifacts (for filtering and related purposes).

So we have two new annotations, BaseCountsBySample (what it says on the can) and ExcessHet (for excess heterozygosity, i.e. the number of heterozygote calls made in excess of the Hardy-Weinberg expectations), as well as a set of new annotations that are allele-specific versions of existing annotations (with AS_ prefix standing for Allele-Specific) which you can browse here. Right now we’re simply experimenting with these allele-specific annotations to determine what would be the best way to make use of them to improve variant filtering. In the meantime, feel free to play around with them (via e.g. VariantsToTable) and let us know if you come up with any interesting observations. Crowdsourcing is all the rage, let’s see if it gets us anywhere on this one!
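
To make the idea behind ExcessHet concrete, here is a minimal Python sketch of the quantity described above: the observed heterozygote count minus the count expected under Hardy-Weinberg equilibrium at a biallelic site. This is not GATK's implementation; the function names are hypothetical, and the value the annotation actually reports may be computed and scaled differently.

```python
def expected_het_count(n_hom_ref, n_het, n_hom_var):
    """Expected number of heterozygotes under Hardy-Weinberg equilibrium."""
    n = n_hom_ref + n_het + n_hom_var
    if n == 0:
        raise ValueError("at least one called genotype is required")
    # Reference allele frequency estimated from the genotype counts.
    p = (2 * n_hom_ref + n_het) / (2 * n)
    return 2 * p * (1 - p) * n


def het_excess(n_hom_ref, n_het, n_hom_var):
    """Observed minus expected heterozygote count (positive means excess)."""
    return n_het - expected_het_count(n_hom_ref, n_het, n_hom_var)
```

For example, 10 hom-ref, 80 het and 10 hom-var calls give an allele frequency of 0.5 and an expectation of 50 hets, so the excess is 30.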

We also made some improvements to the StrandAlleleCountsBySample annotation, to how VQSR handles MQ, and to how VariantAnnotator makes use of external resources -- and we fixed that annoying bug where default annotations were getting dropped. All of which you can read about in the detailed release notes.


These Three Awesome File Hacks Will Restore Your Faith In Humanity’s Ability To Free Up Some Disk Space

CRAM support! Long-awaited by many, lovingly implemented by Vadim Zalunin at EBI and colleagues at the Sanger Institute. We haven’t done extensive testing, and there are a few tickets for improvements that are planned at the htsjdk level -- but it works well enough that we’re comfortable releasing it under a beta designation. Meaning have fun with it, but do your own thorough testing before putting it into production or throwing out your old BAMs!

Static binning of base quality scores. In a nutshell, binning (or quantizing) the base qualities in a BAM file means that instead of recording all possible quality values separately, we group them into bins represented by a single value (by default, 10, 20, 30 or 40). By doing this we end up having to record fewer separate numbers, which through the magic of BAM compression yields substantially smaller files. The idea is that we don’t actually need to be able to differentiate between quality scores at a very high resolution -- if the binning scheme is set up appropriately, it doesn’t make any difference to the variant discovery process downstream. This is not a new concept, but now the GATK engine has an argument to enable binning quality scores during the base recalibration (BQSR) process using a static binning scheme that we have determined produces optimal results in our hands. The level of compression is of course adjustable if you’d like to set your own tradeoff between compression and base quality resolution. We have validated that this type of binning (with our chosen default parameters) does not have any noticeable adverse effect on germline variant discovery. However we are still looking into some possible effects on somatic variant discovery, so we can’t yet recommend binning for that application.
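
The binning idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it maps each quality to the nearest of the bin values 10, 20, 30 and 40 mentioned above and breaks ties toward the lower bin. The rounding rule, the tie-breaking and the function names are assumptions, not a description of what the GATK engine does internally.

```python
def bin_quality(qual, bins=(10, 20, 30, 40)):
    """Map one base quality to the nearest bin value (ties go to the lower bin)."""
    return min(bins, key=lambda b: (abs(b - qual), b))


def bin_read_quals(quals, bins=(10, 20, 30, 40)):
    """Quantize every base quality in a read."""
    return [bin_quality(q, bins) for q in quals]


original = [2, 14, 15, 26, 33, 41]
binned = bin_read_quals(original)
```

Here six distinct input values collapse to three distinct output values, which is what makes the quality string more compressible.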

Disable indel quality scores. The Base Recalibration process produces indel quality scores in addition to the regular base qualities. They are stored in the BI and BD tags of the read records, taking up a substantial amount of space in the resulting BAM files. There has been a lot of discussion about whether these indel quals are worth the file size inflation. Well, we’ve done a lot of testing and we’ve now decided that no, for most use cases the indel quals don’t make enough of a difference to justify the extra file size. The one exception is PacBio data, where it seems that indel quals may help model the indel-related errors of that technology. But for the rest, we’re now comfortable recommending the use of the --disable_indel_quals argument when writing out the recalibrated BAM file with PrintReads.


Created 2015-11-25 07:10:45 | Updated 2016-01-27 17:38:33 | Tags: Promote haplotypecaller release-notes mutect gatk3 mutect2
Comments (6)

GATK 3.5 was released on November 25, 2015. Itemized changes are listed below. For more details, see the user-friendly version highlights.


New tools

  • MuTect2: somatic SNP and indel caller based on HaplotypeCaller and the original MuTect.
  • ContEst: estimation of cross-sample contamination (primarily for use in somatic variant discovery).
  • GatherBqsrReports: utility to gather recalibration tables from scatter-parallelized BaseRecalibrator runs.

Variant Context Annotations

  • Added allele-specific version of existing annotations: AS_BaseQualityRankSumTest, AS_FisherStrand, AS_MappingQualityRankSumTest, AS_RMSMappingQuality, AS_RankSumTest, AS_ReadPosRankSumTest, AS_StrandOddsRatio, AS_QualByDepth and AS_InbreedingCoeff.

  • Added BaseCountsBySample annotation. Intended to provide insight into the pileup of bases used by HaplotypeCaller in the calling process, which may differ from the pileup observed in the original bam file because of the local realignment and additional filtering performed internally by HaplotypeCaller. Can only be requested from HaplotypeCaller, not VariantAnnotator.

  • Added ExcessHet annotation. Estimates excess heterozygosity in a population of samples. Related to but distinct from InbreedingCoeff, which estimates evidence for inbreeding in a population. ExcessHet scales more reliably to large cohort sizes.

  • Added FractionInformativeReads annotation. Reports the fraction of reads that were considered informative by HaplotypeCaller (over all samples).

  • Enforced calculating GenotypeAnnotations before InfoFieldAnnotations. This ensures that the AD value is available to use in the QD calculation.

  • Reorganized standard annotation groups processing to ensure that all default annotations always get annotated regardless of what is specified on the command line. This fixes a bug where default annotations were getting dropped when the command line included annotation requests.

  • Made GenotypeGVCFs subset StrandAlleleCounts intelligently, i.e. subset the SAC values to the called alleles. Previously, when the StrandAlleleCountsBySample (SAC) annotation was present in GVCFs, GenotypeGVCFs carried it over to the final VCF essentially unchanged. This was problematic because SAC includes the counts for all alleles originally present (including NON-REF) even when some are not called in the final VCF. When the full list of original alleles is no longer available, parsing SAC could become difficult if not impossible.

  • Added new MQ jittering functionality to improve how VQSR handles MQ. Note that HaplotypeCaller now calculates a new annotation called RAW_MQ per-sample, which is then integrated per-cohort by GenotypeGVCFs to produce the MQ annotation.

  • VariantAnnotator can now annotate FILTER field from an external resource. Usage: --resource:foo resource.vcf --expression foo.FILTER

  • VariantAnnotator can now check allele concordance when annotating with an external resource. Usage: --resourceAlleleConcordance

  • Bug fix: The annotation framework was improved to allow for the collection of sufficient statistics during GVCF creation which are then used to compute the final annotation during the genotyping. This avoids the use of median as the representative annotation from the collection of values (one from each sample). TL;DR annotations will be more accurate when using the GVCF workflow for joint discovery.
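
The SAC subsetting described in the list above can be sketched as follows. This is a hedged illustration, not the GenotypeGVCFs code: it assumes SAC stores a forward and a reverse read count for each allele, in allele order, and the function name is hypothetical.

```python
def subset_sac(sac, original_alleles, called_alleles):
    """Keep only the strand counts that belong to the called alleles.

    Assumes `sac` holds [fwd, rev] pairs in the order of `original_alleles`.
    """
    if len(sac) != 2 * len(original_alleles):
        raise ValueError("expected two strand counts per original allele")
    subset = []
    for allele in called_alleles:
        i = original_alleles.index(allele)
        subset.extend(sac[2 * i:2 * i + 2])
    return subset


# Ref A, alt T, plus the symbolic non-reference allele from the GVCF.
sac = [18, 10, 3, 2, 1, 0]
final_sac = subset_sac(sac, ["A", "T", "<NON_REF>"], ["A", "T"])
```

After subsetting, the SAC values line up one-to-one with the alleles in the final record, so they can be parsed without the original allele list.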

Variant manipulation tools

  • Allowed overriding hard-coded cutoff for allele length in ValidateVariants and in LeftAlignAndTrimVariants. Usage: --reference_window_stop N where N is the desired cutoff.

  • Also in LeftAlignAndTrimVariants, trimming multiallelic alleles is now the default behavior.

  • Fixed ability to mask out snps with --snpmask in FastaAlternateReferenceMaker.

  • Also in FastaAlternateReferenceMaker, fixed the merging of contiguous intervals, and made the tool produce more informative contig names.

  • Fixed a bug in CombineVariants that occurred when one record has a spanning deletion and needs a padded reference allele.

  • Added a new VariantEval evaluation module, MetricsCollection, that summarizes metrics from several EV modules.

  • Enabled family-level stratification in MendelianViolationEvaluator of VariantEval (if a ped file is provided), making it possible to count Mendelian violations for each family in a callset with multiple families.

  • Added the ability for SelectVariants to enforce output that complies with version 4.2 of the VCF spec when processing older files. Use case: the 4.2 spec specifies that GQ must be an integer; by default we don’t enforce it (so if reading an older file that used decimals, we don’t change it) but the new argument --forceValidOutput converts the values on request. Not made default because of some performance slowdown -- so writing VCFs is now fast by default, compliant by choice.

GVCF tools

  • Various improvements to the tools’ performance, especially HaplotypeCaller, by making the code more efficient and cutting out crud.

  • GenotypeGVCFs now emits a no-call (./.) when the evidence is too ambiguous to make a call at all (e.g. all the PLs are zero). Previously this would have led to a hom-ref call with RGQ=0.

  • Fixed a bug in GenotypeGVCFs that sometimes generated invalid VCFs for haploid callsets. The tool was carrying over the AD from alleles that had been trimmed out, causing field length mismatches.

  • Changed the genotyping implementation for haploid organisms to address performance problems reported when running GenotypeGVCFs on haploid callsets. Note that this change may lead to a slight loss of sensitivity at low-coverage sites -- let us know if you observe anything dramatic.
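
The new no-call behavior can be pictured with a small Python sketch. It is a simplification under stated assumptions: a diploid, biallelic site with PLs ordered 0/0, 0/1, 1/1, and only the "all PLs are zero" example from the list above is treated as uncallable. The function name is hypothetical and the real tool's criteria may be broader.

```python
def call_genotype(pls, genotypes=("0/0", "0/1", "1/1")):
    """Pick the genotype with the lowest PL, or emit a no-call if all PLs are 0."""
    if len(pls) != len(genotypes):
        raise ValueError("expected one PL per candidate genotype")
    if all(pl == 0 for pl in pls):
        # No genotype is favored over any other: report a no-call.
        return "./."
    return genotypes[pls.index(min(pls))]
```

With PLs of 0,0,0 this returns ./. where a naive "first minimum wins" rule would have returned 0/0.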

Genotyping engine tweaks

  • Ensured inputPriors get used if they are specified to the genotyper (previously they were ignored). Also improved docs on --heterozygosity and --indel_heterozygosity priors.

  • Fixed bug that affected the --ignoreInputSamples behavior of CalculateGenotypePosteriors.

  • Limited emission of the scary warning message about max number of alleles (“this tool is set to genotype at most x alleles but we found more; only x will be used”) to a single occurrence unless DEBUG logging mode is activated. Otherwise it fills up our output logs.

Miscellaneous tool fixes

  • Added option to OverclippedReadFilter to not require soft-clips on both ends. Contributed by Jacob Silterra.

  • Fixed a bug in IndelRealigner where the tool was incorrectly "fixing" mates when supplementary alignments are present. The patch involves ignoring supplementary alignments.

  • Fixed a bug in CatVariants. Previously, VCF files were being sorted solely on the base pair position of the first record, ignoring the chromosome. This can become problematic when merging files from different chromosomes, especially if you have multiple VCFs per chromosome. Contributed by John Wallace.
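
To see why sorting on position alone is a problem, here is a minimal Python sketch that compares the two orderings. The file names, the record layout and the function names are hypothetical, and the "fixed" ordering (contig rank from a reference contig list, then position) is an assumption about the intended behavior rather than the actual CatVariants code.

```python
def order_by_position_only(files):
    """Old behavior as described: sort on the first record's position only."""
    return sorted(files, key=lambda f: f["pos"])


def order_by_contig_then_position(files, contig_order):
    """Sort on (contig rank, position) of each file's first record."""
    rank = {contig: i for i, contig in enumerate(contig_order)}
    return sorted(files, key=lambda f: (rank[f["contig"]], f["pos"]))


files = [
    {"name": "chr2_a.vcf", "contig": "2", "pos": 100},
    {"name": "chr1_b.vcf", "contig": "1", "pos": 5000},
    {"name": "chr1_a.vcf", "contig": "1", "pos": 200},
]
old = [f["name"] for f in order_by_position_only(files)]
new = [f["name"] for f in order_by_contig_then_position(files, ["1", "2"])]
```

The position-only ordering interleaves the chromosome 2 file ahead of both chromosome 1 files, while the contig-aware ordering keeps each chromosome's files together.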

Engine-level behaviors and capabilities

  • Support for reading and writing CRAM files. Some improvements are still expected in htsjdk. Contributed by Vadim Zalunin at EBI and collaborators at the Sanger Institute.

  • Made interval-list output format dependent on the file extension (for RealignerTargetCreator). If the extension is .interval_list, output will be formatted as a proper Picard interval list (with sequence dictionary). Otherwise it will be a basic GATK interval list as previously.

  • Added static binning capability for base recalibration (BQSR).
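
The extension-dependent interval output can be sketched like this. It is a simplified illustration: the function name is hypothetical, the Picard-style branch emits only @SQ header lines (a real Picard interval list header carries more fields), and the strand and name columns are filled with placeholder values.

```python
def format_intervals(intervals, out_name, sequence_dict):
    """Return output lines in Picard or basic GATK style, chosen by extension.

    intervals: list of (contig, start, end) tuples
    sequence_dict: mapping of contig name to contig length
    """
    if out_name.endswith(".interval_list"):
        # Picard style: sequence dictionary header, then tab-separated records.
        header = [f"@SQ\tSN:{name}\tLN:{length}"
                  for name, length in sequence_dict.items()]
        body = [f"{contig}\t{start}\t{end}\t+\t."
                for contig, start, end in intervals]
        return header + body
    # Basic GATK style: one contig:start-end string per line.
    return [f"{contig}:{start}-{end}" for contig, start, end in intervals]
```

Calling it with an output name of targets.intervals gives plain 1:100-200 style lines, while targets.interval_list adds the header and switches to tab-separated columns.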

Queue

  • Added a new JobRunner called ParallelShell that will run jobs locally on one node concurrently as specified by the DAG, with the option to limit the maximum number of concurrently running jobs using the flag maximumNumberOfJobsToRunConcurrently. Contributed by Johan Dahlberg.

  • Updated extension for Picard CalculateHsMetrics to include PER_TARGET_COVERAGE argument and added extension for Picard CollectWgsMetrics.

Deprecation notice

Removed:

  • BeagleOutputToVCF, VariantsToBeagleUnphased, ProduceBeagleInput. These are tools for handling Beagle data. The latest versions of Beagle support VCF input and output, so there is no longer any reason for us to provide converters.
  • ReadAdaptorTrimmer and VariantValidationAssessor. These were experimental tools which we think are not useful and not operating on a sufficiently sound basis.
  • BaseCoverageDistribution and CoveredByNSamplesSites. These tools were redundant with DiagnoseTargets and/or DepthOfCoverage.
  • LiftOverVariants, FilterLiftedVariants and liftOverVCF.pl. The Picard liftover tool LiftoverVCF works better and is easier to operate.
  • sortByRef.pl. Use Picard SortVCF instead.
  • ListAnnotations. This was intended as a utility for listing annotations easily from command line, but it has not proved useful.

Meta

  • Made various documentation improvements.
  • Updated date and street address in license text.
  • Moved htsjdk & picard to version 1.141

Created 2013-10-01 00:55:10 | Updated 2013-10-01 14:24:41 | Tags: mutect appistry webinar cancer
Comments (0)

Our partner Appistry (who distribute GATK and MuTect to commercial users) will be holding a webinar on 3 October. Registration is open to all; you can find more details on the Appistry website here:

http://www.appistry.com/news-and-events/appistry-webinar-which-mutations-matter


Created 2016-01-30 14:07:10 | Updated | Tags: mutect mutect2 germline
Comments (1)

Hello all, Being relatively new to NGS analysis techniques, I recently built a pipeline for calling somatic variants in matched Tumor/Normal cancer Exome data.

I recently decided to analyze the "germline risk variants" as well. Currently I'm calling germline variants separately using HaplotypeCaller (HC), then hard-filtering using the VariantFilter tool (since I only have a few examples at the moment), and annotating the variants with Oncotator.

However, I realized that the Mutect2-Oncotator output has a column titled "germline_risk". Would this serve the same purpose? If so, would this mean the separate germline variant calling is unnecessary?

Thanks beforehand for the answers, Best, -E


Created 2016-01-18 08:20:49 | Updated | Tags: mutect short-read-preprocessing
Comments (3)

Hi, I am using MuTect on paired tumor/normal exome data. MuTect pre-processes low-quality reads before somatic SNV discovery. How can I get the processed BAM files? I know the MuTect source code is on GitHub, but I am not sure which Java files are used in the pre-processing steps. Can you please tell me which source files handle the pre-processing of low-quality reads? Thanks very much for your time!

Best Jing


Created 2016-01-05 20:37:22 | Updated | Tags: mutect genotype vcf-file
Comments (1)

Hello,

I have noticed that when running MuTect, the variant calls in my output always have a genotype of 0/1. I have not seen any 1/1 genotypes, even when the variant allele frequency is 100%.

Are there any instances when MuTect gives a 1/1 genotype for a variant call?

Thank you, Jeremy


Created 2015-12-11 09:52:54 | Updated | Tags: mutect
Comments (1)

Hello, I would like to replace MuTect with MuTect2 in my analysis pipelines, but I need the information available in the extended text output from MuTect. Is there an option in MuTect2 to output the same information?

thanks


Created 2015-11-25 00:07:21 | Updated | Tags: mutect
Comments (3)

MuTect outputs only single point mutations. What about t_ins_count and t_del_count? Can I use them to identify indels in a sample?

t_ins_count: count of insertion events at this locus in tumor
t_del_count: count of deletion events at this locus in tumor


Created 2015-11-20 15:45:01 | Updated | Tags: mutect somatic-variants tumor-only
Comments (2)

Hi,

I recently went to the workshop for variant calling and mentioned that I would like to perform somatic variant calling with Mutect using only tumor samples (no matched normal sample). I was told that there is a pipeline under development that is not yet fully tested that you would be able to provide. Would you be able to provide this along with any other recommendations?

Thank you!


Created 2015-11-12 22:35:00 | Updated | Tags: mutect perl
Comments (2)

Before variant calling, MuTect first removes low-quality reads; please see the short read pre-processing section at nature.com/nbt/journal/v31/n3/extref/nbt.2514-S1.pdf. I want to apply this short read pre-processing method to my BAM files and have tried to implement it in Perl, but I have no idea how to program these two rules: (c) if there is an overlapping read pair, and both reads agree the read with the highest quality score is retained otherwise both are discarded. (b) if there is an overlapping read pair, and both reads agree the read with the highest quality score is retained otherwise the read that disagrees with the reference is retained. Can anybody help me understand them? Thanks very much for your help!

--best Jing
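
One possible reading of the two quoted rules, sketched in Python rather than Perl, is given below. It is an interpretation under stated assumptions and not MuTect's code: the rule is applied at a single overlapping position, "agree" is taken to mean that both mates show the same base, and the function and option names are hypothetical.

```python
def resolve_overlap(base1, qual1, base2, qual2, ref_base, on_disagreement):
    """Return the (base, qual) observations retained at one overlapping position.

    on_disagreement:
      "discard_both"       -> rule (c): drop both mates if they disagree
      "keep_non_reference" -> rule (b): keep the mate that differs from the reference
    """
    mates = [(base1, qual1), (base2, qual2)]
    if base1 == base2:
        # Both mates agree: keep the higher-quality observation (first mate on a tie).
        return [max(mates, key=lambda obs: obs[1])]
    if on_disagreement == "discard_both":
        return []
    if on_disagreement == "keep_non_reference":
        # If both mates differ from the reference, this sketch keeps both,
        # because the quoted rule does not say how to choose between them.
        return [obs for obs in mates if obs[0] != ref_base]
    raise ValueError("unknown on_disagreement option")
```

For mates showing A (quality 30) and T (quality 20) over a reference A, rule (c) keeps nothing, while rule (b) keeps the T observation.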


Created 2015-10-13 07:30:35 | Updated | Tags: mutect
Comments (4)

Dear MuTect developers,

When I use the following command to call SNPs/indels:

$java -Xmx1g -jar $mutect_bin/mutect-1.1.7.jar \
  --analysis_type MuTect \
  --reference_sequence $GATK_ref/2.8/ucsc.hg19.fasta \
  --cosmic $GATK_ref/2.8/hg19_cosmic_v54_120711.vcf \
  --dbsnp $GATK_ref/2.8/dbsnp_132_b37.leftAligned.vcf \
  --intervals 1:1-249250621 \
  --input_file:normal $normal/chr1.merged.uniqPairs.sort.dupMarked.addGR.order.realn.bam \
  --input_file:tumor $primary/chr1.merged.uniqPairs.sort.dupMarked.addGR.order.realn.bam \
  --out chr1_n6Vp5_call_stats.txt \
  --coverage_file chr1_n6Vp5_coverage.wig.txt

***

I got problem:

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.ExceptionInInitializerError
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.(GenomeAnalysisEngine.java:167)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.(CommandLineExecutable.java:57)
    at org.broadinstitute.sting.gatk.CommandLineGATK.(CommandLineGATK.java:66)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:106)
Caused by: java.lang.NullPointerException
    at org.reflections.Reflections.scan(Reflections.java:220)
    at org.reflections.Reflections.scan(Reflections.java:166)
    at org.reflections.Reflections.(Reflections.java:94)
    at org.broadinstitute.sting.utils.classloader.PluginManager.(PluginManager.java:79)
    ... 4 more

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 3.1-0-g72492bb):
ERROR
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: Code exception (see stack trace for error itself)
ERROR ------------------------------------------------------------------------------------------

I'm confused by this. Could you tell me how to resolve it? Thank you!

Xu,ZhengZheng Beijing Institute of Genomics,Chinese Academy of Sciences


Created 2015-10-08 09:18:10 | Updated | Tags: mutect
Comments (4)

Hello, I read that there is a new version of MuTect dealing with indels coming. Would it be possible to know its release date? We are planning analyses for a new project and would like to know if we can count on MuTect2 or not.

thanks


Created 2015-09-16 09:11:43 | Updated 2015-09-16 09:13:37 | Tags: mutect gatk-protected
Comments (4)

I'm trying to install MuTect and, as directed in the README.md, I've git cloned gatk-protected and tried to run 'mvn -Ddisable.queue install', but I get the following error. I have Java 1.7 and Maven 3.3.3.

[INFO] -------------------------------------------------------------
[WARNING] COMPILATION WARNING :
[INFO] -------------------------------------------------------------
[WARNING] /home/krb/Ramani/MUTECT/gatk-protected/public/gatk-framework/src/main/java/org/broadinstitute/sting/utils/threading/ThreadEfficiencyMonitor.java: Some input files use or override a deprecated API.
[WARNING] /home/krb/Ramani/MUTECT/gatk-protected/public/gatk-framework/src/main/java/org/broadinstitute/sting/utils/threading/ThreadEfficiencyMonitor.java: Recompile with -Xlint:deprecation for details.
[WARNING] /home/krb/Ramani/MUTECT/gatk-protected/public/gatk-framework/src/main/java/org/broadinstitute/sting/gatk/datasources/reads/SAMDataSource.java: Some input files use unchecked or unsafe operations.
[WARNING] /home/krb/Ramani/MUTECT/gatk-protected/public/gatk-framework/src/main/java/org/broadinstitute/sting/gatk/datasources/reads/SAMDataSource.java: Recompile with -Xlint:unchecked for details.
[WARNING] Some messages have been simplified; recompile with -Xdiags:verbose to get full output
[INFO] 5 warnings
[INFO] -------------------------------------------------------------
[INFO] -------------------------------------------------------------
[ERROR] COMPILATION ERROR :
[INFO] -------------------------------------------------------------
[ERROR] /home/krb/Ramani/MUTECT/gatk-protected/public/gatk-framework/src/main/java/org/broadinstitute/sting/gatk/walkers/annotator/interfaces/AnnotationInterfaceManager.java:[129,24] no suitable method found for add(java.lang.Object)
    method java.util.Collection.add(T) is not applicable
      (argument mismatch; java.lang.Object cannot be converted to T)
    method java.util.List.add(T) is not applicable
      (argument mismatch; java.lang.Object cannot be converted to T)
[INFO] 1 error
[INFO] -------------------------------------------------------------
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Sting Root ......................................... SUCCESS [  0.455 s]
[INFO] Sting Aggregator ................................... SUCCESS [  0.185 s]
[INFO] Sting GSALib ....................................... SUCCESS [  0.447 s]
[INFO] Sting Utils ........................................ SUCCESS [  0.698 s]
[INFO] GATK Framework ..................................... FAILURE [  4.181 s]
[INFO] GATK Protected ..................................... SKIPPED
[INFO] GATK Package ....................................... SKIPPED
[INFO] Sting Public ....................................... SKIPPED
[INFO] Sting Protected .................................... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 6.134 s
[INFO] Finished at: 2015-09-16T14:27:14+05:30
[INFO] Final Memory: 44M/1583M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (compile-java) on project gatk-framework: Compilation failure
[ERROR] /home/krb/Ramani/MUTECT/gatk-protected/public/gatk-framework/src/main/java/org/broadinstitute/sting/gatk/walkers/annotator/interfaces/AnnotationInterfaceManager.java:[129,24] no suitable method found for add(java.lang.Object)
[ERROR] method java.util.Collection.add(T) is not applicable
[ERROR] (argument mismatch; java.lang.Object cannot be converted to T)
[ERROR] method java.util.List.add(T) is not applicable
[ERROR] (argument mismatch; java.lang.Object cannot be converted to T)
[ERROR] -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (compile-java) on project gatk-framework: Compilation failure
/home/krb/Ramani/MUTECT/gatk-protected/public/gatk-framework/src/main/java/org/broadinstitute/sting/gatk/walkers/annotator/interfaces/AnnotationInterfaceManager.java:[129,24] no suitable method found for add(java.lang.Object)
    method java.util.Collection.add(T) is not applicable
      (argument mismatch; java.lang.Object cannot be converted to T)
    method java.util.List.add(T) is not applicable
      (argument mismatch; java.lang.Object cannot be converted to T)

        at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:212)
        at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
        at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
        at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116)
        at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80)
        at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
        at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:128)
        at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:307)
        at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:193)
        at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:106)
        at org.apache.maven.cli.MavenCli.execute(MavenCli.java:862)
        at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:286)
        at org.apache.maven.cli.MavenCli.main(MavenCli.java:197)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:289)
        at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:229)
        at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415)
        at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356)
Caused by: org.apache.maven.plugin.compiler.CompilationFailureException: Compilation failure
/home/krb/Ramani/MUTECT/gatk-protected/public/gatk-framework/src/main/java/org/broadinstitute/sting/gatk/walkers/annotator/interfaces/AnnotationInterfaceManager.java:[129,24] no suitable method found for add(java.lang.Object)
    method java.util.Collection.add(T) is not applicable
      (argument mismatch; java.lang.Object cannot be converted to T)
    method java.util.List.add(T) is not applicable
      (argument mismatch; java.lang.Object cannot be converted to T)

        at org.apache.maven.plugin.compiler.AbstractCompilerMojo.execute(AbstractCompilerMojo.java:858)
        at org.apache.maven.plugin.compiler.CompilerMojo.execute(CompilerMojo.java:129)
        at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134)
        at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208)
        ... 20 more
[ERROR]
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :gatk-framework

I'm not able to understand how to resolve the issue. Could anybody please help me with it?


Created 2015-09-11 15:35:09 | Updated 2015-09-11 15:35:29 | Tags: intervals mutect b37
Comments (15)

I hate to post this same error on the GATK forum again, but I went through many of the similar threads already posted here and none of the answers shed light on my issue. My BAM files are aligned to GRCh37-lite, and I am using the same reference genome, downloaded from ftp://ftp.ncbi.nih.gov/genbank/genomes/Eukaryotes/vertebrates_mammals/Homo_sapiens/GRCh37/special_requests

Next, I ran the GATK best-practices pre-processing on these BAMs using the same reference genome, and it completed without any errors. Currently I'm running MuTect as:

java -Xmx56g -jar muTect-1.1.4.jar --analysis_type MuTect --reference_sequence ./resources/b37/human_g1k_v37.fasta --cosmic ./resources/Cosmic.b37.vcf --dbsnp ./resources/dbsnp_138.b37.vcf --intervals ./resources/mirna.1.5flank-interval-list.list --input_file:normal $normal.recal_reads.bam --input_file:tumor $tumor.recal_reads.bam --out $sample.call_stats.out --coverage_file $sample.coverage.wig.txt

And getting this error message:

ERROR MESSAGE: Badly formed genome loc: Contig 'chr1' does not match any contig in the GATK sequence dictionary derived from the reference; are you sure you are using the correct reference fasta file?

What other tests should I run to troubleshoot this issue? Also, the interval list is one I created from a .bed file. I restricted my BAM files to a limited set of BED regions using the same file in the command "samtools view -@8 -b -h -L"

This was the file I was most confused about. Is it possible that this file is causing the error? The first few lines of this file are:

chr1:15869-18936
chr1:28866-32003
chr1:566205-569293
chr1:1100984-1104078
chr1:1101743-1104832
chr1:1102885-1105967
chr1:1229990-1233050
chr1:1246382-1249446
chr1:1273530-1276588
chr1:3043039-3046099
chr1:3475759-3478854
chr1:5622631-5625703
chr1:5921232-5924301
chr1:6488394-6491456
chr1:8925061-8928149
chr1:9210227-9213336
chr1:10025939-10029016
chr1:10286276-10289361
chr1:12087715-12090779
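The error message is consistent with a naming mismatch: the interval list uses chr1-style contig names, while a b37 reference such as human_g1k_v37.fasta names its contigs 1, 2, ..., X, Y, MT. A minimal sketch of one possible fix, assuming the only difference is the "chr" prefix; the function name strip_chr_prefix is hypothetical, and the chrM to MT mapping should be checked against the reference's sequence dictionary before use:

```python
def strip_chr_prefix(interval: str) -> str:
    """Rewrite 'chr1:15869-18936' as '1:15869-18936' (b37-style naming)."""
    contig, sep, span = interval.partition(":")
    if contig == "chrM":
        # Assumption: the b37 reference names the mitochondrial contig 'MT'.
        contig = "MT"
    elif contig.startswith("chr"):
        contig = contig[len("chr"):]
    return contig + sep + span


# Two intervals taken from the post above.
intervals = ["chr1:15869-18936", "chr1:28866-32003"]
converted = [strip_chr_prefix(i) for i in intervals]
print("\n".join(converted))
```

Intervals that already lack the prefix pass through unchanged, so the function is safe to run over a mixed file.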

Thanks a ton for your help!


Created 2015-06-05 19:54:57 | Updated | Tags: mutect
Comments (0)

Hi Everyone,

Mutect seems to only report somatic point mutations. Is there a way to get germline mutations as well?

Thank you very much.


Created 2015-06-02 01:06:51 | Updated | Tags: mutect
Comments (0)

Hi, do you know how MuTect handles BAM files with mixed paired and unpaired reads? I ask because I noticed that ver 1.7 no longer reports "total-pairs" (it now reports "total-reads").


Created 2015-05-15 01:23:27 | Updated | Tags: mutect
Comments (3)

I'm trying to build SomaticSpike, which is included in MuTect 1.1.4. Is there any way to actually build this from the github repository? Is there perhaps a binary floating around?


Created 2015-04-30 18:56:39 | Updated | Tags: mutect
Comments (2)

Hello,

I am trying to run mutect on a mitochondrial genome, however the mitochondrial genome ploidy is variable, so I was hoping to run it as a haploid, is that possible in mutect? Any suggestions?

Thanks, Ramiro


Created 2014-12-24 00:00:35 | Updated | Tags: mutect
Comments (1)

I provided Mutect with an interval file (see below) which seems to be in a valid format (@SQ headers, followed by lines of chromosome number, coordinates, +/-, cds, etc.), and I get the following error message:

ERROR MESSAGE: File associated with name target_intervals.reduced.withhead.interval_list.bed is malformed: Problem reading the interval file caused by

ERROR Line: @SQ SN:1 LN:249250621

The interval file has the form:

@SQ SN:chr1 LN:249250621
@SQ SN:chr2 LN:243199373
@SQ SN:chr3 LN:198022430
@SQ SN:chr4 LN:191154276
@SQ SN:chr5 LN:180915260
@SQ SN:chr6 LN:171115067
@SQ SN:chr7 LN:159138663
@SQ SN:chr8 LN:146364022
@SQ SN:chr9 LN:141213431
@SQ SN:chr10 LN:135534747
@SQ SN:chr11 LN:135006516
@SQ SN:chr12 LN:133851895
@SQ SN:chr13 LN:115169878
@SQ SN:chr14 LN:107349540
@SQ SN:chr15 LN:102531392
@SQ SN:chr16 LN:90354753
@SQ SN:chr17 LN:81195210
@SQ SN:chr18 LN:78077248
@SQ SN:chr19 LN:59128983
@SQ SN:chr20 LN:63025520
@SQ SN:chr21 LN:48129895
@SQ SN:chr22 LN:51304566
@SQ SN:chrX LN:155270560
@SQ SN:chrY LN:59373566
chr1 66999814 67000061 + NM_032291_exon_0_10_chr1_66999825_f etc..

Are there some header lines other than @SQ ... etc that are missing, since the error message references the first line of the file?
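One way to sidestep header parsing altogether is to rewrite the file as plain contig:start-stop lines. A minimal sketch, assuming the body lines are whitespace-separated as contig, start, end, strand, name, and that the coordinates are already 1-based and inclusive (if they came from a 0-based BED file, the start would need +1); the function name to_gatk_intervals is hypothetical:

```python
def to_gatk_intervals(lines):
    """Convert 'contig start end strand name' lines to 'contig:start-end'.

    Header lines starting with '@' and blank lines are dropped.
    """
    intervals = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("@"):
            continue
        fields = line.split()
        contig, start, end = fields[0], int(fields[1]), int(fields[2])
        intervals.append(f"{contig}:{start}-{end}")
    return intervals


# One header line and one body line taken from the post above.
example = [
    "@SQ SN:chr1 LN:249250621",
    "chr1 66999814 67000061 + NM_032291_exon_0_10_chr1_66999825_f",
]
print("\n".join(to_gatk_intervals(example)))
```

The contig names in the output still have to match the names in the reference's sequence dictionary.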


Created 2014-12-18 19:57:32 | Updated 2014-12-18 19:58:07 | Tags: mutect variant-calling
Comments (1)

Hi,

We are working on a cancer genome project and we use Mutect to call somatic mutations. We have a question related to the high-coverage depth filtering. Below is a list of nearby positions output by Mutect where the coverage in the tumor sample is very high but the coverage in the normal sample is not.

Partial output from *call_stats.out file -

contig  position  t_q20_count  n_q20_count  failure_reasons  judgement
2       89849000  622          108                           KEEP
2       89850117  508          152                           KEEP
2       89850498  713          732                           KEEP
2       89850649  583          403                           KEEP
2       89850993  849          142                           KEEP
2       89872002  540          286                           KEEP
2       89877959  607          259                           KEEP

These positions are not filtered out by Mutect. Why is a high-depth filter not used? Would it make sense to filter these variants, and if so, which high-depth threshold should we choose?
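A depth filter can also be applied after the fact to the call_stats table. A minimal sketch, assuming the file is tab-delimited with the column names shown above; the function name flag_high_depth and the cutoff max_tumor_depth are hypothetical, and the value 600 below is only a placeholder, not a recommended threshold:

```python
import csv
import io


def flag_high_depth(rows, max_tumor_depth):
    """Return (contig, position) for rows whose tumor q20 depth exceeds the cutoff."""
    flagged = []
    for row in rows:
        if int(row["t_q20_count"]) > max_tumor_depth:
            flagged.append((row["contig"], int(row["position"])))
    return flagged


# Two rows from the post above, written as tab-delimited text.
text = (
    "contig\tposition\tt_q20_count\tn_q20_count\tfailure_reasons\tjudgement\n"
    "2\t89849000\t622\t108\t\tKEEP\n"
    "2\t89850117\t508\t152\t\tKEEP\n"
)
rows = list(csv.DictReader(io.StringIO(text), delimiter="\t"))
print(flag_high_depth(rows, max_tumor_depth=600))
```

A cutoff derived from the sample's own coverage distribution would be a more defensible choice than a fixed number.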

Regards,

Abhimanyu Krishna


Created 2014-12-18 00:03:25 | Updated | Tags: mutect
Comments (7)

I have been attempting to run Mutect on tumor/blood .bam files, and encounter the following error message when using Homo_sapiens.GRCh37.72.dna.fa fasta as a reference file and b37_cosmic_v54_120711.vcf as "cosmic" reference:

ERROR MESSAGE: Input files /b37_cosmic_v54_120711.vcf and reference have incompatible contigs: No overlapping contigs found.

ERROR /b37_cosmic_v54_120711.vcf contigs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24]
ERROR reference contigs = [chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr20, chr21, chr22, chrX, chrY]

Basically, it comes down to bare contig names (e.g. 1) vs. chr-prefixed names (e.g. chr1) in the cosmic vs. fasta file. Is there a simple way to work around this?
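One possible workaround is to rename the contigs in the VCF so they match the reference. A minimal sketch, assuming both files describe the same assembly coordinates and differ only in naming; the function name rename_vcf_contigs is hypothetical, and mapping 23 to chrX and 24 to chrY is a guess based on the two contig lists in the error message that must be verified first:

```python
def rename_vcf_contigs(lines, mapping):
    """Yield VCF lines with the CHROM column renamed according to `mapping`."""
    for line in lines:
        if line.startswith("#"):
            # Header lines pass through unchanged in this sketch; any
            # '##contig=<ID=...>' entries would need the same renaming.
            yield line
            continue
        chrom, sep, rest = line.partition("\t")
        yield mapping.get(chrom, chrom) + sep + rest


mapping = {str(i): f"chr{i}" for i in range(1, 23)}
mapping.update({"23": "chrX", "24": "chrY"})  # assumption, see above

# Made-up records, for illustration only.
vcf_lines = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "1\t12345\t.\tC\tT\t.\t.\t.\n",
    "23\t67890\t.\tG\tA\t.\t.\t.\n",
]
renamed = list(rename_vcf_contigs(vcf_lines, mapping))
print("".join(renamed), end="")
```

The renamed file must still list contigs in the same order as the reference, so it may need re-sorting and re-indexing afterwards.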


Created 2014-12-05 18:49:40 | Updated | Tags: mutect
Comments (0)

Hello. I am getting a java.lang.ArrayIndexOutOfBoundsException: 83 error when running MuTect. I got this error on v1.1.4 and built the latest version this morning (v1.1.7) but the error is still there. None of the posts about this error apply to me. It always appears to be at the same position in the BAM file, but I cannot determine what is wrong or how to get past it. The analysis looks like it gets through chromosomes 1, 2, and most of 3 before the crash. I have tried running with options to ignore the error, but that doesn't help. If you could point me in a direction to solve this I would be very grateful. The stack trace is pasted below.

Error processing chr3:195505766
java.lang.ArrayIndexOutOfBoundsException: 83
        at org.broadinstitute.cga.tools.gatk.utils.CGAAlignmentUtils.mismatchesInRefWindow(CGAAlignmentUtils.java:135)
        at org.broadinstitute.cga.tools.gatk.walkers.cancer.mutect.MuTect.filterReads(MuTect.java:787)
        at org.broadinstitute.cga.tools.gatk.walkers.cancer.mutect.MuTect.map(MuTect.java:511)
        at org.broadinstitute.cga.tools.gatk.walkers.cancer.mutect.MuTect.map(MuTect.java:79)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:267)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:255)
        at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:274)
        at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:144)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:92)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:48)
        at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:99)
        at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:313)
        at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:121)
        at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:248)
        at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:155)
        at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:107)

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.RuntimeException: java.lang.ArrayIndexOutOfBoundsException: 83
        at org.broadinstitute.cga.tools.gatk.walkers.cancer.mutect.MuTect.map(MuTect.java:649)
        at org.broadinstitute.cga.tools.gatk.walkers.cancer.mutect.MuTect.map(MuTect.java:79)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:267)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:255)
        at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:274)
        at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:144)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:92)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:48)
        at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:99)
        at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:313)
        at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:121)
        at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:248)
        at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:155)
        at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:107)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 83
        at org.broadinstitute.cga.tools.gatk.utils.CGAAlignmentUtils.mismatchesInRefWindow(CGAAlignmentUtils.java:135)
        at org.broadinstitute.cga.tools.gatk.walkers.cancer.mutect.MuTect.filterReads(MuTect.java:787)
        at org.broadinstitute.cga.tools.gatk.walkers.cancer.mutect.MuTect.map(MuTect.java:511)
        ... 14 more

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 3.1-0-g72492bb):
ERROR
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: java.lang.ArrayIndexOutOfBoundsException: 83
ERROR ------------------------------------------------------------------------------------------

[pkMyt1@CLASPIN-1 GATK]$


Created 2014-11-07 15:49:41 | Updated 2014-11-07 15:51:34 | Tags: mutect panel-of-normals
Comments (2)

Hi all,

In the 2013 Nature paper, a fixed threshold of 6.3 (LOD) is chosen for all results, which corresponds to a mutation frequency of 1-10 per Mb. Table 2 clearly shows the variety in mutation rates per cancer.

Based on this information, I have two questions.

  • Do I need to create a separate Panel of Normals per cancer type?

  • How do I correctly change the ‘--tumor_lod’ value to match the mutation rate of the cancer type I am working with?

Thanks in advance, Frank
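For the second question, a hedged worked example may help. It assumes the threshold takes the form log10(odds) - log10(p / (1 - p)), where p is the prior probability of one specific mutation at a site and odds is the required posterior odds. This form and the defaults below (odds of 2, and an overall rate of 3 mutations per Mb split evenly across three alternate alleles) are assumptions recalled from the paper's methods and should be checked against it; the function name tumor_lod_threshold is hypothetical. Under those assumptions the arithmetic reproduces the quoted 6.3:

```python
import math


def tumor_lod_threshold(mutations_per_mb, odds=2.0):
    """LOD threshold implied by an assumed per-site prior and required odds."""
    # Assumed prior: overall rate per base, divided among 3 alternate alleles.
    p = mutations_per_mb * 1e-6 / 3.0
    return math.log10(odds) - math.log10(p / (1.0 - p))


for rate in (1.0, 3.0, 10.0):
    print(rate, round(tumor_lod_threshold(rate), 2))
```

Under this form a higher assumed mutation rate lowers the threshold, and a lower rate raises it.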


Created 2014-10-22 18:55:55 | Updated 2014-10-22 19:11:09 | Tags: mutect
Comments (4)

Hi, looking at the call_stats output from MuTect, I am wondering whether MuTect only counts reads with base phred quality > 5 in the t_ref_count, t_alt_count, n_ref_count and n_alt_count fields. (I reached this conclusion by looking at the base phred quality of each read in IGV.) If so, is there any way to change the threshold for the base phred quality? Thanks!


Created 2014-08-29 16:42:05 | Updated | Tags: mutect
Comments (1)

Hello, would you please let us know why the call below has been rejected rather than kept:


contig position ref_allele alt_allele score dbsnp_site covered power tumor_power normal_power total_pairs improper_pairs map_Q0_reads t_lod_fstar tumor_f contaminant_fraction contaminant_lod t_ref_count t_alt_count t_ref_sum t_alt_sum t_ins_count t_del_count normal_best_gt init_n_lod n_ref_count n_alt_count n_ref_sum n_alt_sum judgement
chr5 112164616 C T 0 NOVEL COVERED 1 1 1 1905 168 0 670.750479 0.294294 0.02 32.630413 470 196 18315 7525 6 0 CC 218.268391 738 1 28645 37 REJECT


We appreciate all your help. Thank you
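A row like the one pasted above is easier to inspect when each value is paired with its column name. A minimal sketch, assuming whitespace-separated fields and that the header and the row have the same number of columns; it does not explain the REJECT judgement, it only makes the fields readable:

```python
header = (
    "contig position ref_allele alt_allele score dbsnp_site covered power "
    "tumor_power normal_power total_pairs improper_pairs map_Q0_reads "
    "t_lod_fstar tumor_f contaminant_fraction contaminant_lod t_ref_count "
    "t_alt_count t_ref_sum t_alt_sum t_ins_count t_del_count normal_best_gt "
    "init_n_lod n_ref_count n_alt_count n_ref_sum n_alt_sum judgement"
).split()

values = (
    "chr5 112164616 C T 0 NOVEL COVERED 1 1 1 1905 168 0 670.750479 "
    "0.294294 0.02 32.630413 470 196 18315 7525 6 0 CC 218.268391 738 1 "
    "28645 37 REJECT"
).split()

# Guard against a shifted row before pairing names with values.
if len(header) != len(values):
    raise ValueError("header and row have different column counts")

record = dict(zip(header, values))
for name in header:
    print(f"{name:22s}{record[name]}")
```

Running MuTect with a VCF output as well may give more direct information on which filter triggered, if that option is available in the version used.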


Created 2014-08-22 09:14:29 | Updated | Tags: mutect cosmic tcga pancan
Comments (1)

Hi,

I am wondering about the possibility of using TCGA variants instead of cosmic in MuTect. Given that it's mostly used to upweight positions that are in dbSNP AND cosmic, I think this would be a smart move. Cosmic seems like a mess, with small single-gene experiments mixed with larger experiments. The TCGA pancan MAF seems way more structured. My guess is that most dbSNP somatic variants are so common that they will be in TCGA pancan as well.

Has anyone tested this or have experience with it?

(my basic plan is to convert TCGA pancan MAF to VCF and use in MuTect).

cheers


Created 2014-07-23 12:33:37 | Updated | Tags: mutect strand-bias
Comments (5)

Hi, Mutect does have a filter for strand bias, but does not give strand information (like DP4 or similar) in its output .call file or .vcf file. Sometimes I want to check the strand distribution of the called SNVs and do further filtering. How can I get such information?

Thanks! Hartblue


Created 2014-05-23 14:22:50 | Updated | Tags: mutect variant-calling radseq
Comments (0)

Hi, I would like to analyze a dataset consisting of RADseq (Restriction-site Associated DNA) tags from tumor and normal samples. By nature of the technique, all of the reads start at restriction enzyme cut sites in the genome - therefore the assumptions that mutations will be covered by reads from both directions and staggered with respect to position in the read are violated. Is there a way to override the strand bias and clustered position filters in the MuTect pipeline?


Created 2014-02-18 19:53:06 | Updated | Tags: mutect
Comments (1)

Hi,

I'm wondering if it is possible to use MuTect with bam files previously aligned with human_g1k_v37? In the documentation section, it is mentioned that we can use hg19, but what about g1K_V37?


Created 2014-02-14 10:39:55 | Updated | Tags: install mutect github git bcel
Comments (0)

Hi

I was trying to install the github version of mutect and I have some questions as well as a hope that people who had similar problems might get help from my endeavours.

I followed the instructions posted on the github page, however when I tried to build:

# build
ant -Dexternal.dir=`pwd`/../mutect-src -Dexecutable=mutect package

It told me I didn't have the correct bcel files in my ~/.ant/lib/:

The bcel jar can be found in the lib directory of a GATK clone after compiling, and the ant-apache-bcel jar can be downloaded from here: http://repo1.maven.org/maven2/ant/ant-apache-bcel/1.6.5/ant-apache-bcel-1.6.5.jar Please copy these two jar files to ~/.ant/lib/

I had already downloaded the ant-apache-bcel jar and put it there, so I figured it must be the GATK clone lib. I compiled with ant dist clean but it failed and the created "lib" folder was empty. However, it did create a "dist" folder and in there I found bcel-5.2.jar. I popped this in ~/.ant/lib/ and now mutect seems to build correctly using:

# build
ant -Dexternal.dir=`pwd`/../mutect-src -Dexecutable=mutect package

So to my questions.

  1. Is this an OK way to build it? (Can I trust the program despite the unorthodox installation procedure?)

  2. How come the mutect install instructions don't specifically mention where to find the apache bcel library (I would not have found it without the error message), or guide you to compile gatk-protected to get the second jar file that you need? Also, where to put them!?

Created 2014-01-23 20:10:26 | Updated 2014-01-23 20:24:45 | Tags: mutect error runtime-error
Comments (8)

Hello MuTect Team,

I encountered an error when running MuTect on our server for our tumor and normal pair data. If anyone can help me with this, it will be greatly appreciated.

I blacked out the file paths with " ** " for security reasons.

The tumor and normal BAM files are aligned against ucsc.hg19.fasta and all the references are using hg19 from GATK 2.8 resource bundle.

  • dbSNP: dbsnp_137.hg19.vcf
  • reference: ucsc.hg19.fasta
  • COSMIC:

    The cosmic file is generated by myself by using the following command:

    perl **/GenomeAnalysisTK-2.8-1/liftOverVCF.pl -vcf 2.8/b37_cosmic_v54_120711.vcf -chain b37tohg19.chain -out hg19_cosmic_v54.vcf -newRef ucsc.hg19 -oldRef 2.8/human_g1k_v37 -gatk **/GenomeAnalysisTK-2.8-1/
    • The liftOverVCF.pl and b37tohg19.chain are from GATK github site: https://github.com/broadgsa/gatk/tree/master/public
    • The b37_cosmic_v54_120711.vcf is from MuTect download page: http://www.broadinstitute.org/cancer/cga/mutect_download

The command I used to run MuTect is:

java -jar -Xmx16g **/muTect-1.1.4/muTect-1.1.4.jar --analysis_type MuTect --reference_sequence **/ucsc.hg19.fasta --cosmic **/hg19_cosmic_v54.vcf --dbsnp **/dbsnp_137.hg19.vcf --input_file:tumor **/ReduceReads_P1T.bam --input_file:normal **/ReduceReads_P1N.bam --out **/MuTect_P1.out --coverage_file MT_coverage_P1.txt

The log message is as follows:

INFO 13:04:40,315 HelpFormatter - ---------------------------------------------------------------------------------
INFO 13:04:40,317 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.2-25-g2a68eab, Compiled 2012/11/08 10:30:02
INFO 13:04:40,317 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 13:04:40,317 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 13:04:40,321 HelpFormatter - Program Args: --analysis_type MuTect --reference_sequence /ucsc.hg19.fasta --cosmic /hg19_cosmic_v54.vcf --dbsnp /dbsnp_137.hg19.vcf --input_file:tumor /ReduceReads_P1T.bam --input_file:normal /ReduceReads_P1N.bam --out /MuTect_P1.out --coverage_file MT_coverage_P1.txt
INFO 13:04:40,321 HelpFormatter - Date/Time: 2014/01/14 13:04:40
INFO 13:04:40,321 HelpFormatter - ---------------------------------------------------------------------------------
INFO 13:04:40,321 HelpFormatter - ---------------------------------------------------------------------------------
INFO 13:04:40,341 ArgumentTypeDescriptor - Dynamically determined type of /dbsnp_137.hg19.vcf to be VCF
INFO 13:04:40,346 ArgumentTypeDescriptor - Dynamically determined type of /hg19_cosmic_v54.vcf to be VCF
INFO 13:04:40,353 GenomeAnalysisEngine - Strictness is SILENT
INFO 13:04:40,414 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE Target Coverage: 1000
INFO 13:04:40,420 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 13:04:40,449 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.03
INFO 13:04:40,466 RMDTrackBuilder - Loading Tribble index from disk for file /dbsnp_137.hg19.vcf
INFO 13:04:40,585 RMDTrackBuilder - Loading Tribble index from disk for file /hg19_cosmic_v54.vcf
INFO 13:04:40,643 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO 13:04:40,643 ProgressMeter - Location processed.sites runtime per.1M.sites completed total.runtime remaining
Error processing chrM:117
org.apache.commons.math.FunctionEvaluationException: Cumulative probability function returned NaN for argument 0.975 p = 0.975
        at org.apache.commons.math.distribution.AbstractContinuousDistribution$1.value(AbstractContinuousDistribution.java:107)
        at org.apache.commons.math.analysis.solvers.BrentSolver.solve(BrentSolver.java:388)
        at org.apache.commons.math.analysis.solvers.BrentSolver.solve(BrentSolver.java:250)
        at org.apache.commons.math.analysis.solvers.UnivariateRealSolverUtils.solve(UnivariateRealSolverUtils.java:82)
        at org.apache.commons.math.distribution.AbstractContinuousDistribution.inverseCumulativeProbability(AbstractContinuousDistribution.java:138)
        at org.apache.commons.math.distribution.BetaDistributionImpl.inverseCumulativeProbability(BetaDistributionImpl.java:176)
        at org.broadinstitute.cga.tools.gatk.walkers.cancer.mutect.MuTect.map(MuTect.java:454)
        at org.broadinstitute.cga.tools.gatk.walkers.cancer.mutect.MuTect.map(MuTect.java:34)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:243)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:231)
        at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:287)
        at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:252)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:120)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:67)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:23)
        at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:74)
        at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:281)
        at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
        at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:236)
        at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:146)
        at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:93)
INFO 13:04:42,505 HttpMethodDirector - I/O exception (java.net.ConnectException) caught when processing request: Connection refused
INFO 13:04:42,506 HttpMethodDirector - Retrying request
INFO 13:04:42,511 HttpMethodDirector - I/O exception (java.net.ConnectException) caught when processing request: Connection refused
INFO 13:04:42,511 HttpMethodDirector - Retrying request
INFO 13:04:42,515 HttpMethodDirector - I/O exception (java.net.ConnectException) caught when processing request: Connection refused
INFO 13:04:42,516 HttpMethodDirector - Retrying request
INFO 13:04:42,518 HttpMethodDirector - I/O exception (java.net.ConnectException) caught when processing request: Connection refused
INFO 13:04:42,519 HttpMethodDirector - Retrying request
INFO 13:04:42,523 HttpMethodDirector - I/O exception (java.net.ConnectException) caught when processing request: Connection refused
INFO 13:04:42,524 HttpMethodDirector - Retrying request

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.RuntimeException: org.apache.commons.math.FunctionEvaluationException: Cumulative probability function returned NaN for argument 0.975 p = 0.975
        at org.broadinstitute.cga.tools.gatk.walkers.cancer.mutect.MuTect.map(MuTect.java:712)
        at org.broadinstitute.cga.tools.gatk.walkers.cancer.mutect.MuTect.map(MuTect.java:34)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:243)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:231)
        at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:287)
        at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:252)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:120)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:67)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:23)
        at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:74)
        at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:281)
        at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
        at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:236)
        at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:146)
        at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:93)
Caused by: org.apache.commons.math.FunctionEvaluationException: Cumulative probability function returned NaN for argument 0.975 p = 0.975
        at org.apache.commons.math.distribution.AbstractContinuousDistribution$1.value(AbstractContinuousDistribution.java:107)
        at org.apache.commons.math.analysis.solvers.BrentSolver.solve(BrentSolver.java:388)
        at org.apache.commons.math.analysis.solvers.BrentSolver.solve(BrentSolver.java:250)
        at org.apache.commons.math.analysis.solvers.UnivariateRealSolverUtils.solve(UnivariateRealSolverUtils.java:82)
        at org.apache.commons.math.distribution.AbstractContinuousDistribution.inverseCumulativeProbability(AbstractContinuousDistribution.java:138)
        at org.apache.commons.math.distribution.BetaDistributionImpl.inverseCumulativeProbability(BetaDistributionImpl.java:176)
        at org.broadinstitute.cga.tools.gatk.walkers.cancer.mutect.MuTect.map(MuTect.java:454)
        ... 14 more

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 2.2-25-g2a68eab):
ERROR
ERROR Please visit the wiki to see if this is a known problem
ERROR If not, please post the error, with stack trace, to the GATK forum
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: org.apache.commons.math.FunctionEvaluationException: Cumulative probability function returned NaN for argument 0.975 p = 0.975
ERROR ------------------------------------------------------------------------------------------

Created 2013-12-16 18:42:29 | Updated 2013-12-16 19:32:00 | Tags: mutect
Comments (0)

Hello, I've been running 3 different versions of MuTect on a set of tumor and matched-normal samples. Although all three versions produce comparable counts of somatic mutations, only the pre-release version (muTect-1.0.27783) returns anything with the 'KEEP' status. The other 2 versions ( muTect-1.1.1 and muTect-1.1.4) reject ALL returned somatic variants. I'm running all 3 versions of the software with default parameters. I was wondering if you could tell me what's different between the 3 releases of MuTect and why the later versions seem to be more conservative.

Thank you,

Alina


Created 2013-12-10 09:03:20 | Updated | Tags: mutect
Comments (0)

I used the following command -

pps/jdk/1.6.0_25/bin/java -Xmx2g -jar muTect-1.1.4.jar -T MuTect --reference_sequence /home/exome/repository/ref_genomes/human_g1k_v37.fasta --cosmic $HOME/b37_cosmic_v54_120711.vcf --dbsnp reference_files/dbsnp_132_b37.leftAligned.vcf --input_file:tumor reference_files/S0343/S0343_novoalign.bam --input_file:normal reference_files/S0345/S0345_novoalign.bam --out S0345_S0343.out --vcf S0345_S0343.out.vcf

Weirdly, no variants were found past chromosome 5. This has also occurred with the other samples I used. I know this is incorrect, as variants were found by varscan in other chromosomes. I am not sure why this is happening. If you have a suggestion I would greatly appreciate it. Thank you.


Created 2013-11-26 01:16:34 | Updated | Tags: mutect trimming
Comments (0)

Hi,

GATK 2.7 does not require quality trimming anymore as the tools take base qualities into account. Is it OK to use these untrimmed reads with Mutect as well? Should I expect a big difference comparing the Mutect calls of trimmed and untrimmed versions of the raw reads?


Created 2013-11-08 17:09:46 | Updated | Tags: mutect
Comments (0)

Dear Mutect developers, I am trying to figure out the column names for the output of muTect-1.1.1.jar. I tried using the information in http://www.broadinstitute.org/cancer/cga/mutect_run, but there appear to be more columns than are listed in the guide. Apologies if someone else has asked this already. Thanks, Juan


Created 2013-11-05 19:32:18 | Updated | Tags: install mutect bug error
Comments (1)

Hello,

I am attempting to install MuTect using the instructions on GitHub at https://github.com/broadinstitute/mutect.

At the last step, building with Ant, I get the following error:

[xslt] Processing /.../mutect-src/mutect/mutect.xml to /.../gatk-protected/dist/packages/mutect.xml
[xslt] Loading stylesheet /.../gatk-protected/public/packages/CreatePackager.xsl
[xslt] /.../gatk-protected/public/packages/CreatePackager.xsl:15:42: Error! xsl:output is not allowed in this position in the stylesheet!

BUILD FAILED
/.../gatk-protected/build.xml:945: The following error occurred while executing this line:
/.../gatk-protected/dist/packages/mutect.xml:1: Premature end of file.

After looking through the build.xml file, I am wondering if this is a bug in the stylesheet.

$JAVA_HOME/bin/java -version
java version "1.7.0_45"
Java(TM) SE Runtime Environment (build 1.7.0_45-b18)

$ANT_HOME/bin/ant -version
Apache Ant(TM) version 1.9.2 compiled on July 8 2013

java -jar /.../gatk-protected/dist/GenomeAnalysisTK.jar -version
2.7-1-g42d771f

As you can see, GATK is building just fine, and it seems to be a problem with MuTect.

Thanks for any help!


Created 2013-10-31 15:41:54 | Updated | Tags: mutect mouse cosmic
Comments (0)

According to the documentation, "there is no cosmic VCF available for mouse, this entire parameter can be eliminated". Is that still the official recommendation? Is there now perhaps some other comparable resource that one could use?


Created 2013-10-29 15:38:14 | Updated | Tags: mutect
Comments (5)

Hi, When I run mutect I get the following error.

Command:

java -Xmx100g -jar muTect-1.1.1.jar --analysis_type MuTect --reference_sequence hs37d5.fa --input_file:tumor sample2.bam --input_file:normal sample1.bam

Error processing 1:897790
java.lang.IllegalArgumentException: Comparison method violates its general contract!
        at java.util.TimSort.mergeLo(TimSort.java:747)
        at java.util.TimSort.mergeAt(TimSort.java:483)
        at java.util.TimSort.mergeCollapse(TimSort.java:410)
        at java.util.TimSort.sort(TimSort.java:214)
        at java.util.TimSort.sort(TimSort.java:173)
        at java.util.Arrays.sort(Arrays.java:659)
        at java.util.Collections.sort(Collections.java:217)
        at org.broadinstitute.cga.tools.gatk.walkers.cancer.mutect.MuTect.map(MuTect.java:471)
        at org.broadinstitute.cga.tools.gatk.walkers.cancer.mutect.MuTect.map(MuTect.java:32)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:168)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:156)
        at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:229)
        at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:200)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:44)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociBase.traverse(TraverseLociBase.java:61)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociBase.traverse(TraverseLociBase.java:16)
        at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:73)
        at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:277)
        at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
        at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:236)
        at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:146)
        at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:93)
INFO 11:35:20,285 GATKRunReport - Uploaded run statistics report to AWS S3

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.RuntimeException: java.lang.IllegalArgumentException: Comparison method violates its general contract!
    at org.broadinstitute.cga.tools.gatk.walkers.cancer.mutect.MuTect.map(MuTect.java:714)
    at org.broadinstitute.cga.tools.gatk.walkers.cancer.mutect.MuTect.map(MuTect.java:32)
    at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:168)
    at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:156)
    at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:229)
    at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:200)
    at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:44)
    at org.broadinstitute.sting.gatk.traversals.TraverseLociBase.traverse(TraverseLociBase.java:61)
    at org.broadinstitute.sting.gatk.traversals.TraverseLociBase.traverse(TraverseLociBase.java:16)
    at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:73)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:277)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:236)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:146)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:93)
Caused by: java.lang.IllegalArgumentException: Comparison method violates its general contract!
    at java.util.TimSort.mergeLo(TimSort.java:747)
    at java.util.TimSort.mergeAt(TimSort.java:483)
    at java.util.TimSort.mergeCollapse(TimSort.java:410)
    at java.util.TimSort.sort(TimSort.java:214)
    at java.util.TimSort.sort(TimSort.java:173)
    at java.util.Arrays.sort(Arrays.java:659)
    at java.util.Collections.sort(Collections.java:217)
    at org.broadinstitute.cga.tools.gatk.walkers.cancer.mutect.MuTect.map(MuTect.java:471)
    ... 14 more

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 2.1-202-g2fe6a31):
ERROR
ERROR Please visit the wiki to see if this is a known problem
ERROR If not, please post the error, with stack trace, to the GATK forum
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: java.lang.IllegalArgumentException: Comparison method violates its general contract!
ERROR ------------------------------------------------------------------------------------------
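
For context: Java 7 and later raise this exception from TimSort when a Comparator gives inconsistent answers, for example when it reports both a > b and b > a, or when it is not transitive. The trace alone does not show which comparator at MuTect.java:471 is responsible. The sketch below is a Python illustration of the contract itself, not MuTect code; `find_contract_violation`, `bad_compare` and `good_compare` are hypothetical names made up for this example.

```python
from itertools import permutations


def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def find_contract_violation(compare, items):
    """Return a description of the first contract violation found, or None.

    Checks two of the rules Java documents for Comparator.compare:
    sign(compare(a, b)) == -sign(compare(b, a)), and transitivity.
    """
    for a, b in permutations(items, 2):
        if sign(compare(a, b)) != -sign(compare(b, a)):
            return ("asymmetry", a, b)
    for a, b, c in permutations(items, 3):
        if compare(a, b) > 0 and compare(b, c) > 0 and not compare(a, c) > 0:
            return ("transitivity", a, b, c)
    return None


def bad_compare(a, b):
    # Hypothetical buggy comparator: it never reports equality,
    # so two equal values each claim to be greater than the other.
    return -1 if a < b else 1


def good_compare(a, b):
    # Consistent three-way comparison.
    return (a > b) - (a < b)


print(find_contract_violation(bad_compare, [2, 1, 2]))
print(find_contract_violation(good_compare, [2, 1, 2]))
```

A commonly cited JVM-level workaround is to start Java with -Djava.util.Arrays.useLegacyMergeSort=true, which restores the pre-Java 7 merge sort that does not perform this check. Whether that is appropriate for this MuTect run has not been verified here.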

Created 2013-10-01 18:24:10 | Updated 2013-10-01 21:19:55 | Tags: mutect interval
Comments (2)

I had a problem when running MuTect. It runs perfectly when I specify, for example,

--intervals  chr1:14363-14829

But when I specify two intervals

--intervals  chr1:14363-14829;chr1:14970-15038

an error message is shown:

##### ERROR MESSAGE: Walker requires reads but none were provided.
##### ERROR ------------------------------------------------------------------------------------------
chr1:14970-15038: Command not found.

Also, when I use an interval file with all the intervals I want to analyze, MuTect tells me

WARN  11:18:30,579 IntervalUtils - The interval file /panfs/cmb-panasas2/junsongz/exon_project/ref/intervals.txt contains no intervals that could be parsed. 

The intervals.txt file is formatted like this:

chr1:14363-14829
chr1:14970-15038
chr1:15796-15947
chr1:16607-16765
chr1:16858-17055
chr1:17233-17368
chr1:17606-17742
chr1:69091-70008

I think I followed http://www.broadinstitute.org/cancer/cga/mutect_run exactly, but could not figure out what is wrong here.
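
A likely reading of the first error, judging from the "Command not found" line: an unquoted `;` is a command separator in the shell, so MuTect receives only the part of the command before it, and the shell then tries to run `chr1:14970-15038` as a separate command. GATK-based tools accept the interval argument more than once, so one way around the problem is to pass each interval with its own flag. The sketch below builds such an argument list in Python; `parse_interval` and `build_interval_args` are hypothetical helper names, and the rest of the MuTect command is left out.

```python
import re

# GATK-style interval string: contig:start-stop.
INTERVAL_RE = re.compile(r"^([^:\s]+):(\d+)-(\d+)$")


def parse_interval(text):
    """Parse 'chr1:14363-14829' into (contig, start, stop); raise on bad input."""
    match = INTERVAL_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a contig:start-stop interval: {text!r}")
    contig, start, stop = match.group(1), int(match.group(2)), int(match.group(3))
    if start > stop:
        raise ValueError(f"start after stop in interval: {text!r}")
    return contig, start, stop


def build_interval_args(intervals):
    """Return one '--intervals' flag per interval, as separate argv entries."""
    args = []
    for text in intervals:
        contig, start, stop = parse_interval(text)
        args += ["--intervals", f"{contig}:{start}-{stop}"]
    return args


# Split the semicolon-joined string in Python instead of handing it to the shell.
requested = "chr1:14363-14829;chr1:14970-15038".split(";")
interval_args = build_interval_args(requested)
print(interval_args)
```

A list like this can be appended to the rest of the command and run with `subprocess.run(...)` without `shell=True`, so no shell ever sees the `;`. On the command line itself, quoting the value or repeating `--intervals` should have the same effect. The interval-file warning is a separate issue that this sketch does not diagnose. Things worth checking include the file extension (GATK documentation describes `.intervals` and `.list` for this format), the line endings, and whether the contig names match the reference.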


Created 2013-09-26 23:57:44 | Updated | Tags: mutect
Comments (0)

I only have BAM files for normal samples, and I want to run MuTect on them and get the LOD scores. I have tried the following:

  1. Run the normal bam as if it is tumor and check the t_lod_fstar score
  2. Run the normal bam as if it is tumor and use another normal bam as normal, and check the t_lod_fstar score
  3. Run the other normal bam as tumor and the normal bam of interest as normal, and check init_n_lod score

Would any of the above approaches give the same score as running this normal bam with its matched tumor?

Thanks!


Created 2013-09-21 10:39:01 | Updated 2013-09-24 14:44:04 | Tags: mutect
Comments (2)

When I run

java -Xmx2g -jar muTect-1.1.4.jar --analysis_type MuTect --reference_sequence human_g1k_v37.fasta --dbsnp dbsnp_132_b37.leftAligned.vcf --cosmic b37_cosmic_v54_120711.vcf --intervals 21:7577100-7577200 --input_file:normal CEUTrio.HiSeq.WEx.b37_decoy.NA12891.clean.dedup.recal.20120117.bam --input_file:tumor CEUTrio.HiSeq.WEx.b37_decoy.NA12878.clean.dedup.recal.20120117.bam --out example.call_stats.txt --coverage_file example.coverage.wig.txt

no error occurs, but the output is

NSRuntimeProfile - Input   time:    0.0 s ( 0.66%) 
INFO  18:25:46,687 NSRuntimeProfile - Map     time:    0.0 s ( 0.04%) 
INFO  18:25:46,688 NSRuntimeProfile - Reduce  time:    0.0 s ( 0.01%) 
INFO  18:25:46,688 NSRuntimeProfile - Outside time:    1.6 s (99.29%) 
INFO  18:25:51,099 GATKRunReport - Uploaded run statistics report to AWS S3 

What does this mean?


Created 2013-07-25 06:16:13 | Updated | Tags: mutect
Comments (1)

The GATK MuTect page mentions: "We currently use cutoffs of at least 14 reads in the tumor and at least 8 in the normal". How can I change these values? Are there specific arguments for this? I would like to lower them, because some of my exome samples contain poorly covered regions that nonetheless have good base and mapping quality. These mutations are visible in IGV, but MuTect REJECTs them. (I can accept more false positives, so that is not a concern.)

Also, is it possible to tell MuTect not to take strand bias into consideration?

Thanks
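
To make the quoted cutoffs concrete, here is a minimal sketch that treats them as plain read-depth thresholds and shows how lowering them changes which sites count as covered. This is an illustration only: MuTect's internal coverage logic may be more involved, the relaxed values (8 and 4) and the per-site depths are hypothetical, and nothing here shows how to change MuTect's own settings.

```python
# Cutoffs quoted from the MuTect page; treated here as plain depth thresholds.
MIN_TUMOR_READS = 14
MIN_NORMAL_READS = 8


def is_covered(tumor_depth, normal_depth,
               min_tumor=MIN_TUMOR_READS, min_normal=MIN_NORMAL_READS):
    """Return True when both samples meet their minimum read depth."""
    return tumor_depth >= min_tumor and normal_depth >= min_normal


# Hypothetical depths for three sites: (tumor, normal).
sites = {"siteA": (30, 12), "siteB": (10, 12), "siteC": (30, 5)}

for name, (t_depth, n_depth) in sites.items():
    default_call = is_covered(t_depth, n_depth)
    # Relaxed cutoffs chosen only for illustration.
    relaxed_call = is_covered(t_depth, n_depth, min_tumor=8, min_normal=4)
    print(name, default_call, relaxed_call)
```

A check like this can be run over the read counts in a call stats file to see how many rejected sites fall just under the depth cutoffs before deciding whether relaxing them is worthwhile.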


Created 2013-04-11 22:33:51 | Updated | Tags: mutect java malformedvcf
Comments (6)

In addition to the standard MuTect output, I'm interested in VCF output, and was happy to find
a previous related question showing how to output VCF. However, I seem to be having some
trouble with what I think is malformed output. Specifically, the genotype field is "0" for
normal and "0/1" for tumor on every line:

#CHROM  POS       ID           REF  ALT  QUAL  FILTER  INFO               FORMAT             normal  tumor  
7       55230840  rs7781264    A    G    .     REJECT  DB                 GT:AD:BQ:DP:FA     0:0,1:.:1:1.00                0/1:0,27:29:27:1.00           
7       55233109  rs150899403  G    A    .     PASS    DB;SOMATIC;VT=SNP  GT:AD:BQ:DP:FA:SS  0:134,0:.:132:0.00:0          0/1:380,296:24:688:0.438:2    
7       55233265  .            A    C    .     REJECT  .                  GT:AD:BQ:DP:FA     0:6,0:.:6:0.00                0/1:251,24:12:275:0.087       

The corresponding lines in the mutect output file are

## muTector v1.0.47986
contig  position  context  ref_allele  alt_allele  tumor_name   normal_name   score  dbsnp_site    covered    power     tumor_power  normal_power  total_pairs  improper_pairs  map_Q0_reads  t_lod_fstar  tumor_f   contaminant_fraction  contaminant_lod  t_ref_count  t_alt_count  t_ref_sum  t_alt_sum  t_ref_max_mapq  t_alt_max_mapq  t_ins_count  t_del_count  normal_best_gt  init_n_lod   n_ref_count  n_alt_count  n_ref_sum  n_alt_sum  judgement
7       55230840  ACTxTGC  A           G           tumor        normal        0      DBSNP         UNCOVERED  0         0.612407     0             30           1               0             93.647278    1         0.02                  -0.236654        0            27           0          808        0               70              0            0            GG              -3.882263    0            1            0          30         REJECT 
7       55233109  TGTxCCA  G           A           tumor        normal        0      DBSNP+COSMIC  COVERED    1         1            1             1140         3               7             665.64967    0.43787   0.02                  28.941878        380          296          10901      7263       70              70              0            0            GG              40.298595    134          0            4097       0          KEEP   
7       55233265  CCCxCAG  A           C           tumor        normal        0      NOVEL         UNCOVERED  0         1            0             305          5               0             8.112745     0.087273  0.02                  2.430961         251          24           6289       302        70              70              0            0            AA              1.803681     6            0            154        0          REJECT 

If it matters, this was with openjdk 1.6:

$ /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/bin/java -version
java version "1.6.0_24"
OpenJDK Runtime Environment (IcedTea6 1.11.5) (rhel-1.50.1.11.5.el6_3-x86_64)
OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode)

Any idea what might be causing this, and is there anything you or I can do to fix it?

Thanks, Kevin
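
One way to cross-check the two outputs is to recompute the tumor allele fraction from `t_ref_count` and `t_alt_count` in the call stats and compare it with `tumor_f` (and with the `FA` field in the VCF). The sketch below assumes the fraction is alt / (ref + alt). That assumption agrees with the three rows shown above but is not taken from MuTect documentation, and the function name `allele_fraction` is made up for this example.

```python
def allele_fraction(ref_count, alt_count):
    """Fraction of reads supporting the alt allele; 0.0 when there are no reads."""
    total = ref_count + alt_count
    return alt_count / total if total else 0.0


# (position, t_ref_count, t_alt_count, tumor_f as reported in the call stats)
rows = [
    (55230840, 0, 27, 1.0),
    (55233109, 380, 296, 0.43787),
    (55233265, 251, 24, 0.087273),
]

checks = []
for position, t_ref, t_alt, reported in rows:
    computed = allele_fraction(t_ref, t_alt)
    # Reported values are rounded, so compare with a small tolerance.
    agrees = abs(computed - reported) < 1e-5
    checks.append(agrees)
    print(position, round(computed, 6), agrees)
```

If the recomputed fractions agree with the reported ones, as they do for these rows, the counts in the two files are at least consistent with each other. That narrows the question to how the GT field is assigned rather than to the underlying read counts.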


Created 2013-02-20 16:06:16 | Updated | Tags: bundle mutect dbsnp cosmic
Comments (55)

I'm having trouble finding the recommended COSMIC and dbSNP files for hg19 to use with MuTect (hg19_cosmic_v54_120711.vcf and dbsnp_132_b37.leftAligned.vcf). I can't find these in any of the bundles on the GATK public FTP site. I see a dbSNP file called dbsnp_132_b37.vcf; is this the same? I don't see any COSMIC file at all. I'm currently using bundle 2.3 for hg19 for the dbSNP files (and the standard indels from 1000G and Mills for indel realignment). Thanks!