AnalyzeCovariates

Tool to analyze and evaluate base recalibration tables.

Category: Diagnostics and Quality Control Tools

Traversal: LocusWalker

PartitionBy: LOCUS


Overview

Currently, the tool generates a report of plots for assessing the quality of a recalibration.

Input

The tool can take up to three different sets of recalibration tables. The resulting plots will be overlaid on top of each other to make comparisons easy.
Set           Argument  Label   Color    Description
Original      -before   BEFORE  Maroon1  First pass recalibration tables obtained from applying BaseRecalibrator on the original alignment.
Recalibrated  -after    AFTER   Blue     Second pass recalibration tables resulting from applying BaseRecalibrator on the alignment recalibrated using the first pass tables.
Input         -BQSR     BQSR    Black    Any recalibration table without a specific role.

You need to specify at least one set. Multiple sets must have the same values for the following parameters:

covariate (order is not important), no_standard_covs, run_without_dbsnp, solid_recal_mode, solid_nocall_strategy, mismatches_context_size, mismatches_default_quality, deletions_default_quality, insertions_default_quality, maximum_cycle_value, low_quality_tail, default_platform, force_platform, quantizing_levels and binary_tag_name
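
One way to check this requirement before running the tool is to compare the argument values recorded in each recalibration table. The following is a minimal sketch, not part of GATK. It assumes that each table file contains an "Arguments" section that starts at a line beginning with `#:GATKTable:Arguments` and ends at the first blank line. The function names `recal_args` and `same_recal_args` are hypothetical.

```shell
# Hypothetical helpers (not part of GATK).
# Assumption: the arguments section of a recalibration table starts at a line
# beginning with "#:GATKTable:Arguments" and ends at the first blank line.

# Print the rows of the arguments section, whitespace-normalized and sorted.
recal_args() {
    awk '
        /^#:GATKTable:Arguments/ { in_table = 1; next }
        in_table && NF == 0      { exit }
        in_table                 { $1 = $1; print }
    ' "$1" | sort
}

# Succeed (exit status 0) only when both tables record identical argument values.
same_recal_args() {
    [ "$(recal_args "$1")" = "$(recal_args "$2")" ]
}
```

For example, `same_recal_args recal2.table recal3.table || echo "parameters differ" >&2`. This sketch compares values as plain text, so two tables that list the same covariates in a different order are reported as different even though the tool accepts them.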

Output

Currently this tool generates two outputs:

  • -plots my-report.pdf: a PDF document containing plots to assess the quality of the recalibration.
  • -csv my-report.csv: a CSV file containing a table with all the data required to generate those plots.

You need to specify at least one of them.

Other Arguments

-ignoreLMT, --ignoreLastModificationTimes

When set, no warning message is displayed if the -before recalibration table file is newer than the -after one.

Examples

Plot a single recalibration table

 java -jar GenomeAnalysisTK.jar \
      -T AnalyzeCovariates \
      -R myreference.fasta \
      -BQSR myrecal.table \
      -plots BQSR.pdf
 

Plot before (first pass) and after (second pass) recalibration table to compare them

 java -jar GenomeAnalysisTK.jar \
      -T AnalyzeCovariates \
      -R myreference.fasta \
      -before recal2.table \
      -after recal3.table \
      -plots recalQC.pdf
 

Plot up to three recalibration tables for comparison


 # You can ignore the before/after semantics completely if you like (if you do, add -ignoreLMT
 # to avoid a possible warning), but all tables should have been generated using the same parameters.
 # Any two of the three table arguments (-BQSR, -before, -after) can be omitted.

 java -jar GenomeAnalysisTK.jar \
      -T AnalyzeCovariates \
      -R myreference.fasta \
      -ignoreLMT \
      -BQSR recal1.table \
      -before recal2.table \
      -after recal3.table \
      -plots myrecals.pdf
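
Because any combination of one to three tables is accepted, a script that produces these reports may need to assemble the command line from whichever tables are available. The following is a minimal sketch of such a helper, not part of GATK. The function name `analyze_covariates_cmd` and its `before=`, `after=` and `bqsr=` argument syntax are hypothetical; only the GATK arguments shown in the examples above are used. It prints the command instead of running it, and it does not quote paths, so it assumes file names without spaces.

```shell
# Hypothetical helper (not part of GATK): print an AnalyzeCovariates command line.
# Usage: analyze_covariates_cmd REFERENCE PLOTS_PDF [before=FILE] [after=FILE] [bqsr=FILE]
analyze_covariates_cmd() {
    local reference="$1"
    local plots="$2"
    shift 2
    local cmd="java -jar GenomeAnalysisTK.jar -T AnalyzeCovariates -R $reference"
    local tables=0
    local spec
    for spec in "$@"; do
        case "$spec" in
            before=*) cmd="$cmd -before ${spec#before=}" ;;
            after=*)  cmd="$cmd -after ${spec#after=}" ;;
            bqsr=*)   cmd="$cmd -BQSR ${spec#bqsr=}" ;;
            *) echo "unknown table spec: $spec" >&2; return 2 ;;
        esac
        tables=$((tables + 1))
    done
    # The tool requires at least one set of recalibration tables.
    if [ "$tables" -eq 0 ]; then
        echo "at least one recalibration table is required" >&2
        return 1
    fi
    printf '%s\n' "$cmd -plots $plots"
}
```

For example, `analyze_covariates_cmd myreference.fasta recalQC.pdf before=recal2.table after=recal3.table` prints a command equivalent to the before/after example above.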
 

Full BQSR quality assessment pipeline

 # Generate the first pass recalibration table file.
 # The -knownSites arguments are optional but recommended.
 java -jar GenomeAnalysisTK.jar \
      -T BaseRecalibrator \
      -R myreference.fasta \
      -I myinput.bam \
      -knownSites bundle/my-trusted-snps.vcf \
      -knownSites bundle/my-trusted-indels.vcf \
      ... other options \
      -o firstpass.table

 # Generate the second pass recalibration table file.
 java -jar GenomeAnalysisTK.jar \
      -T BaseRecalibrator \
      -BQSR firstpass.table \
      -R myreference.fasta \
      -I myinput.bam \
      -knownSites bundle/my-trusted-snps.vcf \
      -knownSites bundle/my-trusted-indels.vcf \
      ... other options \
      -o secondpass.table

 # Finally generate the plots report and also keep a copy of the csv (the -csv argument is optional).
 java -jar GenomeAnalysisTK.jar \
      -T AnalyzeCovariates \
      -R myreference.fasta \
      -before firstpass.table \
      -after secondpass.table \
      -csv BQSR.csv \
      -plots BQSR.pdf
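
When the three pipeline steps are chained in one script, it can help to confirm that each pass actually wrote a table before moving on to the next step. The following is a minimal sketch of such a guard, not part of GATK. The helper name `require_table` is hypothetical, and the check only tests that the file exists and is not empty; it does not validate its contents.

```shell
# Hypothetical helper (not part of GATK): fail unless the given
# recalibration table exists and is not empty.
require_table() {
    if [ ! -s "$1" ]; then
        echo "missing or empty recalibration table: $1" >&2
        return 1
    fi
}
```

For example, running `require_table firstpass.table && require_table secondpass.table` before the AnalyzeCovariates step stops the script early if either pass failed to produce output.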
 

Additional Information

Read filters

These Read Filters are automatically applied to the data by the Engine before processing by AnalyzeCovariates.

Downsampling settings

This tool applies the following downsampling settings by default.

  • Mode: BY_SAMPLE
  • To coverage: 1,000

Command-line Arguments

Inherited arguments

The arguments described in the entries below can be supplied to this tool to modify its behavior. For example, the -L argument directs the GATK engine to restrict processing to specific genomic intervals (this is an Engine capability and is therefore available to all GATK walkers).

AnalyzeCovariates specific arguments

This table summarizes the command-line arguments that are specific to this tool. For more details on each argument, see the list below the table.

Argument name(s)                            Default value  Summary
Optional Inputs
--afterReportFile / -after                  NA             file containing the BQSR second-pass report file
--beforeReportFile / -before                NA             file containing the BQSR first-pass report file
Optional Outputs
--intermediateCsvFile / -csv                NA             location of the csv intermediate file
--plotsReportFile / -plots                  NA             location of the output report
Optional Flags
--ignoreLastModificationTimes / -ignoreLMT  false          do not emit warning messages related to suspicious last modification time order of inputs

Argument details

Arguments in this list are specific to this tool. Keep in mind that other arguments are available that are shared with other tools (e.g. command-line GATK arguments); see Inherited arguments above.


--afterReportFile / -after

file containing the BQSR second-pass report file
File containing the recalibration tables from the second pass.

File


--beforeReportFile / -before

file containing the BQSR first-pass report file
File containing the recalibration tables from the first pass.

File


--ignoreLastModificationTimes / -ignoreLMT

do not emit warning messages related to suspicious last modification time order of inputs
If true, no warning is shown when the last-modification times of the before and after input files suggest that they have been swapped.

boolean  false
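
The same kind of check can be done by hand before running the tool. The following is a minimal sketch, not part of GATK, that reports whether the file intended as the before table was modified more recently than the file intended as the after table, which is the order that suggests a swap. The function name `tables_look_swapped` is hypothetical. It relies on the `-nt` file test, which bash, dash and ksh support but which is not required by POSIX.

```shell
# Hypothetical helper (not part of GATK).
# Succeeds (exit status 0) when the "before" table ($1) is newer than the
# "after" table ($2), i.e. when the two files may have been swapped.
tables_look_swapped() {
    [ "$1" -nt "$2" ]
}
```

For example, `tables_look_swapped firstpass.table secondpass.table && echo "check the -before/-after order" >&2`.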


--intermediateCsvFile / -csv

location of the csv intermediate file
Output csv file name.

File


--plotsReportFile / -plots

location of the output report
Output report file name.

File



GATK version 3.1-1-g07a4bf8 built at 2014/03/18 07:00:36. GTD: NA