ApplyRecalibration

Applies cuts to the input VCF file (by adding filter lines) to achieve the desired novel truth sensitivity levels that were specified during VariantRecalibration.

Category: Variant Discovery Tools

Traversal: LocusWalker

PartitionBy: LOCUS


Overview

Using the tranches file generated by the previous step (VariantRecalibrator), the ApplyRecalibration walker looks at each variant's VQSLOD value and decides which tranche it falls in. Variants in tranches that fall below the specified truth sensitivity filter level have their filter field annotated with the tranche level. The result is a call set that is filtered to the desired level but still carries the information needed to pull out more variants at a higher sensitivity and a slightly lower quality level.
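The thresholding idea described above can be sketched in a few lines of shell. This is an illustration only, not the walker's actual implementation: it assumes each record already carries a VQSLOD value in its INFO column, and the helper name `apply_cutoff`, the cutoff `0.0`, and the filter name `LowVQSLOD` are all hypothetical. The real tool derives its cutoffs and filter names from the tranches file.

```shell
# Sketch only: label each record PASS or with a filter name based on VQSLOD.
# apply_cutoff, the cutoff value, and the filter name are hypothetical.
apply_cutoff() {
  awk -F'\t' -v OFS='\t' -v cutoff="$1" -v name="$2" '
    /^#/ { print; next }                 # header lines pass through unchanged
    {
      vqslod = ""
      n = split($8, info, ";")           # INFO is column 8 of a VCF record
      for (i = 1; i <= n; i++)
        if (info[i] ~ /^VQSLOD=/) vqslod = substr(info[i], 8)
      if (vqslod != "")                  # FILTER is column 7
        $7 = (vqslod + 0 >= cutoff + 0) ? "PASS" : name
      print
    }'
}

# Two made-up, tab-separated records for demonstration.
filtered=$(
  {
    printf '##fileformat=VCFv4.1\n'
    printf '20\t100\t.\tA\tG\t50\t.\tDP=10;VQSLOD=4.20\n'
    printf '20\t200\t.\tC\tT\t50\t.\tDP=8;VQSLOD=-1.50\n'
  } | apply_cutoff 0.0 LowVQSLOD
)
printf '%s\n' "$filtered"
```

With the made-up records, the variant at position 100 is marked PASS and the one at position 200 receives the hypothetical filter name, so it stays in the file and can be recovered later by choosing a looser cutoff.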

Input

The input raw variants to be recalibrated.

The recalibration table file in VCF format that was generated by the VariantRecalibrator walker.

The tranches file that was generated by the VariantRecalibrator walker.

Output

A recalibrated VCF file in which each variant is annotated with its VQSLOD and filtered if the score is below the desired quality level.
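A quick way to inspect such an output file is to tally the values in its FILTER column. The sketch below assumes a standard tab-separated VCF; the helper name `count_filters` and the filter name `DemoTranche` in the demo data are hypothetical.

```shell
# Sketch: count how many records carry each FILTER value (column 7).
# count_filters is a hypothetical helper; it reads files given as
# arguments, or standard input when none are given.
count_filters() {
  awk -F'\t' '!/^#/ { n[$7]++ } END { for (f in n) print f "\t" n[f] }' "$@" | sort
}

# Three made-up records for demonstration.
tally=$(
  {
    printf '##fileformat=VCFv4.1\n'
    printf '20\t100\t.\tA\tG\t50\tPASS\tVQSLOD=4.20\n'
    printf '20\t200\t.\tC\tT\t50\tDemoTranche\tVQSLOD=-1.50\n'
    printf '20\t300\t.\tG\tA\t50\tPASS\tVQSLOD=2.10\n'
  } | count_filters
)
printf '%s\n' "$tally"
```

On a real file the call would be `count_filters path/to/output.recalibrated.filtered.vcf`, using the output path from the example command.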

Examples

 java -Xmx3g -jar GenomeAnalysisTK.jar \
   -T ApplyRecalibration \
   -R reference/human_g1k_v37.fasta \
   -input NA12878.HiSeq.WGS.bwa.cleaned.raw.subset.b37.vcf \
   --ts_filter_level 99.0 \
   -tranchesFile path/to/output.tranches \
   -recalFile path/to/output.recal \
   -mode SNP \
   -o path/to/output.recalibrated.filtered.vcf
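
The same command can be parameterized by mode and filter level. The sketch below only prints the command line rather than running it, so it works without GATK installed. It uses only the flags shown in the example above; `build_cmd` is a hypothetical helper and every path is a placeholder. In practice the tranches and recal files would need to come from a matching VariantRecalibrator run.

```shell
# Dry run: print an ApplyRecalibration command line instead of executing it.
# build_cmd is a hypothetical helper; all paths are placeholders.
build_cmd() {
  mode="$1"    # SNP, INDEL, or BOTH
  level="$2"   # truth sensitivity filter level, e.g. 99.0
  echo "java -Xmx3g -jar GenomeAnalysisTK.jar" \
    "-T ApplyRecalibration" \
    "-R reference/human_g1k_v37.fasta" \
    "-input path/to/input.vcf" \
    "--ts_filter_level $level" \
    "-tranchesFile path/to/output.tranches" \
    "-recalFile path/to/output.recal" \
    "-mode $mode" \
    "-o path/to/output.recalibrated.filtered.vcf"
}

cmd=$(build_cmd INDEL 99.0)
echo "$cmd"
```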
 

Additional Information

Read filters

These Read Filters are automatically applied to the data by the Engine before processing by ApplyRecalibration.

Parallelism options

This tool can be run in multi-threaded mode using this option.


Command-line Arguments

Inherited arguments

The arguments described in the entries below can be supplied to this tool to modify its behavior. For example, the -L argument directs the GATK engine to restrict processing to specific genomic intervals (this is an Engine capability and is therefore available to all GATK walkers).

ApplyRecalibration specific arguments

This table summarizes the command-line arguments that are specific to this tool. For details, see the list further down below the table.

Name | Type | Default value | Summary
Required
--input | List[RodBinding[VariantContext]] | NA | The raw input variants to be recalibrated
--recal_file | RodBinding[VariantContext] | NA | The input recal file used by ApplyRecalibration
--tranches_file | File | NA | The input tranches file describing where to cut the data
Optional
--ignore_filter | String[] | NA | If specified the variant recalibrator will use variants even if the specified filter name is marked in the input VCF file
--mode | Mode | SNP | Recalibration mode to employ: 1.) SNP for recalibrating only SNPs (emitting indels untouched in the output VCF); 2.) INDEL for indels; and 3.) BOTH for recalibrating both SNPs and indels simultaneously.
--out | VariantContextWriter | stdout | The output filtered and recalibrated VCF file in which each variant is annotated with its VQSLOD value
--ts_filter_level | double | 99.0 | The truth sensitivity level at which to start filtering

Argument details

Arguments in this list are specific to this tool. Keep in mind that other arguments are available that are shared with other tools (e.g. command-line GATK arguments); see Inherited arguments above.

--ignore_filter / -ignoreFilter ( String[] )

If specified, the variant recalibrator will use variants even if the specified filter name is marked in the input VCF file.

--input / -input ( required List[RodBinding[VariantContext]] )

The raw input variants to be recalibrated. These calls should be unfiltered and annotated with the error covariates that are intended to be used for modeling. --input binds reference ordered data. This argument supports ROD files of the following types: BCF2, VCF, VCF3

--mode / -mode ( Mode with default value SNP )

Recalibration mode to employ: 1.) SNP for recalibrating only SNPs (emitting indels untouched in the output VCF); 2.) INDEL for indels; and 3.) BOTH for recalibrating both SNPs and indels simultaneously.
The --mode argument is an enumerated type (Mode), which can have one of the following values:

SNP
INDEL
BOTH

--out / -o ( VariantContextWriter with default value stdout )

The output filtered and recalibrated VCF file in which each variant is annotated with its VQSLOD value.

--recal_file / -recalFile ( required RodBinding[VariantContext] )

The input recal file used by ApplyRecalibration. --recal_file binds reference ordered data. This argument supports ROD files of the following types: BCF2, VCF, VCF3

--tranches_file / -tranchesFile ( required File )

The input tranches file describing where to cut the data.
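
To see what cutoff a given truth sensitivity level corresponds to, the tranches file can be inspected directly. The sketch below rests on an assumption about the file layout that should be checked against a real file: comma-separated rows, `#` comment lines, and a header row with columns named `targetTruthSensitivity` and `minVQSLod`. The helper name `lookup_cutoff` and the demo rows are hypothetical.

```shell
# Sketch: print the minimum VQSLOD recorded for a given truth sensitivity level.
# Assumed layout (verify against a real tranches file): comma-separated,
# '#' comment lines, header row naming targetTruthSensitivity and minVQSLod.
lookup_cutoff() {
  level="$1"
  shift
  awk -F',' -v level="$level" '
    /^#/ { next }                                   # skip comment lines
    !seen_header {                                  # map column names to positions
      for (i = 1; i <= NF; i++) col[$i] = i
      seen_header = 1
      next
    }
    ($(col["targetTruthSensitivity"]) + 0 == level + 0) {
      print $(col["minVQSLod"])
      exit
    }
  ' "$@"
}

# Made-up demo rows, not real tranche values.
cutoff=$(
  {
    printf '# made-up demo data\n'
    printf 'targetTruthSensitivity,minVQSLod,filterName\n'
    printf '90.00,5.1234,DemoTranche90\n'
    printf '99.00,1.2345,DemoTranche99\n'
  } | lookup_cutoff 99.0
)
echo "$cutoff"
```

Looking columns up by header name rather than by position keeps the sketch from depending on the exact column order.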

--ts_filter_level / -ts_filter_level ( double with default value 99.0 )

The truth sensitivity level at which to start filtering.


GATK version 2.5-2-gdb4546e built at 2013/05/01 09:32:36.