Applies cuts to the input VCF file (by adding filter lines) to achieve the desired novel truth sensitivity levels that were specified during VariantRecalibration.
Using the tranche file generated by the previous step, the ApplyRecalibration walker looks at each variant's VQSLOD value and decides which tranche it falls in. Variants in tranches that fall below the specified truth sensitivity filter level have their FILTER field annotated with the tranche level. The result is a call set that is filtered to the desired level but still carries the information needed to pull out more variants at a higher sensitivity and a slightly lower quality level.
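The tranche lookup described above can be sketched in a few lines of Python. This is only an illustration, not the walker's actual implementation: the function name `assign_filter`, the (truth sensitivity, minimum VQSLOD) tuple layout, and the `Tranche...` label format are all assumptions made for the example, and the sketch assumes that the minimum VQSLOD decreases as the truth sensitivity increases.

```python
from typing import List, Tuple


def assign_filter(vqslod: float,
                  tranches: List[Tuple[float, float]],
                  ts_filter_level: float = 99.0) -> str:
    """Return "PASS" or a tranche label for one variant.

    tranches holds hypothetical (truth_sensitivity_percent, min_vqslod)
    pairs; the label format below is illustrative only.
    """
    # Ascending sensitivity, which is assumed to mean descending min VQSLOD.
    ordered = sorted(tranches)
    prev_ts = 0.0
    for ts, min_vqslod in ordered:
        if vqslod >= min_vqslod:
            # The variant falls in this tranche.
            if ts <= ts_filter_level:
                return "PASS"
            return f"Tranche{prev_ts:.2f}to{ts:.2f}"
        prev_ts = ts
    # The score is below the cutoff of every tranche.
    return f"Tranche{prev_ts:.2f}+"


# Made-up tranche cutoffs, for illustration only.
example_tranches = [(90.0, 5.0), (99.0, 2.0), (99.9, -1.0), (100.0, -10.0)]
for score in (6.0, 3.0, 0.0, -20.0):
    print(score, assign_filter(score, example_tranches))
```

With the made-up cutoffs above, scores of 6.0 and 3.0 pass, a score of 0.0 is labeled with the 99.00 to 99.90 tranche, and a score of -20.0 falls below every tranche. Raising `ts_filter_level` moves more tranches into the passing set without recomputing any scores.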
The input raw variants to be recalibrated.
The recalibration table file in VCF format that was generated by the VariantRecalibrator walker.
The tranches file that was generated by the VariantRecalibrator walker.
A recalibrated VCF file in which each variant is annotated with its VQSLOD and filtered if the score is below the desired quality level.
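Because the output is an ordinary VCF, the FILTER column and the VQSLOD annotation in the INFO column can be read back with a short script. The following is a minimal sketch that assumes tab-separated VCF data lines; the helper names `parse_record` and `summarize_filters`, the sample records, and the tranche label shown in them are hypothetical.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple


def parse_record(line: str) -> Tuple[str, Optional[float]]:
    """Return (FILTER, VQSLOD) for one tab-separated VCF data line."""
    fields = line.rstrip("\n").split("\t")
    filter_field = fields[6]          # column 7 of a VCF record is FILTER
    vqslod = None
    for entry in fields[7].split(";"):  # column 8 is INFO
        key, _, value = entry.partition("=")
        if key == "VQSLOD":
            vqslod = float(value)
    return filter_field, vqslod


def summarize_filters(lines: Iterable[str]) -> Counter:
    """Count records per FILTER value, skipping header lines."""
    counts: Counter = Counter()
    for line in lines:
        if line.startswith("#"):
            continue
        filter_field, _ = parse_record(line)
        counts[filter_field] += 1
    return counts


# Hypothetical records; the tranche label is illustrative only.
sample = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t1000\t.\tA\tG\t50\tPASS\tDP=20;VQSLOD=4.25",
    "1\t2000\t.\tC\tT\t30\tTranche99.00to99.90\tDP=8;VQSLOD=-0.5",
]
print(summarize_filters(sample))
```

A summary like this gives a quick check of how many calls pass at the chosen truth sensitivity level and how many were assigned to each lower-quality tranche.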
java -Xmx3g -jar GenomeAnalysisTK.jar \
   -T ApplyRecalibration \
   -R reference/human_g1k_v37.fasta \
   -input NA12878.HiSeq.WGS.bwa.cleaned.raw.subset.b37.vcf \
   --ts_filter_level 99.0 \
   -tranchesFile path/to/output.tranches \
   -recalFile path/to/output.recal \
   -mode SNP \
   -o path/to/output.recalibrated.filtered.vcf
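When the command above is launched from a script, it can help to assemble it as an argument list rather than a single string. The sketch below uses only the flags shown in the example command; the function name `build_apply_recalibration_command` and its parameters are hypothetical, and the paths are the placeholder paths from the example.

```python
from typing import List


def build_apply_recalibration_command(reference: str,
                                      input_vcf: str,
                                      tranches_file: str,
                                      recal_file: str,
                                      output_vcf: str,
                                      ts_filter_level: float = 99.0,
                                      mode: str = "SNP",
                                      jar: str = "GenomeAnalysisTK.jar",
                                      max_heap: str = "3g") -> List[str]:
    """Assemble the example ApplyRecalibration command as an argv list."""
    if mode not in ("SNP", "INDEL", "BOTH"):
        raise ValueError(f"unsupported mode: {mode}")
    return [
        "java", f"-Xmx{max_heap}", "-jar", jar,
        "-T", "ApplyRecalibration",
        "-R", reference,
        "-input", input_vcf,
        "--ts_filter_level", str(ts_filter_level),
        "-tranchesFile", tranches_file,
        "-recalFile", recal_file,
        "-mode", mode,
        "-o", output_vcf,
    ]


cmd = build_apply_recalibration_command(
    reference="reference/human_g1k_v37.fasta",
    input_vcf="NA12878.HiSeq.WGS.bwa.cleaned.raw.subset.b37.vcf",
    tranches_file="path/to/output.tranches",
    recal_file="path/to/output.recal",
    output_vcf="path/to/output.recalibrated.filtered.vcf",
)
print(" ".join(cmd))
# To run it: import subprocess; subprocess.run(cmd, check=True)
```

Passing a list to `subprocess.run` avoids shell quoting problems with paths, and validating `mode` up front catches typos before the JVM starts.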
These Read Filters are automatically applied to the data by the Engine before processing by ApplyRecalibration.
This tool can be run in multi-threaded mode using this option.
The arguments described in the entries below can be supplied to this tool to modify its behavior. For example, the -L argument directs the GATK engine to restrict processing to specific genomic intervals (this is an Engine capability and is therefore available to all GATK walkers).
This table summarizes the command-line arguments that are specific to this tool. For details, see the list further down below the table.
|Name|Type|Default value|Summary|
|---|---|---|---|
|--input|List[RodBinding[VariantContext]]|NA|The raw input variants to be recalibrated|
|--recal_file|RodBinding[VariantContext]|NA|The input recal file used by ApplyRecalibration|
|--tranches_file|File|NA|The input tranches file describing where to cut the data|
|--ignore_filter|String|NA|If specified, the variant recalibrator will use variants even if the specified filter name is marked in the input VCF file|
|--mode|Mode|SNP|Recalibration mode to employ: 1.) SNP for recalibrating only SNPs (emitting indels untouched in the output VCF); 2.) INDEL for indels; and 3.) BOTH for recalibrating both SNPs and indels simultaneously.|
|--out|VariantContextWriter|stdout|The output filtered and recalibrated VCF file in which each variant is annotated with its VQSLOD value|
|--ts_filter_level|double|99.0|The truth sensitivity level at which to start filtering|
Arguments in this list are specific to this tool. Keep in mind that other arguments are available that are shared with other tools (e.g. command-line GATK arguments); see Inherited arguments above.
If specified, the variant recalibrator will use variants even if the specified filter name is marked in the input VCF file.
The raw input variants to be recalibrated. These calls should be unfiltered and annotated with the error covariates that are intended for use in modeling. --input binds reference-ordered data. This argument supports ROD files of the following types: BCF2, VCF, VCF3.
Recalibration mode to employ: 1.) SNP for recalibrating only SNPs (emitting indels untouched in the output VCF); 2.) INDEL for indels; and 3.) BOTH for recalibrating both SNPs and indels simultaneously.
The --mode argument is an enumerated type (Mode), which can have one of the following values: SNP, INDEL, BOTH.
The output filtered and recalibrated VCF file in which each variant is annotated with its VQSLOD value.
The input tranches file describing where to cut the data.
The truth sensitivity level at which to start filtering.
GATK version 2.5-2-gdb4546e built at 2013/05/01 09:32:36.