Annotates a validation (from Sequenom for example) VCF with QC metrics (HW-equilibrium, % failed probes)
The Variant Validation Assessor is a tool for vetting/assessing validation data (containing genotypes). The tool produces a VCF that is annotated with information pertaining to plate quality control and by default is soft-filtered by high no-call rate or low Hardy-Weinberg probability. If you have .ped files, please first convert them to VCF format.
A validation VCF to annotate.
An annotated VCF. Additionally, a table like the following will be output:
Total number of samples assayed: 185
Total number of records processed: 152
Number of Hardy-Weinberg violations: 34 (22%)
Number of no-call violations: 12 (7%)
Number of homozygous variant violations: 0 (0%)
Number of records passing all filters: 106 (69%)
Number of passing records that are polymorphic: 98 (92%)
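The percentages in this example are consistent with whole-number truncation, with the violation and pass counts taken relative to the number of records processed and the polymorphic count taken relative to the passing records. That reading is inferred from the numbers shown here, not from the tool's source, so treat the following as a minimal sketch; the helper name `pct` is hypothetical.

```python
def pct(count, total):
    """Whole-number percentage, truncated rather than rounded (an assumption)."""
    return (100 * count) // total if total else 0

records = 152   # total number of records processed
passing = 106   # records passing all filters

summary = {
    "Hardy-Weinberg violations": pct(34, records),
    "no-call violations": pct(12, records),
    "homozygous variant violations": pct(0, records),
    "records passing all filters": pct(passing, records),
    # The last line of the table is relative to passing records, not all records.
    "passing records that are polymorphic": pct(98, passing),
}

for label, value in summary.items():
    print(f"{label}: {value}%")
```

In this example the three violation counts (34 + 12 + 0) and the passing count (106) add up to the 152 records processed.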
java -Xmx2g -jar GenomeAnalysisTK.jar \
  -R ref.fasta \
  -T VariantValidationAssessor \
  --variant input.vcf \
  -o output.vcf
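To run with non-default thresholds, the tool-specific arguments documented on this page (--maxHardy, --maxHomVar, --maxNoCall) can be appended to the same command. The following is a small sketch that assembles such a command line from Python; the helper name `build_command` and the threshold values are illustrative assumptions, and the file names are the placeholders from the example above.

```python
import shlex

def build_command(variant, out, reference="ref.fasta", **thresholds):
    """Assemble the example command, appending optional threshold arguments.

    Hypothetical helper: keyword names are passed through as --<name> <value>.
    """
    cmd = [
        "java", "-Xmx2g", "-jar", "GenomeAnalysisTK.jar",
        "-R", reference,
        "-T", "VariantValidationAssessor",
        "--variant", variant,
        "-o", out,
    ]
    for name, value in thresholds.items():
        cmd += [f"--{name}", str(value)]
    return cmd

# Illustrative values only: a looser no-call limit and a looser HW limit.
cmd = build_command("input.vcf", "output.vcf", maxNoCall=0.1, maxHardy=30.0)
print(shlex.join(cmd))  # shlex.join requires Python 3.8 or newer
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where the GATK jar and the input files exist.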
These Read Filters are automatically applied to the data by the Engine before processing by VariantValidationAssessor.
This tool uses a sliding window on the reference.
The arguments described in the entries below can be supplied to this tool to modify its behavior. For example, the -L argument directs the GATK engine to restrict processing to specific genomic intervals (this is an Engine capability and is therefore available to all GATK walkers).
This table summarizes the command-line arguments that are specific to this tool. For details, see the list further down below the table.
|Name|Type|Default value|Summary|
|---|---|---|---|
|--variant|RodBinding[VariantContext]|NA|Input VCF file|
|--maxHardy|double|20.0|Maximum phred-scaled Hardy-Weinberg violation p-value to consider an assay valid|
|--maxHomVar|double|1.1|Maximum homozygous variant rate (as a fraction) to consider an assay valid|
|--maxNoCall|double|0.05|Maximum no-call rate (as a fraction) to consider an assay valid|
|--out|VariantContextWriter|stdout|File to which variants should be written|
Arguments in this list are specific to this tool. Keep in mind that other arguments are available that are shared with other tools (e.g. command-line GATK arguments); see Inherited arguments above.
--maxHardy (double, default 20.0): Maximum phred-scaled Hardy-Weinberg violation p-value to consider an assay valid.
--maxHomVar (double, default 1.1): Maximum homozygous variant rate (as a fraction) to consider an assay valid. To disable, set to a value greater than 1.
--maxNoCall (double, default 0.05): Maximum no-call rate (as a fraction) to consider an assay valid. To disable, set to a value greater than 1.
--out (VariantContextWriter, default stdout): File to which variants should be written.
--variant (RodBinding[VariantContext]): Input VCF file. Variants from this VCF file are used by this tool as input. The file must contain at least the standard VCF header lines, but can be empty (i.e., contain no variants). --variant binds reference-ordered data (ROD). This argument supports ROD files of the following types: BCF2, VCF, VCF3.
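To make the three thresholds concrete, here is a minimal sketch of how per-record checks of this kind could be computed from genotype counts. It is not the tool's implementation: the Hardy-Weinberg p-value here comes from a simple chi-square test with one degree of freedom (the tool may use a different test), the denominators chosen for the two rates are assumptions, and all function names and failure labels are hypothetical. On the phred scale, -10 * log10(p), the default --maxHardy of 20 corresponds to p = 0.01.

```python
import math

def phred(p):
    """Phred-scale a probability: -10 * log10(p), guarded against p == 0."""
    return -10.0 * math.log10(max(p, 1e-300))

def hwe_chi2_pvalue(n_ref, n_het, n_var):
    """Chi-square (1 d.f.) Hardy-Weinberg p-value from called genotype counts.

    Swapped-in technique for illustration; not necessarily the test GATK uses.
    """
    n = n_ref + n_het + n_var
    if n == 0:
        return 1.0
    p = (2 * n_ref + n_het) / (2 * n)   # reference allele frequency
    q = 1.0 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_ref, n_het, n_var)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with one degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))

def assess(n_ref, n_het, n_var, n_nocall,
           max_hardy=20.0, max_hom_var=1.1, max_no_call=0.05):
    """Return the list of failed checks; an empty list means the record passes."""
    total = n_ref + n_het + n_var + n_nocall
    called = total - n_nocall
    failures = []
    # Assumption: no-call rate is measured over all samples.
    if total and n_nocall / total > max_no_call:
        failures.append("no-call")
    # Assumption: homozygous variant rate is measured over called samples.
    if called and n_var / called > max_hom_var:
        failures.append("hom-var")
    if phred(hwe_chi2_pvalue(n_ref, n_het, n_var)) > max_hardy:
        failures.append("hardy-weinberg")
    return failures

print(assess(100, 70, 15, 0))    # genotypes close to HW proportions
print(assess(90, 0, 90, 5))      # no heterozygotes at allele frequency 0.5
print(assess(100, 70, 15, 20))   # roughly 10% of samples not called
```

With the default of 1.1, the homozygous variant check can never fire, since a fraction cannot exceed 1; this matches the note that a value greater than 1 disables the filter.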
GATK version 2.5-2-gdb4546e built at 2013/05/01 09:32:36.