Walks along all variant ROD loci, caching a user-defined window of VariantContext sites, and then finishes phasing them when they go out of range (using upstream and downstream reads).
Performs physical phasing of SNP calls, based on sequencing reads.
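The windowed caching described above can be illustrated with a minimal sketch. It is not the tool's implementation: it tracks only site positions (the real walker also caches reads and uses upstream sites), and the function name `windowed_sites` and its parameters are hypothetical names chosen for this example.

```python
from collections import deque


def windowed_sites(positions, window_size=20000):
    """Yield (finished_site, downstream_neighbors) for sorted positions.

    Hypothetical helper: a site is treated as finished once the newest
    site seen lies more than window_size bases downstream of it.
    """
    cache = deque()
    for pos in positions:
        cache.append(pos)
        # Flush every cached site that is now out of range of the newest site.
        while cache and pos - cache[0] > window_size:
            done = cache.popleft()
            yield done, [p for p in cache if p - done <= window_size]
    # End of input: flush whatever is still cached.
    while cache:
        done = cache.popleft()
        yield done, [p for p in cache if p - done <= window_size]


for site, neighbors in windowed_sites([100, 5000, 19000, 30000, 60000]):
    print(site, neighbors)
```

Each site is emitted together with the cached downstream sites that fall within the window, which is the set a phasing step could pair it with in this simplified model.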
Input: a VCF file of SNP calls and a BAM file of sequence reads.
Output: a phased VCF file.
java -jar GenomeAnalysisTK.jar -T ReadBackedPhasing -R reference.fasta -I reads.bam --variant SNPs.vcf -L SNPs.vcf -o phased_SNPs.vcf --phaseQualityThresh 20.0
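When the tool is driven from a script, the same command can be assembled as an argument list. The sketch below uses only the flags shown in the example command; the helper name `build_phasing_command` and its parameter names are assumptions made for illustration.

```python
def build_phasing_command(reference, reads, variants, output,
                          phase_quality_thresh=20.0):
    """Return the example ReadBackedPhasing command as an argument list.

    Hypothetical helper; the flags are copied from the example command.
    """
    return [
        "java", "-jar", "GenomeAnalysisTK.jar",
        "-T", "ReadBackedPhasing",
        "-R", reference,
        "-I", reads,
        "--variant", variants,
        # Restrict processing to the variant sites themselves, as in the example.
        "-L", variants,
        "-o", output,
        "--phaseQualityThresh", str(phase_quality_thresh),
    ]


cmd = build_phasing_command("reference.fasta", "reads.bam",
                            "SNPs.vcf", "phased_SNPs.vcf")
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`; keeping the arguments as a list avoids shell quoting problems with file paths.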
These Read Filters are automatically applied to the data by the Engine before processing by ReadBackedPhasing.
This tool applies the following downsampling settings by default.
The arguments described in the entries below can be supplied to this tool to modify its behavior. For example, the -L argument directs the GATK engine to restrict processing to specific genomic intervals (this is an Engine capability and is therefore available to all GATK walkers).
This table summarizes the command-line arguments that are specific to this tool. For more details on each argument, see the list below the table or click on an argument name to jump directly to that entry in the list.
|Argument name(s)||Default value||Summary|
||NA||Input VCF file|
||stdout||File to which variants should be written|
||20000||The window size (in bases) to cache variant sites and their reads for the phasing procedure|
||1||The maximum reference-genome distance between consecutive heterozygous sites to permit merging phased VCF records into a MNP record|
||10||The maximum number of successive heterozygous sites permitted to be used by the phasing algorithm|
||17||Minimum base quality required to consider a base for phasing|
||20||Minimum read mapping quality required to consider a read for phasing|
||20.0||The minimum phasing quality score required to output phasing|
||NA||Only include these samples when phasing|
||false||If specified, print out very verbose debug information (if -l DEBUG is also specified)|
||false||Merge consecutive phased sites into MNP records|
Arguments in this list are specific to this tool. Keep in mind that other arguments are available that are shared with other tools (e.g. command-line GATK arguments); see Inherited arguments above.
The window size (in bases) to cache variant sites and their reads for the phasing procedure
Integer 20000 [ [ -∞ ∞ ] ]
If specified, print out very verbose debug information (if -l DEBUG is also specified)
Merge consecutive phased sites into MNP records
The maximum reference-genome distance between consecutive heterozygous sites to permit merging phased VCF records into a MNP record
int 1 [ [ -∞ ∞ ] ]
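The distance rule for MNP merging can be sketched as a grouping step over site positions. This is an illustrative assumption about how such a rule could be applied, not the tool's code: it presumes the sites are already phased heterozygous calls and that merging has been enabled, and `group_for_mnp` is a hypothetical name.

```python
def group_for_mnp(positions, max_distance=1):
    """Group site positions whose consecutive gaps are <= max_distance.

    Hypothetical helper: each returned group is a candidate for merging
    into a single MNP record.
    """
    groups = []
    for pos in sorted(positions):
        if groups and pos - groups[-1][-1] <= max_distance:
            # Close enough to the previous site: extend the current group.
            groups[-1].append(pos)
        else:
            # Too far away (or first site): start a new group.
            groups.append([pos])
    return groups


print(group_for_mnp([100, 101, 102, 110, 111, 200]))
```

With the default distance of 1, only directly adjacent sites end up in the same group; a larger value would allow merging across intervening reference bases.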
The maximum number of successive heterozygous sites permitted to be used by the phasing algorithm
Integer 10 [ [ -∞ ∞ ] ]
Minimum base quality required to consider a base for phasing
int 17 [ [ -∞ ∞ ] ]
Minimum read mapping quality required to consider a read for phasing
int 20 [ [ -∞ ∞ ] ]
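The two quality thresholds above act as a filter on which observed bases may contribute to phasing. A minimal sketch of that filter follows, using the documented defaults of 17 and 20; the `Observation` type and the `usable_for_phasing` function are hypothetical, and treating the thresholds as inclusive is an assumption based on the word "minimum".

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """One base observed in one read at a variant site (hypothetical type)."""
    base: str
    base_quality: int
    mapping_quality: int


def usable_for_phasing(obs, min_base_quality=17, min_mapping_quality=20):
    """Return True if both qualities meet their minimums (assumed inclusive)."""
    return (obs.base_quality >= min_base_quality
            and obs.mapping_quality >= min_mapping_quality)


observations = [
    Observation("A", 30, 60),  # passes both thresholds
    Observation("G", 10, 60),  # base quality too low
    Observation("A", 30, 5),   # mapping quality too low
]
kept = [o for o in observations if usable_for_phasing(o)]
print(len(kept))
```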
File to which variants should be written
The minimum phasing quality score required to output phasing
Double 20.0 [ [ -∞ ∞ ] ]
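To give a feel for what a threshold of 20.0 means, the sketch below assumes the phasing quality is Phred-scaled, that is, Q = -10 * log10(P(phase is wrong)). That scaling is an assumption not stated in this document, and both function names are hypothetical.

```python
def phred_to_error_probability(quality):
    """Convert a Phred-scaled quality to an error probability (assumed scale)."""
    return 10 ** (-quality / 10)


def passes_threshold(phasing_quality, threshold=20.0):
    """Return True if the phasing quality meets the minimum threshold."""
    return phasing_quality >= threshold


for pq in (10.0, 20.0, 30.0):
    print(pq, phred_to_error_probability(pq), passes_threshold(pq))
```

Under that assumption, the default threshold of 20.0 corresponds to accepting phasing with at most a 1 in 100 chance of being wrong.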
Only include these samples when phasing
Input VCF file
Variants from this VCF file are used by this tool as input. The file must at least contain the standard VCF header lines, but can be empty (i.e., no variants are contained in the file).
GATK version 3.0-0-g6bad1c6 built at 2014/03/06 06:38:04. GTD: NA