
CoveredByNSamplesSites

Print intervals file with all the variant sites for which most of the samples have good coverage

Category: Diagnostics and Quality Control Tools

Traversal: LocusWalker

PartitionBy: LOCUS


Overview

CoveredByNSamplesSites is a GATK tool that selects variant sites based on their coverage. The sites that pass the filter are written to an intervals file. By default, a site passes when at least 90% of the samples (--percentageOfSamples 0.9) have coverage greater than 10 (--minCoverage 10). Both thresholds can be changed from the command line.
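The per-site decision described above can be sketched in Python. This is a minimal illustration, not the tool's source code: the function name site_passes and the dictionary of per-sample depths are hypothetical, and the boundary handling (strictly greater than minCoverage, at least percentageOfSamples) is an assumption taken from the wording of the argument summaries.

```python
from typing import Mapping


def site_passes(depth_by_sample: Mapping[str, int],
                min_coverage: int = 10,
                percentage_of_samples: float = 0.9) -> bool:
    """Decide whether one site would be emitted (hypothetical helper)."""
    if not depth_by_sample:
        # Assumption: a site with no samples is never emitted.
        return False
    # Count samples whose coverage is bigger than min_coverage.
    covered = sum(1 for depth in depth_by_sample.values()
                  if depth > min_coverage)
    # Emit the site if the covered fraction reaches the threshold.
    return covered / len(depth_by_sample) >= percentage_of_samples


depths = {"s1": 25, "s2": 18, "s3": 11, "s4": 4}
print(site_passes(depths))                              # False (3 of 4 = 0.75 < 0.9)
print(site_passes(depths, percentage_of_samples=0.75))  # True
```

Raising -minCov or -percentage makes the filter stricter, so fewer sites reach the output intervals file.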

Input

A variant file and, optionally, minimum coverage and sample percentage values.

Output

An intervals file.

Example

 java -Xmx2g -jar GenomeAnalysisTK.jar \
   -R ref.fasta \
   -T CoveredByNSamplesSites \
   -V input.vcf \
   -out output.intervals \
   -minCov 15
 

Additional Information

Read filters

These Read Filters are automatically applied to the data by the Engine before processing by CoveredByNSamplesSites.

Parallelism options

This tool can be run in multi-threaded mode using this option.

Downsampling settings

This tool applies the following downsampling settings by default.

  • Mode: BY_SAMPLE
  • To coverage: 1,000
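As a rough illustration of what a BY_SAMPLE cap means, the Python sketch below limits each sample's reads at a locus to a target count independently of the other samples. It is not the engine's actual downsampling algorithm: the function name, the data layout and the use of uniform random sampling are all assumptions made for the example.

```python
import random
from typing import Dict, List, Optional


def downsample_by_sample(reads_by_sample: Dict[str, List[str]],
                         to_coverage: int = 1000,
                         seed: Optional[int] = None) -> Dict[str, List[str]]:
    """Cap each sample's reads at to_coverage (hypothetical helper)."""
    rng = random.Random(seed)
    result = {}
    for sample, reads in reads_by_sample.items():
        if len(reads) <= to_coverage:
            # Samples at or below the cap keep all of their reads.
            result[sample] = list(reads)
        else:
            # Samples above the cap keep a random subset of the target size.
            result[sample] = rng.sample(reads, to_coverage)
    return result


pileup = {
    "s1": [f"read{i}" for i in range(2500)],
    "s2": ["readA", "readB", "readC"],
}
capped = downsample_by_sample(pileup, to_coverage=1000, seed=0)
print({sample: len(reads) for sample, reads in capped.items()})
# {'s1': 1000, 's2': 3}
```

The point of the per-sample mode is that a deeply sequenced sample is reduced without removing reads from a shallow one.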

Command-line Arguments

Inherited arguments

The arguments described in the entries below can be supplied to this tool to modify its behavior. For example, the -L argument directs the GATK engine to restrict processing to specific genomic intervals (this is an Engine capability and is therefore available to all GATK walkers).

CoveredByNSamplesSites specific arguments

This table summarizes the command-line arguments that are specific to this tool. For more details on each argument, see the Argument details list below the table.

Argument name(s)                    Default value  Summary

Required Inputs
--variant, -V                       NA             Input VCF file

Optional Outputs
--OutputIntervals, -out             stdout         Name of file for output intervals

Optional Parameters
--minCoverage, -minCov              10             only samples that have coverage bigger than minCoverage will be counted
--percentageOfSamples, -percentage  0.9            only sites where at least percentageOfSamples of the samples have good coverage, will be emitted

Argument details

Arguments in this list are specific to this tool. Keep in mind that other arguments are available that are shared with other tools (e.g. command-line GATK arguments); see Inherited arguments above.


--minCoverage / -minCov

only samples that have coverage bigger than minCoverage will be counted

int  10  [ [ -∞  ∞ ] ]


--OutputIntervals / -out

Name of file for output intervals

PrintStream  stdout


--percentageOfSamples / -percentage

only sites where at least percentageOfSamples of the samples have good coverage, will be emitted

double  0.9  [ [ -∞  ∞ ] ]


--variant / -V

Input VCF file
Variants from this VCF file are used by this tool as input. The file must at least contain the standard VCF header lines, but can be empty (i.e., no variants are contained in the file).

--variant binds reference ordered data. This argument supports ROD files of the following types: BCF2, VCF, VCF3

R RodBinding[VariantContext]



GATK version 3.1-1-g07a4bf8 built at 2014/03/18 07:00:36. GTD: NA