Analyze Annotations


Warning: the material on this page is considered out of date by the GSA team.


Previous users of Analyze Annotations are encouraged to use Variant quality score recalibration instead.

Introduction

To design call set filters from the available variant annotations, it is useful to visualize the quality of a call set as a function of each annotation in turn. For example, if the Ti/Tv ratio falls off dramatically for variants with AB > 0.75, one can decide to filter those variants out. The tool can also show concordance with truth sets as a function of the annotation values.

Using the tool

java -Xmx4g -jar GenomeAnalysisTK.jar \
   -R /seq/references/Homo_sapiens_assembly18/v0/Homo_sapiens_assembly18.fasta \
   -B:input,VCF path/to/input/snpCalls.vcf \
   -B:input2,VCF path/to/another/input/snpCalls.vcf \
   -B:truthSet,VCF path/to/truthSet.vcf \
   -l INFO \
   -output analyzeAnnotations/ \
   -resources resources/ \
   --min_variants_per_bin 1000 \
   --max_variants_per_bin 20000 \
   -name DP,Depth \
   -name AB,AlleleBalance \
   -T AnalyzeAnnotations

The tool accepts any number of input call sets and any number of truth sets. When passing in a truth set with the -B command line argument, make sure that the name the ROD is bound to begins with "truth", as "truthSet" does in the example command above.
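As a rough illustration of the naming rule, the following sketch builds the -B arguments for two truth sets and refuses any binding name that does not begin with "truth". The names truthHapMap and truthChip and their file paths are hypothetical placeholders, not files shipped with the tool.

```shell
#!/bin/sh
# Hypothetical truth-set names and files; only the "truth" prefix comes from the tool's rule.
TRUTH_ARGS=""
for binding in truthHapMap:path/to/hapmap.vcf truthChip:path/to/chip.vcf; do
    name="${binding%%:*}"   # text before the first colon
    file="${binding#*:}"    # text after the first colon
    case "$name" in
        truth*) TRUTH_ARGS="$TRUTH_ARGS -B:$name,VCF $file" ;;
        *) echo "ROD name '$name' does not begin with 'truth'" >&2; exit 1 ;;
    esac
done

# Print the arguments so they can be pasted into the GATK command line.
echo "$TRUTH_ARGS"
```

The printed arguments can be added to the example command in place of the single -B:truthSet,VCF line.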

For those running this tool outside the Broad, note that both the -Rscript and -resources options must be changed from their defaults. -Rscript needs to point to your installation of R, while -resources needs to point to the folder holding the R scripts that the tool uses.
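A minimal sketch of an external invocation follows. It assumes that R is installed at /usr/bin/Rscript and that the R scripts live in /path/to/gatk/R/. Both locations, and the reference path, are placeholders to replace with the paths on your own system. The command is only printed, so it can be checked before it is run.

```shell
#!/bin/sh
# Assumed locations; override by setting RSCRIPT and RESOURCES in the environment.
RSCRIPT="${RSCRIPT:-/usr/bin/Rscript}"        # hypothetical path to the R installation
RESOURCES="${RESOURCES:-/path/to/gatk/R/}"    # hypothetical folder holding the R scripts

# Build the command as a string so it can be inspected before running.
CMD="java -Xmx4g -jar GenomeAnalysisTK.jar \
   -R path/to/reference.fasta \
   -B:input,VCF path/to/input/snpCalls.vcf \
   -output analyzeAnnotations/ \
   -Rscript $RSCRIPT \
   -resources $RESOURCES \
   -name DP,Depth \
   -T AnalyzeAnnotations"

echo "$CMD"
```

Once the printed command looks right, it can be run by pasting it into the shell.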

Example annotation plots
