Hey, GATK team,
I downloaded snpEff and listed the available databases with the command $ java -jar snpEff.jar databases
The output lists databases for all available species, including one for mouse (GRCm38.73).
However, I cannot find any GRCm38 database among the databases that are actually available.
Did I miss something, or is the database still under construction? Where can I download the mouse database? I need it to do functional annotation of my variant VCF file.
In addition, according to the GATK presentation materials, we should run variant filtering/SelectVariants right after annotation. Is that right, or does it not matter which one goes first?
Could you tell me how to encourage GATK to annotate my genotype columns (i.e. add annotations to the FORMAT and PANC_R columns in the following file):
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT PANC_R
chrX 259221 . GA G 136.74 . AC=2;AF=1.00;AN=2;DP=15;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=8.82;MQ0=1;QD=3.04 GT:AD:GQ:PL 1/1:0,2:6:164,6,0
The file was generated with HaplotypeCaller. I used a command line similar to this one to no effect:
java -jar $GATKROOT/GenomeAnalysisTK.jar -T VariantAnnotator -R hg19_random.fa -I chr7_recalibrated.bam -V chr7.vcf --dbsnp dbSNP135_chr.vcf -A Coverage -A QualByDepth -A FisherStrand -A MappingQualityRankSumTest -A ReadPosRankSumTest -o chr7_annotated-again.vcf
Does anyone have any suggestions? Thanks in advance!
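For anyone who wants to check which genotype-level fields a record carries before and after annotation, here is a minimal Python sketch. It only pairs the FORMAT keys with the sample values of the record quoted above. The function name parse_sample_fields is made up for illustration and is not part of GATK, and the sketch assumes whitespace-separated columns with FORMAT in the ninth column.

```python
def parse_sample_fields(record_line, sample_index=0):
    """Pair the FORMAT keys of one VCF data line with one sample's values.

    Hypothetical helper for illustration; not a GATK function.
    """
    fields = record_line.split()
    # Columns 0-7 are the fixed VCF columns (CHROM..INFO); FORMAT is
    # column 8 and the per-sample columns follow it.
    format_keys = fields[8].split(":")
    sample_values = fields[9 + sample_index].split(":")
    return dict(zip(format_keys, sample_values))


# The record from the post above (sample column PANC_R).
record = (
    "chrX 259221 . GA G 136.74 . "
    "AC=2;AF=1.00;AN=2;DP=15;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=8.82;MQ0=1;QD=3.04 "
    "GT:AD:GQ:PL 1/1:0,2:6:164,6,0"
)

genotype = parse_sample_fields(record)
print(genotype)
# prints {'GT': '1/1', 'AD': '0,2', 'GQ': '6', 'PL': '164,6,0'}
```

Running the same helper on the output VCF and comparing the two dictionaries shows whether any key was added to the genotype column.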
I have a set of VCFs with identical positions in them:
VCF1: 1 10097 . T . 26 . AN=196;DP=1622;MQ=20.06;MQ0=456 GT:DP
VCF2: 1 10097 . T . 21.34 . AN=198;DP=2338;MQ=19.53;MQ0=633 GT:DP
VCF3: 1 10097 . T . 11.70 . AN=240;DP=3957;MQ=19.74;MQ0=1085 GT:DP
VCF4: 1 10097 . T . 15.56 . AN=134;DP=1348;MQ=18.22;MQ0=442 GT:DP
If I use all of them as input for VariantRecalibrator, which annotations will VariantRecalibrator use? Should I instead merge the VCFs with CombineVariants and run VariantAnnotator, before I run VariantRecalibrator?
I'm not sure whether this forum is for technical questions only or whether questions about best practices are allowed as well. Feel free to delete my question if it doesn't belong here. Thank you.
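To see which INFO annotations the four records at position 10097 have in common, a small Python sketch like the following may help. It says nothing about what VariantRecalibrator does internally; it only compares the INFO keys of the lines quoted above. The helper name info_keys is hypothetical.

```python
def info_keys(record_line):
    """Return the set of INFO keys of one VCF data line (hypothetical helper)."""
    info = record_line.split()[7]  # INFO is the eighth fixed VCF column
    return {entry.split("=", 1)[0] for entry in info.split(";")}


# The four records quoted in the post above.
records = {
    "VCF1": "1 10097 . T . 26 . AN=196;DP=1622;MQ=20.06;MQ0=456 GT:DP",
    "VCF2": "1 10097 . T . 21.34 . AN=198;DP=2338;MQ=19.53;MQ0=633 GT:DP",
    "VCF3": "1 10097 . T . 11.70 . AN=240;DP=3957;MQ=19.74;MQ0=1085 GT:DP",
    "VCF4": "1 10097 . T . 15.56 . AN=134;DP=1348;MQ=18.22;MQ0=442 GT:DP",
}

keys_per_file = {name: info_keys(line) for name, line in records.items()}
shared = set.intersection(*keys_per_file.values())
print(sorted(shared))
# prints ['AN', 'DP', 'MQ', 'MQ0']
```

All four records carry the same INFO keys but different values, so the open question is which values would be used, not which keys are available.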
I'm attempting to use VariantAnnotator to annotate some VCFs produced by samtools so that I can run VQSR on them. Unfortunately, I've gotten stuck: I'm trying to figure out why VariantAnnotator isn't annotating indels with MappingQualityRankSumTest and ReadPosRankSumTest, although it seems to annotate SNPs fine. Both homozygous and heterozygous calls are present in the sample. Could it be that I need to left-align the indels to get enough coverage? What would you suggest is the best way to debug this? Is there a way to make GATK report more verbosely why it's refusing an annotation?
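One way to narrow this down outside of GATK is to count, per variant type, how many records lack the rank-sum annotations. The Python sketch below does that, assuming the two annotations are written to INFO as MQRankSum and ReadPosRankSum. The function names and the three example records are hypothetical placeholders, not data from the post. In practice you would pass the lines of the annotated VCF instead.

```python
from collections import Counter

# INFO keys assumed to be written by MappingQualityRankSumTest and
# ReadPosRankSumTest.
RANK_SUM_KEYS = ("MQRankSum", "ReadPosRankSum")


def variant_type(ref, alt):
    """Crude classification by allele length (illustrative only)."""
    if alt == ".":
        return "NO_ALT"
    if len(ref) == 1 and all(len(a) == 1 for a in alt.split(",")):
        return "SNP"
    return "INDEL"


def missing_annotation_counts(lines, keys=RANK_SUM_KEYS):
    """Count records that lack each INFO key, grouped by variant type."""
    counts = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.split()
        ref, alt, info = fields[3], fields[4], fields[7]
        present = {entry.split("=", 1)[0] for entry in info.split(";")}
        kind = variant_type(ref, alt)
        for key in keys:
            if key not in present:
                counts[(kind, key)] += 1
    return counts


# Hypothetical example records, made up for this sketch.
example_lines = [
    "#CHROM POS ID REF ALT QUAL FILTER INFO",
    "1 100 . A G 50 . DP=20;MQRankSum=0.5;ReadPosRankSum=-0.1",
    "1 200 . AT A 40 . DP=18",
    "1 300 . C CTT 35 . DP=22;MQRankSum=1.2",
]

counts = missing_annotation_counts(example_lines)
for (kind, key), n in sorted(counts.items()):
    print(kind, key, n)
```

If the counts show that only indels are affected, and only at homozygous sites, that would point at the annotation needing reads for both alleles rather than at alignment or coverage. This is a guess to test, not a confirmed explanation.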