When I run CombineVariants on two VCF files that have variants at the same position but with different REF alleles and different sets of ALT alleles, the AD values for the genotypes are not updated to reflect the changes in the ALT field. The REF and ALT fields, and the GT field for each genotype, are all updated correctly. For example, combining
3 10128965 rs71052293 CTT CT,C,CTTT 19936.43 PASS AC=1,1,1;AF=0.25,0.25,0.25;AN=4 GT:AD:DP:GQ:PL 0/2:115,0,33,12:230:6.96:980,1237,2795,0,946,1900,7,679,467,817 3/1:97,13,20,16:229:99:804,221,832,581,176,3047,521,0,1653,1595
and
3 10128965 rs71052293 CT C,CTT,CTTT 14280.61 PASS AC=1,1,1;AF=0.25,0.25,0.25;AN=4 GT:AD:DP:GQ:PL 2/1:110,20,33,18:237:1.90:850,289,1027,457,0,1487,147,877,2,1858 0/3:80,48,5,29:209:99:1835,875,977,2101,1119,3322,0,142,331,462
gives
3 10128965 rs71052293 CTT CT,C,CTTT,CTTTT 19936.43 PASS AC=2,1,2,1;AF=0.250,0.125,0.250,0.125;AN=8;set=Intersection GT:AD:DP:GQ 0/2:115,0,33,12:230:7 3/1:97,13,20,16:229:99 3/1:110,20,33,18:237:2 0/4:80,48,5,29:209:99
There are five alleles (one REF and four ALT) but only four AD values for each genotype.
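To make the expectation concrete, here is a minimal Python sketch of the remapping I would have expected for AD. It is only an illustration, not how GATK implements the merge: the helper names `extend_allele` and `remap_ad` are hypothetical, it assumes the merged REF simply extends the original REF (as CTT extends CT here), and filling alleles absent from the source file with 0 is my assumption (a missing value might be more appropriate).

```python
def extend_allele(allele, old_ref, new_ref):
    """Re-express an allele against a longer REF by appending the extra REF suffix."""
    if not new_ref.startswith(old_ref):
        raise ValueError("new REF must start with the old REF")
    return allele + new_ref[len(old_ref):]


def remap_ad(ad, old_ref, old_alts, new_ref, new_alts, fill=0):
    """Reorder per-allele depths to match the merged allele list.

    Alleles that did not exist in the source record get `fill`
    (assumption: 0; the source file has no depth for them).
    """
    old_alleles = [extend_allele(a, old_ref, new_ref) for a in [old_ref] + old_alts]
    depth_by_allele = dict(zip(old_alleles, ad))
    return [depth_by_allele.get(a, fill) for a in [new_ref] + new_alts]


merged_ref = "CTT"
merged_alts = ["CT", "C", "CTTT", "CTTTT"]

# Third genotype in the combined record (first sample of the second input).
print(remap_ad([110, 20, 33, 18], "CT", ["C", "CTT", "CTTT"], merged_ref, merged_alts))
# [110, 20, 0, 33, 18]

# First genotype in the combined record (first sample of the first input).
print(remap_ad([115, 0, 33, 12], "CTT", ["CT", "C", "CTTT"], merged_ref, merged_alts))
# [115, 0, 33, 12, 0]
```

With a remapping like this, each genotype would carry five AD values, in the same order as the merged REF and ALT alleles.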
My command line:
java -jar -Xmx4g GenomeAnalysisTK.jar -T CombineVariants -R human_g1k_v37.fasta -V test_input1.vcf -V test_input2.vcf -o test_combined.vcf
Is this a known limitation or a bug?