Does UnifiedGenotyper use the first rs it finds at a given position ?
Posted in Ask the team | Last updated on


Comments (1)

Hi the GATK team;

I use the UnifiedGenotyper the following way:

java -jar GenomeAnalysisTK-2.1-13-g1706365/GenomeAnalysisTK.jar \
        -R /human_g1k_v37.fasta \
        -T UnifiedGenotyper \
        -glm BOTH \
        -S SILENT \
         -L ../align/capture.bed  \
         -I  myl.bam  \
        --dbsnp broadinstitute.org/bundle/1.5/b37/dbsnp_135.b37.vcf.gz \
        -o output.vcf 

When I look at the generated VCF , the variation 18:55997929 (CTTCT/C) is said to be rs4149608

18 55997929 rs4149608 CTTCT C (...)

but in the dbsnp_135.b37.vcf.gz, you can see that the right rs## should be rs144384654

$ gunzip -c broadinstitute.org/bundle/1.5/b37/dbsnp_135.b37.vcf.gz |grep -E -w '(rs4149608|rs144384654)'
18 55997929 rs4149608 CT C,CTTCT (...)
18 55997929 rs144384654 CTTCT C (...)

does UnifiedGenotyper uses the first rs## it finds at a given position ? Or should I use another method/tool to get the 'right' rs## ?

Thank you,

Pierre


Return to top Comment on this article in the forum