Error in VariantsToVCF
Posted in Ask the GATK team | Last updated on



Hi

I am trying to convert SNPs from the UCSC format to VCF format. I downloaded dbSNP128.txt.gz from UCSC (http://hgdownload.cse.ucsc.edu/goldenPath/mm9/database/snp128.txt.gz). The command I used is:

java -jar GenomeAnalysisTK.jar -R mm9.fa -T VariantsToVCF --variant:OLDDBSNP dbSNP128.txt -o dbsnp128.vcf

Error message:

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.IllegalArgumentException: Duplicate allele added to VariantContext: G
    at org.broadinstitute.sting.utils.variantcontext.VariantContext.makeAlleles(VariantContext.java:1307)
    at org.broadinstitute.sting.utils.variantcontext.VariantContext.<init>(VariantContext.java:290)
    at org.broadinstitute.sting.utils.variantcontext.VariantContextBuilder.make(VariantContextBuilder.java:495)
    at org.broadinstitute.sting.gatk.refdata.VariantContextAdaptors$DBSnpAdaptor.convert(VariantContextAdaptors.java:206)
    at org.broadinstitute.sting.gatk.refdata.VariantContextAdaptors.toVariantContext(VariantContextAdaptors.java:64)
    at org.broadinstitute.sting.gatk.walkers.variantutils.VariantsToVCF.getVariantContexts(VariantsToVCF.java:177)
    at org.broadinstitute.sting.gatk.walkers.variantutils.VariantsToVCF.map(VariantsToVCF.java:123)
    at org.broadinstitute.sting.gatk.walkers.variantutils.VariantsToVCF.map(VariantsToVCF.java:83)
    at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:243)
    at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:231)
    at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:248)
    at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:219)
    at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:120)
    at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:67)
    at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:23)
    at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:74)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:281)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:237)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:147)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:94)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 2.3-4-g57ea19f):
ERROR
ERROR Please visit the wiki to see if this is a known problem
ERROR If not, please post the error, with stack trace, to the GATK forum
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: Duplicate allele added to VariantContext: G
ERROR ------------------------------------------------------------------------------------------
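
Since the message names a repeated allele (G), one thing that might be worth checking is whether any row of the UCSC table lists the same allele twice in its observed-alleles field. The following is only a rough diagnostic sketch, not a confirmed explanation of the error. It assumes the table is tab-separated with chrom, chromStart, name, and observed in 0-based columns 1, 2, 4, and 9; those positions and the function name find_duplicate_alleles are my own assumptions and should be checked against the actual file.

```python
# Assumed 0-based column positions in the UCSC snp128 table dump;
# verify these against the real file before relying on the output.
CHROM, START, NAME, OBSERVED = 1, 2, 4, 9


def find_duplicate_alleles(lines):
    """Yield (chrom, start, name, observed) for rows whose observed
    alleles (split on '/') contain the same allele more than once,
    compared case-insensitively."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= OBSERVED:
            continue  # skip short or malformed rows
        alleles = [a.upper() for a in fields[OBSERVED].split("/")]
        if len(set(alleles)) != len(alleles):
            yield fields[CHROM], int(fields[START]), fields[NAME], fields[OBSERVED]


if __name__ == "__main__":
    import sys

    # Usage: python find_dups.py dbSNP128.txt
    with open(sys.argv[1]) as handle:
        for hit in find_duplicate_alleles(handle):
            print(*hit, sep="\t")
```

Rows reported by this scan would be candidates to inspect or filter out before rerunning the conversion. An empty result would suggest the duplicate arises somewhere else, for example in how the reference and observed alleles are combined.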

According to this post (http://gatkforums.broadinstitute.org/discussion/1275/error-in-haplotype-caller), the "Duplicate allele added to VariantContext" error may be caused by lowercase bases in the reference. I converted all of my reference sequences to uppercase, but GATK still reports the same error message.
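
To rule out leftover lowercase bases, the reference can be double-checked with a short script. This is a minimal sketch that assumes a plain, uncompressed FASTA file; the function name count_lowercase_bases is hypothetical. It counts lowercase characters per sequence, so any nonzero count means the uppercase conversion did not cover that sequence.

```python
def count_lowercase_bases(lines):
    """Return {sequence_name: number_of_lowercase_characters} for FASTA lines."""
    counts = {}
    name = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            counts[name] = 0
        elif name is not None:
            counts[name] += sum(1 for c in line if c.islower())
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python check_case.py mm9.fa
    with open(sys.argv[1]) as handle:
        for seq, n in count_lowercase_bases(handle).items():
            print(seq, n, sep="\t")
```

If every count is zero, the reference itself is probably not the source of the duplicate allele, and the input SNP table becomes the more likely place to look.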

Thanks in advance.

