Tagged with #variantstovcf
1 documentation article | 0 announcements | 7 forum discussions


Comments (19)

A new tool has been released!

Check out the documentation at VariantsToVCF.

Comments (2)

I’d like to convert a hapmap file to vcf. The hapmap file is from http://hapmap.ncbi.nlm.nih.gov/downloads/genotypes/latest/forward/non-redundant/genotypes_chr1_ASW_r27_nr.b36_fwd.txt.gz

I have a few questions about the following command from http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_walkers_variantutils_VariantsToVCF.html#--dbsnp:

java -Xmx2g -jar GenomeAnalysisTK.jar \
   -R ref.fasta \
   -T VariantsToVCF \
   -o output.vcf \
   --variant:RawHapMap input.hapmap \
   --dbsnp dbsnp.vcf
  1. Since the HapMap file is on reference build b36, should ref.fasta be b36 as well? b37 is everywhere, but the only b36 I can find is b36.3 at ftp://igenome:G3nom3s4u@ussd-ftp.illumina.com/Homo_sapiens/NCBI/build36.3/Homo_sapiens_NCBI_build36.3.tar.gz. Is this OK?
  2. What is the purpose of “--dbsnp” here? Should the dbSNP file be built on b36 as well?
  3. How do I use the codec at http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_utils_codecs_hapmap_RawHapMapCodec.html? Is it already built into VariantsToVCF, or do I have to download a codec file somewhere?

Thanks,
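
For question 1, one way to check which build the HapMap file itself declares is to tally the assembly column before choosing a reference. This is only a minimal sketch: it assumes the HapMap genotype layout in which the sixth whitespace-separated column holds the assembly (for example `ncbi_b36`). The function name `hapmap_builds` and the `build_col` default are hypothetical and not part of GATK.

```python
import gzip
from collections import Counter


def hapmap_builds(path, build_col=5):
    """Count the values in the (assumed) genome-build column of a HapMap genotype file."""
    # Assumption: column index 5 (sixth column) is the assembly; adjust if your file differs.
    opener = gzip.open if path.endswith(".gz") else open
    counts = Counter()
    with opener(path, "rt") as handle:
        next(handle, None)  # skip the header row
        for line in handle:
            fields = line.split()
            if len(fields) > build_col:
                counts[fields[build_col]] += 1
    return counts
```

If the tally shows a single b36 value, that at least confirms the coordinates in the file are on b36, which is the build the reference FASTA and any dbSNP file would need to match.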

Comments (5)

Hi,

I have annotated my VCF file of 20 samples, called with UnifiedGenotyper, using the following steps:

UnifiedGenotyper -> VariantRecalibrator -> ApplyRecalibration -> VariantAnnotator

My question is: how should I proceed if I want to select rare variants (MAF < 1%) in my candidate genes for each of these 20 samples?
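
A rough sketch of the frequency part of this filter, in plain Python rather than a GATK tool. It assumes each record carries an allele-frequency value in the INFO column under a key you supply (`AF` by default). With 20 diploid samples the smallest non-zero frequency within the callset is 1/40 = 2.5%, so a 1% cutoff only makes sense against a population frequency annotation. The function names here are hypothetical, and restricting to candidate genes is not covered.

```python
def parse_info(info_field):
    """Turn a VCF INFO string into a dict; flag-style entries map to True."""
    info = {}
    for item in info_field.split(";"):
        if not item or item == ".":
            continue
        key, sep, value = item.partition("=")
        info[key] = value if sep else True
    return info


def is_rare(vcf_line, max_af=0.01, af_key="AF"):
    """True if every ALT allele has a frequency below max_af under af_key."""
    fields = vcf_line.rstrip("\n").split("\t")
    value = parse_info(fields[7]).get(af_key)
    if value is None or value is True:
        return False  # no usable frequency annotation: do not call it rare
    try:
        freqs = [float(x) for x in value.split(",")]
    except ValueError:
        return False
    return all(f < max_af for f in freqs)


def rare_records(lines, max_af=0.01, af_key="AF"):
    """Yield header lines unchanged, plus records that pass the frequency cutoff."""
    for line in lines:
        if line.startswith("#") or is_rare(line, max_af, af_key):
            yield line
```

Records without the annotation are dropped rather than kept, which is a deliberate but debatable choice: novel variants absent from the population resource would need separate handling.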

Comments (6)

Hi,

I have SNPs from another SNP caller that does not produce VCF files, so I wanted to convert them using the VariantsToVCF walker. I used the following command:

java -Xmx2g -jar /PATH/TO/GenomeAnalysisTK.jar -T VariantsToVCF -R reference.fa -o mySNPs.vcf --variant mySNPs.bed

GATK version 2.7-4

I used a BED file as input, in the following format:

chr1    895 896
chr1    941 942
chr1    1096    1097
...

The walker runs without any error message, but I get an empty VCF file (0 B, nothing), along with a mySNPs.vcf.idx and a mySNPs.bed.idx file.

Any idea how to fix that?

Thanks a lot!!!

Best wishes, Karin
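
A three-column BED file like the one above carries positions but no alleles, so the most that can be derived from it directly is a list of sites. The sketch below is plain Python and is not what VariantsToVCF does. It assumes the intervals are standard BED (0-based start, end exclusive) and converts single-base intervals to 1-based VCF positions. The function names and the placeholder `N` reference allele are hypothetical.

```python
def bed_to_vcf_sites(bed_lines):
    """Yield (chrom, pos) with 1-based positions for single-base BED intervals."""
    for line in bed_lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        if len(fields) < 3:
            continue  # not a BED record (for example a literal "..." line)
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        if end - start != 1:
            raise ValueError(f"not a single-base interval: {line}")
        yield chrom, start + 1  # BED start is 0-based, VCF POS is 1-based


def format_site(chrom, pos):
    """Format a sites-only VCF data line; REF is a placeholder 'N', ALT is missing."""
    return "\t".join([chrom, str(pos), ".", "N", ".", ".", ".", "."])
```

For the first row shown above (`chr1 895 896`) this yields position 896 on chr1. Real REF and ALT alleles would still have to come from the original caller's output and the reference sequence.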

Comments (1)

I used the following command to convert the human snp138.txt file, downloaded from http://hgdownload.soe.ucsc.edu/goldenPath/hg19/database/, to VCF format:

java -jar GenomeAnalysisTK.jar -T VariantsToVCF -V:OLDDBSNP snp138.txt -R genome.fa -o snp138.vcf

The output file snp138.vcf was generated, but its size is 0. Two other files, snp138.txt.idx and snp138.vcf.idx, were also generated. How can I get a non-empty snp138.vcf?

Comments (1)

I am trying to format files for input into the BaseRecalibrator and VariantsToVCF tools. Many of the links for file formats listed under the 'variant' option do not work (http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_walkers_variantutils_VariantsToVCF.html#--variant). I get the following message:

Not Found

The requested URL /gatk/gatkdocs/org_broad_tribble_bed_BEDCodec.html was not found on this server.

Do you know what the problem is?

Comments (3)

java -Xmx2g -jar /share/ClusterShare/software/contrib/gi/gatk/2.6-4-g3e5ff60/GenomeAnalysisTK-2.6-4-g3e5ff60/GenomeAnalysisTK.jar \
   -R /share/ClusterShare/biodata/contrib/gi/gatk-resource-bundle/2.3/hg19/ucsc.hg19.fasta \
   -T VariantsToVCF \
   -o bedtovcf.vcf \
   --variant:BED single.bed \
   --dbsnp /share/ClusterShare/biodata/contrib/gi/gatk-resource-bundle/2.3/hg19/dbsnp_137.hg19.vcf \
   -L chr1

single.bed

chr1 10153 10154 1

The output file is empty.

Comments (7)

Hi

I am trying to convert SNPs in UCSC format to VCF format. I downloaded dbSNP128.txt.gz from UCSC (http://hgdownload.cse.ucsc.edu/goldenPath/mm9/database/snp128.txt.gz). The command I used is:

java -jar GenomeAnalysisTK.jar -R mm9.fa -T VariantsToVCF --variant:OLDDBSNP dbSNP128.txt -o dbsnp128.vcf

Error message:

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.IllegalArgumentException: Duplicate allele added to VariantContext: G
    at org.broadinstitute.sting.utils.variantcontext.VariantContext.makeAlleles(VariantContext.java:1307)
    at org.broadinstitute.sting.utils.variantcontext.VariantContext.<init>(VariantContext.java:290)
    at org.broadinstitute.sting.utils.variantcontext.VariantContextBuilder.make(VariantContextBuilder.java:495)
    at org.broadinstitute.sting.gatk.refdata.VariantContextAdaptors$DBSnpAdaptor.convert(VariantContextAdaptors.java:206)
    at org.broadinstitute.sting.gatk.refdata.VariantContextAdaptors.toVariantContext(VariantContextAdaptors.java:64)
    at org.broadinstitute.sting.gatk.walkers.variantutils.VariantsToVCF.getVariantContexts(VariantsToVCF.java:177)
    at org.broadinstitute.sting.gatk.walkers.variantutils.VariantsToVCF.map(VariantsToVCF.java:123)
    at org.broadinstitute.sting.gatk.walkers.variantutils.VariantsToVCF.map(VariantsToVCF.java:83)
    at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:243)
    at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:231)
    at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:248)
    at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:219)
    at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:120)
    at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:67)
    at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:23)
    at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:74)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:281)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:237)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:147)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:94)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 2.3-4-g57ea19f):
ERROR
ERROR Please visit the wiki to see if this is a known problem
ERROR If not, please post the error, with stack trace, to the GATK forum
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: Duplicate allele added to VariantContext: G
ERROR ------------------------------------------------------------------------------------------

According to this post (http://gatkforums.broadinstitute.org/discussion/1275/error-in-haplotype-caller), the "Duplicate allele added to VariantContext" error may be caused by lowercase bases in the reference. I converted all my reference sequences to uppercase, but GATK still reports the same error message.

Thanks in advance.
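
One thing that can be checked independently of GATK is whether any rows in the UCSC table list the same allele more than once. Whether such rows are what triggers the error here is not confirmed; this is only a diagnostic sketch. It assumes the UCSC snp table layout in which the fifth tab-separated column is the rs name and the tenth is the slash-separated `observed` allele list. The function name and column defaults are hypothetical.

```python
from collections import Counter


def rows_with_repeated_alleles(lines, name_col=4, observed_col=9):
    """Return (name, observed, repeated_alleles) for rows whose observed list repeats an allele."""
    # Assumption: tab-separated UCSC snp table with name at index 4 and observed at index 9.
    flagged = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= observed_col:
            continue  # skip short or malformed rows
        alleles = [a.upper() for a in fields[observed_col].split("/")]
        repeated = sorted(a for a, n in Counter(alleles).items() if n > 1)
        if repeated:
            flagged.append((fields[name_col], fields[observed_col], repeated))
    return flagged
```

Running this over the lines of dbSNP128.txt and removing or inspecting any flagged rows before conversion would show whether the input itself contains duplicated alleles. It would not catch a duplicate that arises from the reference allele matching an observed allele after strand handling.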