I have BAM files downloaded from the 1000 Genomes Project, and I need FASTA files to use as the reference (for the -R option) in order to turn my BAMs into VCF files. Can you please tell me where to find the FASTA files for my data? I tried using samtools and Picard, but I don't really know where to get files to convert to FASTA.
I’d like to convert a hapmap file to vcf. The hapmap file is from http://hapmap.ncbi.nlm.nih.gov/downloads/genotypes/latest/forward/non-redundant/genotypes_chr1_ASW_r27_nr.b36_fwd.txt.gz
A few questions about the following command at http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_walkers_variantutils_VariantsToVCF.html#--dbsnp
java -Xmx2g -jar GenomeAnalysisTK.jar \
  -R ref.fasta \
  -T VariantsToVCF \
  -o output.vcf \
  --variant:RawHapMap input.hapmap \
  --dbsnp dbsnp.vcf
I have annotated my VCF file of 20 samples from UnifiedGenotyper using the following steps.
My question is: how should I proceed if I have to select rare variants (MAF < 1%) in my candidate genes for each of these 20 samples?
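For the rare-variant selection asked about above, here is a minimal Python sketch. It assumes each record carries a population allele frequency in its INFO column. With 20 diploid samples (40 chromosomes), the smallest non-zero frequency observable within the cohort itself is 1/40 = 2.5%, so a 1% cutoff only makes sense against an external population frequency. The INFO key name (`AF` here), the function names, and the candidate-gene region tuples are assumptions for illustration; the real key depends on the annotation tool used.

```python
def minor_allele_frequency(info_field, af_key="AF"):
    """Return the minor allele frequency from a VCF INFO string.

    Returns None when the key is absent, multi-allelic, or not numeric.
    af_key is a hypothetical name; use whatever key your annotation wrote.
    """
    prefix = af_key + "="
    for entry in info_field.split(";"):
        if entry.startswith(prefix):
            value = entry[len(prefix):]
            if "," in value:  # multi-allelic site: skipped in this sketch
                return None
            try:
                af = float(value)
            except ValueError:
                return None
            return min(af, 1.0 - af)
    return None


def select_rare(vcf_lines, regions, max_maf=0.01, af_key="AF"):
    """Yield VCF data lines with MAF < max_maf that fall inside a region.

    regions is a list of (chrom, start, end) tuples, 1-based and inclusive,
    standing in for the candidate gene coordinates.
    """
    for line in vcf_lines:
        if line.startswith("#"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, info = fields[0], int(fields[1]), fields[7]
        maf = minor_allele_frequency(info, af_key)
        if maf is None or maf >= max_maf:
            continue
        if any(c == chrom and s <= pos <= e for c, s, e in regions):
            yield line
```

Each kept line still contains all 20 genotype columns, so per-sample carriers can be read from the same record afterwards.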
I have SNPs from another SNP caller that does not produce VCF files, so I wanted to convert them using the VariantsToVCF walker. I used the following command:
java -Xmx2g -jar /PATH/TO/GenomeAnalysisTK.jar -T VariantsToVCF -R reference.fa -o mySNPs.vcf --variant mySNPs.bed
GATK version 2.7-4
I used a BED file as input, in the following format:
chr1 895 896
chr1 941 942
chr1 1096 1097
...
The walker runs without any error message, but I get an empty VCF file (0 B, nothing in it), along with a mySNPs.vcf.idx and a mySNPs.bed.idx file.
Any idea how to fix that?
Thanks a lot!!!
Best wishes, Karin
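One thing worth keeping in mind when checking any converted output: BED intervals are 0-based and half-open, while VCF positions are 1-based, so the BED line `chr1 895 896` describes the single base at VCF position 896. The Python sketch below only illustrates that coordinate shift; the function name is hypothetical, and it does not attempt to explain why the walker wrote an empty file.

```python
def bed_to_vcf_positions(bed_lines):
    """Yield (chrom, pos) pairs, with pos 1-based, for every base covered
    by each BED interval (BED start is 0-based, end is exclusive)."""
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and the optional BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        for pos in range(start + 1, end + 1):
            yield chrom, pos
```

For the three example intervals in the post, this gives positions 896, 942 and 1097 on chr1.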
I used the following command to convert the human snp138.txt file, downloaded from http://hgdownload.soe.ucsc.edu/goldenPath/hg19/database/, to VCF format:
java -jar GenomeAnalysisTK.jar -T VariantsToVCF -V:OLDDBSNP snp138.txt -R genome.fa -o snp138.vcf
The output file snp138.vcf was generated, but its size is 0. Two other files, snp138.txt.idx and snp138.vcf.idx, were also generated. How can I get the snp138.vcf?
I am trying to format files for input into the BaseRecalibrator and VariantsToVCF tools. Many of the file format links listed under the 'variant' option do not work (http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_walkers_variantutils_VariantsToVCF.html#--variant). I get the following message: **Not Found
The requested URL /gatk/gatkdocs/org_broad_tribble_bed_BEDCodec.html was not found on this server.**
Do you know what the problem is?
java -Xmx2g -jar /share/ClusterShare/software/contrib/gi/gatk/2.6-4-g3e5ff60/GenomeAnalysisTK-2.6-4-g3e5ff60/GenomeAnalysisTK.jar -R /share/ClusterShare/biodata/contrib/gi/gatk-resource-bundle/2.3/hg19/ucsc.hg19.fasta -T VariantsToVCF -o bedtovcf.vcf --variant:BED single.bed --dbsnp /share/ClusterShare/biodata/contrib/gi/gatk-resource-bundle/2.3/hg19/dbsnp_137.hg19.vcf -L chr1
chr1 10153 10154 1
The output is empty.
I am trying to convert SNPs in UCSC format to VCF format. I downloaded dbSNP128.txt.gz from UCSC (http://hgdownload.cse.ucsc.edu/goldenPath/mm9/database/snp128.txt.gz). The command I used is:
java -jar GenomeAnalysisTK.jar -R mm9.fa -T VariantsToVCF --variant:OLDDBSNP dbSNP128.txt -o dbsnp128.vcf
java.lang.IllegalArgumentException: Duplicate allele added to VariantContext: G
According to this post (http://gatkforums.broadinstitute.org/discussion/1275/error-in-haplotype-caller), "Duplicate allele added to VariantContext" error may be caused by lower case bases in the reference. I converted all my reference sequences to upper case letters, but GATK still reports the same error message.
Thanks in advance.
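To double-check that the uppercase conversion mentioned above really reached the file passed to -R, a small Python sketch like the following counts any remaining lowercase bases per sequence. The function name is hypothetical, and this only verifies the reference; it does not rule out other causes of the "Duplicate allele" error.

```python
def count_lowercase_bases(fasta_lines):
    """Return {sequence_name: number_of_lowercase_characters} for a FASTA
    given as an iterable of lines (for example, an open file handle)."""
    counts = {}
    name = None
    for line in fasta_lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # Use the first word of the header as the sequence name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            counts[name] = 0
        elif name is not None:
            counts[name] += sum(1 for ch in line if ch.islower())
    return counts
```

Usage would look like `with open("mm9.fa") as handle: print(count_lowercase_bases(handle))`; any non-zero count means some sequence still contains lowercase bases.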