UnifiedGenotyper produces an empty VCF file
Posted in Ask the GATK team | Last updated on 2012-11-27 14:42:45



Hi, I use bowtie for the mapping, samtools to convert the SAM file to BAM, Picard to add the read group (RG) information, and the GATK local indel realignment tool. I then run the GATK UnifiedGenotyper to call SNPs:

java -Xmx4g -jar ~/fly/GenomeAnalysisTK-2.2-3/GenomeAnalysisTK.jar -T UnifiedGenotyper -I read_target.bam --dbsnp ../human_chr.vcf -R ~/fly/GenomeAnalysisTK-2.2-3/ucsc.hg19.fasta -o read.vcf -stand_call_conf 50.0 -stand_emit_conf 10.0 -dcov 1000

Every step ran without errors, but the VCF file I got contains only the header and no variant records:

##fileformat=VCFv4.1
##FILTER=<ID=LowQual,Description="Low quality">
##FORMAT=<ID=AD,Number=.,Type=Integer,Description="Allelic depths for the ref and alt alleles in the order listed">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth (reads with MQ=255 or with bad mates are filtered)">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=PL,Number=G,Type=Integer,Description="Normalized, Phred-scaled likelihoods for genotypes as defined in the VCF specification">
##INFO=<ID=AC,Number=A,Type=Integer,Description="Allele count in genotypes, for each ALT allele, in the same order as listed">
##INFO=<ID=AF,Number=A,Type=Float,Description="Allele Frequency, for each ALT allele, in the same order as listed">
##INFO=<ID=AN,Number=1,Type=Integer,Description="Total number of alleles in called genotypes">
##INFO=<ID=BaseQRankSum,Number=1,Type=Float,Description="Z-score from Wilcoxon rank sum test of Alt Vs. Ref base qualities">
##INFO=<ID=DB,Number=0,Type=Flag,Description="dbSNP Membership">
...
action_per_sample=0.25 indel_heterozygosity=1.25E-4 indelGapContinuationPenalty=10 indelGapOpenPenalty=45 indelHaplotypeSize=80 noBandedIndel=false indelDebug=false ignoreSNPAlleles=false allReadsSP=false ignoreLaneInfo=false reference_sample_calls=(RodBinding name= source=UNBOUND) reference_sample_name=null sample_ploidy=2 min_quality_score=1 max_quality_score=40 site_quality_prior=20 min_power_threshold_for_calling=0.95 min_reference_depth=100 exclude_filtered_reference_sites=false heterozygosity=0.001 genotyping_mode=DISCOVERY output_mode=EMIT_VARIANTS_ONLY standard_min_confidence_threshold_for_calling=50.0 standard_min_confidence_threshold_for_emitting=10.0 alleles=(RodBinding name= source=UNBOUND) max_alternate_alleles=3 dbsnp=(RodBinding name=dbsnp source=exampleSNP.vcf) comp=[] out=org.broadinstitute.sting.gatk.io.stubs.VariantContextWriterStub no_cmdline_in_header=org.broadinstitute.sting.gatk.io.stubs.VariantContextWriterStub sites_only=org.broadinstitute.sting.gatk.io.stubs.VariantContextWriterStub bcf=org.broadinstitute.sting.gatk.io.stubs.VariantContextWriterStub debug_file=null metrics_file=null annotation=[] excludeAnnotation=[] filter_mismatching_base_and_quals=false"
##contig=<ID=chr1,length=100000>
##reference=file:///home/fly//GenomeAnalysisTK-2.2-3/ucsc.hg19.fasta
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  read.bam

I don't know why this happens, but a result with no SNPs at all cannot be right. With the samtools pileup method I get about 100,000 SNPs from the same data.
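For reference, here is a minimal sketch of how the number of variant records in a VCF file can be counted, which is how an "empty" result can be confirmed and compared against another caller's output. The function name count_vcf_records is hypothetical (not part of GATK or samtools); it only assumes a plain-text, uncompressed VCF in which header lines start with "#".

```shell
#!/bin/sh
# Hypothetical helper (not part of GATK or samtools): count the variant
# records in an uncompressed VCF file. Header lines start with "#" and are
# skipped; blank lines are ignored as well.
count_vcf_records() {
    # "n + 0" makes awk print 0 instead of an empty string when no
    # record lines were seen.
    awk '!/^#/ && NF { n++ } END { print n + 0 }' "$1"
}

# Example usage (read.vcf is the UnifiedGenotyper output from the command above):
#   count_vcf_records read.vcf
```

A count of 0 means the file holds only header lines, as in the output shown above.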

Hope to get your answer, thanks.

