Tagged with #mappingqualityfilter
1 documentation article | 0 announcements | 5 forum discussions



Created 2012-07-23 23:56:05 | Updated 2012-07-23 23:56:05 | Tags: mappingqualityfilter gatkdocs
Comments (0)

A new tool has been released!

Check out the documentation at MappingQualityFilter.


Created 2014-10-08 07:06:57 | Updated | Tags: unifiedgenotyper mappingqualityfilter
Comments (6)

Hi, I have a VCF that was generated with UnifiedGenotyper using output mode EMIT_ALL_SITES. Several positions in the VCF with ALT as "." have QUAL as ".", which I understand to mean "reference" with unknown quality. However, FILTER for these is set to PASS. I am wondering how this is possible. Does this mean that UnifiedGenotyper did not print a QUAL even though the site scored high enough to PASS?

I am pasting some parts of the vcf below. Any help is appreciated.

##fileformat=VCFv4.0
##FILTER=<ID=LowQual,Description="Low quality">
##FORMAT=<ID=AD,Number=.,Type=Integer,Description="Allelic depths for the ref and alt alleles in the order listed">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read Depth (only filtered reads used for calling)">
##FORMAT=<ID=GQ,Number=1,Type=Float,Description="Genotype Quality">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=PL,Number=3,Type=Float,Description="Normalized, Phred-scaled likelihoods for AA,AB,BB genotypes where A=ref and B=alt; not applicable if site is not biallelic">
##INFO=<ID=AC,Number=.,Type=Integer,Description="Allele count in genotypes, for each ALT allele, in the same order as listed">
##INFO=<ID=AF,Number=.,Type=Float,Description="Allele Frequency, for each ALT allele, in the same order as listed">
##INFO=<ID=AN,Number=1,Type=Integer,Description="Total number of alleles in called genotypes">
##INFO=<ID=BaseQRankSum,Number=1,Type=Float,Description="Z-score from Wilcoxon rank sum test of Alt Vs. Ref base qualities">
##INFO=<ID=DB,Number=0,Type=Flag,Description="dbSNP Membership">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Filtered Depth">
##INFO=<ID=DS,Number=0,Type=Flag,Description="Were any of the samples downsampled?">
##INFO=<ID=Dels,Number=1,Type=Float,Description="Fraction of Reads Containing Spanning Deletions">
##INFO=<ID=FS,Number=1,Type=Float,Description="Phred-scaled p-value using Fisher's exact test to detect strand bias">
##INFO=<ID=HRun,Number=1,Type=Integer,Description="Largest Contiguous Homopolymer Run of Variant Allele In Either Direction">
##INFO=<ID=HaplotypeScore,Number=1,Type=Float,Description="Consistency of the site with at most two segregating haplotypes">
##INFO=<ID=InbreedingCoeff,Number=1,Type=Float,Description="Inbreeding coefficient as estimated from the genotype likelihoods per-sample when compared against the Hardy-Weinberg expectation">
##INFO=<ID=MQ,Number=1,Type=Float,Description="RMS Mapping Quality">
##INFO=<ID=MQ0,Number=1,Type=Integer,Description="Total Mapping Quality Zero Reads">
##INFO=<ID=MQRankSum,Number=1,Type=Float,Description="Z-score From Wilcoxon rank sum test of Alt vs. Ref read mapping qualities">
##INFO=<ID=QD,Number=1,Type=Float,Description="Variant Confidence/Quality by Depth">
##INFO=<ID=ReadPosRankSum,Number=1,Type=Float,Description="Z-score from Wilcoxon rank sum test of Alt vs. Ref read position bias">
##INFO=<ID=SB,Number=1,Type=Float,Description="Strand Bias">
##UnifiedGenotyper="analysis_type=UnifiedGenotyper input_file=[x.bam] sample_metadata=[] read_buffer_size=null phone_home=STANDARD read_filter=[] intervals=[x.bed] excludeIntervals=null reference_sequence=hg19.fasta rodBind=[dbsnp_132.hg19.vcf] rodToIntervalTrackName=null BTI_merge_rule=UNION nonDeterministicRandomSeed=false DBSNP=null downsampling_type=null downsample_to_fraction=null downsample_to_coverage=null baq=CALCULATE_AS_NECESSARY baqGapOpenPenalty=40.0 performanceLog=null useOriginalQualities=false defaultBaseQualities=-1 validation_strictness=SILENT unsafe=null num_threads=1 interval_merging=ALL read_group_black_list=null processingTracker=null restartProcessingTracker=false processingTrackerStatusFile=null processingTrackerID=-1 allow_intervals_with_unindexed_bam=false disable_experimental_low_memory_sharding=false logging_level=INFO log_to_file=null help=false genotype_likelihoods_model=BOTH p_nonref_model=EXACT heterozygosity=0.0010 pcr_error_rate=1.0E-4 genotyping_mode=DISCOVERY output_mode=EMIT_ALL_SITES standard_min_confidence_threshold_for_calling=50.0 standard_min_confidence_threshold_for_emitting=10.0 noSLOD=false assume_single_sample_reads=null abort_at_too_much_coverage=-1 min_base_quality_score=17 min_mapping_quality_score=20 max_deletion_fraction=0.05 min_indel_count_for_genotyping=5 indel_heterozygosity=1.25E-4 indelGapContinuationPenalty=10.0 indelGapOpenPenalty=45.0 indelHaplotypeSize=80 doContextDependentGapPenalties=true getGapPenaltiesFromData=false indel_recal_file=indel.recal_data.csv indelDebug=false dovit=false GSA_PRODUCTION_ONLY=false exactCalculation=LINEAR_EXPERIMENTAL ignoreSNPAlleles=false output_all_callable_bases=false genotype=false out=org.broadinstitute.sting.gatk.io.stubs.VCFWriterStub NO_HEADER=org.broadinstitute.sting.gatk.io.stubs.VCFWriterStub sites_only=org.broadinstitute.sting.gatk.io.stubs.VCFWriterStub debug_file=null metrics_file=null annotation=[DepthOfCoverage, RMSMappingQuality]"
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  YH1
chr1    14468   .       G       .       .       PASS    DP=110;HaplotypeScore=0.0000;MQ=2.10;MQ0=104    GT      ./.
chr1    14469   .       C       .       .       PASS    DP=109;HaplotypeScore=0.0000;MQ=2.11;MQ0=103    GT      ./.
chr1    14470   .       G       .       .       PASS    DP=105;HaplotypeScore=0.0000;MQ=2.15;MQ0=99     GT      ./.
chr1    14471   .       C       .       .       PASS    DP=103;HaplotypeScore=0.0000;MQ=2.17;MQ0=97     GT      ./.
chr1    14472   .       A       .       .       PASS    DP=106;HaplotypeScore=0.0000;MQ=2.14;MQ0=100    GT      ./.
chr1    14473   .       G       .       .       PASS    DP=103;HaplotypeScore=0.0000;MQ=2.17;MQ0=97     GT      ./.
chr1    14474   .       G       .       .       PASS    DP=98;HaplotypeScore=0.0000;MQ=2.23;MQ0=92      GT      ./.
chr1    14553   .       C       .       .       PASS    DP=98;HaplotypeScore=0.0000;MQ=2.33;MQ0=94      GT      ./.
chr1    14554   .       G       .       .       PASS    DP=99;HaplotypeScore=0.0000;MQ=2.32;MQ0=95      GT      ./.
chr1    14555   .       C       .       .       PASS    DP=101;HaplotypeScore=0.0000;MQ=3.17;MQ0=96     GT      ./.
chr1    14556   .       T       .       32.99   LowQual AC=0;AF=0.00;AN=2;DP=101;MQ=3.17;MQ0=96 GT:DP:GQ:PL     0/0:101:3:0,3,27
chr1    14557   .       C       .       32.99   LowQual AC=0;AF=0.00;AN=2;DP=102;MQ=3.15;MQ0=97 GT:DP:GQ:PL     0/0:102:3:0,3,27
....
chr1    14587   .       T       .       35.99   LowQual AC=0;AF=0.00;AN=2;DP=100;MQ=4.41;MQ0=90 GT:DP:GQ:PL     0/0:100:6:0,6,51
chr1    14640   .       C       .       50.96   PASS    AC=0;AF=0.00;AN=2;DP=123;MQ=5.73;MQ0=107        GT:DP:GQ:PL     0/0:123:20.97:0,21,174
chr1    14641   .       A       .       50.96   PASS    AC=0;AF=0.00;AN=2;DP=123;MQ=5.84;MQ0=106        GT:DP:GQ:PL     0/0:123:20.97:0,21,174
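
One way to look at the pattern in the excerpt above is to tabulate QUAL, FILTER, the genotype, and the share of depth made up of mapping-quality-zero reads for each site. The following is a minimal sketch, assuming whitespace-separated single-sample records like the ones pasted; `parse_info` and `summarize_site` are hypothetical helper names, not part of GATK. In the pasted rows, every site with QUAL "." also has the no-call genotype ./. and an MQ0 count close to DP, while sites with a numeric QUAL carry 0/0. The sketch only surfaces that pattern and does not by itself explain why FILTER reads PASS.

```python
def parse_info(info):
    """Turn 'DP=110;MQ0=104' into a dict; flag-style keys map to True."""
    out = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        out[key] = value if value else True
    return out


def summarize_site(line):
    """Return the fields relevant to the question for one VCF data line."""
    chrom, pos, _id, _ref, _alt, qual, flt, info, fmt, sample = line.split()[:10]
    tags = parse_info(info)
    genotype = dict(zip(fmt.split(":"), sample.split(":")))["GT"]
    depth, mq0 = int(tags["DP"]), int(tags["MQ0"])
    return {
        "site": f"{chrom}:{pos}",
        "qual": None if qual == "." else float(qual),  # "." means missing
        "filter": flt,
        "gt": genotype,
        "mq0_fraction": mq0 / depth,
    }


# Three rows copied from the excerpt above (whitespace collapsed).
records = [
    "chr1 14468 . G . . PASS DP=110;HaplotypeScore=0.0000;MQ=2.10;MQ0=104 GT ./.",
    "chr1 14556 . T . 32.99 LowQual AC=0;AF=0.00;AN=2;DP=101;MQ=3.17;MQ0=96 GT:DP:GQ:PL 0/0:101:3:0,3,27",
    "chr1 14640 . C . 50.96 PASS AC=0;AF=0.00;AN=2;DP=123;MQ=5.73;MQ0=107 GT:DP:GQ:PL 0/0:123:20.97:0,21,174",
]
summaries = [summarize_site(r) for r in records]
for s in summaries:
    print(s)
```

Running this over the full file and grouping by `qual is None` would show whether the missing-QUAL sites are always the no-call sites.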

Created 2014-03-13 10:30:49 | Updated | Tags: mappingqualityfilter
Comments (1)

I want to filter out reads with low mapping quality, but I can't find the arguments. Which arguments should I set, and at which step is it best to filter the low mapping quality reads? The following links describe this, but I can't find the arguments in HaplotypeCaller, IndelRealigner, BaseRecalibrator, or other tools.

http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_filters_MappingQualityFilter.html#--min_mapping_quality_score
http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_walkers_haplotypecaller_HCMappingQualityFilter.html

Could you please help me out with this problem?
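
To make the idea of a minimum mapping quality threshold concrete, here is a minimal sketch of what such a read filter does conceptually: drop alignments whose MAPQ (the fifth SAM field) is below a cutoff. This is an illustration only, not GATK's implementation; `mapq`, `keep_by_mapq`, and the three shortened reads are hypothetical.

```python
def mapq(sam_line):
    """MAPQ is the fifth whitespace-separated field of a SAM alignment line."""
    return int(sam_line.split()[4])


def keep_by_mapq(sam_lines, min_mapq):
    """Yield alignment lines with MAPQ >= min_mapq; header lines pass through."""
    for line in sam_lines:
        if line.startswith("@") or mapq(line) >= min_mapq:
            yield line


# Hypothetical, shortened alignment lines for illustration only.
reads = [
    "readA 0 chr1 100 60 50M * 0 0 ACGT IIII",
    "readB 0 chr1 120 3 50M * 0 0 ACGT IIII",
    "readC 16 chr1 140 20 50M * 0 0 ACGT IIII",
]
kept = list(keep_by_mapq(reads, min_mapq=20))
print([line.split()[0] for line in kept])
```

The cutoff of 20 is an arbitrary value chosen for the example.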


Created 2012-10-11 21:54:39 | Updated 2012-10-18 01:33:56 | Tags: unifiedgenotyper badmatefilter mappingqualityfilter
Comments (13)

It was a bit unclear what the BadMateFilter is doing in the documentation. Any information would be appreciated!


Created 2012-08-21 16:20:00 | Updated 2013-01-07 20:45:30 | Tags: mappingqualityfilter
Comments (5)

Dear all,

I am failing to obtain variant calls (.vcf) using GATK 2.0, although it works well with a similar example. I compared both and found a difference in mapping quality: the example that works has a mapping quality of 60, whereas the other has 255 (the latter was aligned with bfast, which assigns this quality).

While googling, I found that this could be because GATK doesn't take qualities of 255 into account. Is that true? http://www.biostars.org/post/show/43540/gatk-baq-and-dbsnp-option-in-countvariates/ (Note that this solution applies to GATK 1.4 and I am using GATK 2.0.)

I also checked that the reference genome was OK (and also the .bed file with the exome positions).

I repeated the process after manually changing the "255" to "60", with the same result.

Any ideas of what could be the problem?

The executed command :

java -jar /home/public/biotools/GATK_2.0/GenomeAnalysisTK.jar -T UnifiedGenotyper -R /home/public/mnt/cubix/public/biodata/hg19_norm/hg19.fa -L /home/public/mnt/cubix/public/biodata/hg19_norm/bed/all_captured_exomes_hg19.bed -I /home/public/test/outer_test/intermediate/AUT143_chr15_extract_cpy_sorted.bam -glm SNP -nt 4 -o /home/public/test/outer_test/intermediate/AUT143_chr15_extract_cpy_sorted.bam.vcf

$ samtools view AUT143_chr15_extract_cpy_sorted.bam | head
1629_189_1658_F3 0 15 20002423 255 2I48M * 0 0 TATCCAAATATCCACTTGCAGATTCCACGAAAATAGTGTTTCAAAACTGC 5:=59B98@>!!4<8C?72!!!!92::!!1373!!00!!-00!!.!!!!% XA:i:2 MD:Z:48 XE:Z:-----------3--------300-----1-----2---2----1--0-0- PG:Z:bfast RG:Z:sample IH:i:1 NH:i:1 HI:i:1 CM:i:10 NM:i:2 CQ:Z:!7B90A;/:A@<%>.>?5'.:?7>+/=)9'&22%%%&%(%%1&/%(/%%% OQ:Z:ARZ`SR!!LUU]E>!!!!RCUO!!6AM@!!44!!3?@!!6!!!!% AS:i:925 CS:Z:T03320100333301120131300020111200032211200211000203
1396_1160_190_F3 16 15 20006225 255 50M * 0 0 CTCAATCTAAAGATAGGTTCAACTCTCTGAGATGAGTGCACACATCACAA !!!!!!6/206B1!!!!38!!!!!2535:D41426372@A?86!!!!!!! XA:i:2 MD:Z:26G4T18 XE:Z:0-310--------2-3---2-00--------------------13-2-2- PG:Z:bfast RG:Z:sample IH:i:1 NH:i:1 HI:i:1 CM:i:13 NM:i:2 CQ:Z:!'''1+5-1:<A5%4.+''%8@+%1))'?%,?%1&%8%<<-.018:-%)- OQ:Z:!!!!!!RJGDRJ!!!!?M!!!!!;C?9TF57;BKBC__TG!!!!!!! AS:i:475 CS:Z:T02121311111131122132221222200022013223220032201320
....
(I also tried manually changing the 255 value in the .sam file to 60, still with no results ...)

Thanks,
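
For checking how much of a file carries MAPQ 255, a small tally of the fifth SAM field can help. In the SAM specification a MAPQ of 255 means the mapping quality is not available. The sketch below is a rough illustration: `mapq_counts` is a hypothetical helper, and the two alignments are the first nine fields of the reads pasted above, with SEQ and QUAL replaced by `*` for brevity.

```python
from collections import Counter

MAPQ_UNAVAILABLE = 255  # SAM specification: mapping quality not available


def mapq_counts(sam_lines):
    """Count alignments per MAPQ value, skipping '@' header lines."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue
        counts[int(line.split()[4])] += 1
    return counts


# Abbreviated versions of the two reads in the post (SEQ/QUAL set to '*').
alignments = [
    "1629_189_1658_F3 0 15 20002423 255 2I48M * 0 0 * *",
    "1396_1160_190_F3 16 15 20006225 255 50M * 0 0 * *",
]
counts = mapq_counts(alignments)
share_unavailable = counts[MAPQ_UNAVAILABLE] / sum(counts.values())
print(dict(counts), share_unavailable)
```

The same function could be fed the lines produced by `samtools view` to confirm whether a manual change from 255 to 60 actually reached every read in the file being genotyped.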


Created 2012-08-02 17:07:55 | Updated 2012-08-02 17:07:55 | Tags: unifiedgenotyper mappingqualityfilter
Comments (5)

Hi GATK, I am still using the GenomeAnalysisTK-1.6-5-g557da77 version for UnifiedGenotyper. This is probably a silly question, but is there a way to set a minimum mapping quality score for reads when deciding whether to evaluate them for variant detection? I know there is a --min_base_quality_score parameter, but I don't see one for mapping quality. http://www.broadinstitute.org/gsa/gatkdocs/release/org_broadinstitute_sting_gatk_walkers_genotyper_UnifiedGenotyper.html
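
One place to see which arguments a given run actually used is the `##UnifiedGenotyper=` header line that the tool writes into its VCF output; the header pasted in the EMIT_ALL_SITES thread on this page, for instance, records both `min_base_quality_score=17` and `min_mapping_quality_score=20`. The sketch below pulls a named argument out of such a line. `header_argument` is a hypothetical helper, the header is a shortened excerpt, and whether version 1.6-5 accepts the same argument is not confirmed here.

```python
import re


def header_argument(header_line, name):
    """Return the value recorded for `name` in a key=value header line, or None."""
    pattern = r"(?<!\w)" + re.escape(name) + r"=([^\s\"]+)"
    match = re.search(pattern, header_line)
    return match.group(1) if match else None


# Shortened excerpt of a UnifiedGenotyper VCF header line.
header = (
    '##UnifiedGenotyper="analysis_type=UnifiedGenotyper '
    "min_base_quality_score=17 min_mapping_quality_score=20 "
    'max_deletion_fraction=0.05"'
)
min_mapq = header_argument(header, "min_mapping_quality_score")
min_baseq = header_argument(header, "min_base_quality_score")
print(min_mapq, min_baseq)
```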