I want to filter out reads with low mapping quality, but I can't find the relevant arguments. Which arguments should I set, and at which step is it best to filter out the low-mapping-quality reads? The following link describes this, but I can't find the arguments in HaplotypeCaller, IndelRealigner, BaseRecalibrator, or the other tools.
Could you please help me out with this problem?
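To make clear what I mean by filtering, here is a minimal Python sketch of the behavior I am after, applied to SAM text rather than inside GATK. The function name `filter_sam_by_mapq` and the default threshold of 20 are illustrative assumptions of mine, not GATK arguments; the sketch only relies on MAPQ being the fifth tab-separated field of a SAM alignment line.

```python
def filter_sam_by_mapq(lines, min_mapq=20):
    """Yield SAM header lines unchanged, plus alignment lines with MAPQ >= min_mapq.

    Hypothetical helper for illustration only; it is not part of GATK or samtools.
    """
    for line in lines:
        # Header lines start with '@' and are passed through untouched.
        if line.startswith("@"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # A valid alignment line has at least 11 mandatory fields; skip anything shorter.
        if len(fields) < 11:
            continue
        # MAPQ is the 5th column (index 4). Note that 255 means "not available"
        # in the SAM specification, yet it still passes a simple >= comparison.
        if int(fields[4]) >= min_mapq:
            yield line
```

It could be fed the output of `samtools view -h file.bam`, for example via `sys.stdin`. As far as I know, `samtools view -q INT` performs the same kind of minimum-MAPQ filtering directly on a BAM file, but I would prefer to do this within the GATK tools if possible.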
I am failing to obtain variant calls (.vcf) with GATK 2.0, although a similar example works well. Comparing the two, I found a difference in mapping quality: the example that works has a mapping quality of 60, whereas the failing one has 255 (the latter was aligned with bfast, which assigns this value).
Googling, I found that this could be because GATK does not take reads with a mapping quality of 255 into account. Is that true? http://www.biostars.org/post/show/43540/gatk-baq-and-dbsnp-option-in-countvariates/ (Note that this solution applies to GATK 1.4, and I am using GATK 2.0.)
I also checked that the reference genome is OK (and also the .bed file with the exome positions).
I repeated the process after manually changing the "255" to "60", with the same result.
Any ideas what the problem could be?
The command I executed:
java -jar /home/public/biotools/GATK_2.0/GenomeAnalysisTK.jar -T UnifiedGenotyper -R /home/public/mnt/cubix/public/biodata/hg19_norm/hg19.fa -L /home/public/mnt/cubix/public/biodata/hg19_norm/bed/all_captured_exomes_hg19.bed -I /home/public/test/outer_test/intermediate/AUT143_chr15_extract_cpy_sorted.bam -glm SNP -nt 4 -o /home/public/test/outer_test/intermediate/AUT143_chr15_extract_cpy_sorted.bam.vcf
$ samtools view AUT143_chr15_extract_cpy_sorted.bam | head
1629_189_1658_F3 0 15 20002423 255 2I48M * 0 0 TATCCAAATATCCACTTGCAGATTCCACGAAAATAGTGTTTCAAAACTGC 5:=59B98@>!!4<8C?72!!!!92::!!1373!!00!!-00!!.!!!!% XA:i:2 MD:Z:48 XE:Z:-----------3--------300-----1-----2---2----1--0-0- PG:Z:bfast RG:Z:sample IH:i:1 NH:i:1 HI:i:1 CM:i:10 NM:i:2 CQ:Z:!7B90A;/:A@<%>.>?5'.:?7>+/=)9'&22%%%&%(%%1&/%(/%%% OQ:Z:A
]E>!!!!RCUO!!6AM@!!44!!3?@!!6!!!!% AS:i:925 CS:Z:T03320100333301120131300020111200032211200211000203
1396_1160_190_F3 16 15 20006225 255 50M * 0 0 CTCAATCTAAAGATAGGTTCAACTCTCTGAGATGAGTGCACACATCACAA !!!!!!6/206B1!!!!38!!!!!2535:D41426372@A?86!!!!!!! XA:i:2 MD:Z:26G4T18 XE:Z:0-310--------2-3---2-00--------------------13-2-2- PG:Z:bfast RG:Z:sample IH:i:1 NH:i:1 HI:i:1 CM:i:13 NM:i:2 CQ:Z:!'''1+5-1:<A5%4.+''%8@+%1))'?%,?%1&%8%<<-.018:-%)- OQ:Z:!!!!!!RJGDRJ!!!!?M!!!!!;C?9T
F57;BKBC__TG!!!!!!! AS:i:475 CS:Z:T02121311111131122132221222200022013223220032201320
(I also tried manually changing the 255 value to 60 in the .sam file, but still got no results ...)
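The manual edit mentioned above could be scripted along the following lines. This is an illustrative sketch, not the exact procedure used: the function name `reassign_mapq` and its parameters are hypothetical, and it assumes tab-delimited SAM text (for example from `samtools view -h`) with MAPQ in the fifth column.

```python
def reassign_mapq(lines, old_mapq=255, new_mapq=60):
    """Yield SAM lines, replacing MAPQ == old_mapq with new_mapq on alignment lines.

    Hypothetical helper for illustration only; header lines and all other
    fields are left untouched.
    """
    for line in lines:
        # Header lines start with '@' and carry no MAPQ field.
        if line.startswith("@"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # Only rewrite well-formed alignment lines whose MAPQ matches exactly.
        if len(fields) >= 11 and fields[4] == str(old_mapq):
            fields[4] = str(new_mapq)
            yield "\t".join(fields) + "\n"
        else:
            yield line
```

The rewritten text would then still need to be converted back to a sorted, indexed BAM file before being passed to GATK.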
Hi GATK team - I am still using the GenomeAnalysisTK-1.6-5-g557da77 version for UnifiedGenotyper. This is probably a silly question, but is there a way to set a minimum mapping quality score for reads when deciding whether to evaluate them for variant detection? I know there is a --min_base_quality_score parameter, but I don't see one for mapping quality. http://www.broadinstitute.org/gsa/gatkdocs/release/org_broadinstitute_sting_gatk_walkers_genotyper_UnifiedGenotyper.html