I browsed through the forum and found that some users have the same problem as me. This concerns the BaseRecalibrator walker. My command works fine for generating the -grp file, and PrintReads works fine for producing the new BAM files. However, in the standard error files, I found that a large proportion of reads fail the MappingQualityZeroFilter.
INFO 05:37:33,854 ProgressMeter - Total runtime 51462.91 secs, 857.72 min, 14.30 hours
INFO 05:37:33,855 MicroScheduler - 263828269 reads were filtered out during the traversal out of approximately 660528125 total reads (39.94%)
INFO 05:37:33,855 MicroScheduler - -> 0 reads (0.00% of total) failing DuplicateReadFilter
INFO 05:37:33,856 MicroScheduler - -> 0 reads (0.00% of total) failing FailsVendorQualityCheckFilter
INFO 05:37:33,856 MicroScheduler - -> 0 reads (0.00% of total) failing MalformedReadFilter
INFO 05:37:33,856 MicroScheduler - -> 0 reads (0.00% of total) failing MappingQualityUnavailableFilter
INFO 05:37:33,857 MicroScheduler - -> 262748250 reads (39.78% of total) failing MappingQualityZeroFilter
INFO 05:37:33,857 MicroScheduler - -> 1080019 reads (0.16% of total) failing NotPrimaryAlignmentFilter
INFO 05:37:33,857 MicroScheduler - -> 0 reads (0.00% of total) failing UnmappedReadFilter
INFO 05:37:35,741 GATKRunReport - Uploaded run statistics report to AWS S3
This makes me hesitate to move on.
My pipeline strictly follows the GATK Best Practices, using BWA-MEM for alignment, and samtools showed that over 90% of reads were mapped to the reference genome. I understand that this may be beyond the scope of support, but I really do not know how to deal with this problem. How should I tackle it? Can I move on with this? If not, where should I start looking?
I would like to hear your suggestions.
Hello GATK team,
BaseRecalibrator applies the filters DuplicateReadFilter and MappingQualityZeroFilter. I've noticed that in the BAM after PrintReads, most of those reads were indeed filtered out, but a few of them were left: about 2% of reads are marked as duplicates by Picard, and 4% of reads have a mapping quality of zero.
What exactly happens when a tool applies a filter?
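For anyone who wants to check such counts independently of the GATK log, here is a minimal Python sketch. It is not how GATK computes its own tally; it only reads SAM text (for example, produced with `samtools view -o recal.sam recal.bam`, where both file names are made up for illustration) and counts reads by the standard SAM FLAG bits and the MAPQ column. The function name `tally_reads` is hypothetical as well.

```python
import sys
from collections import Counter

# Standard SAM FLAG bits (from the SAM specification).
FLAG_UNMAPPED = 0x4
FLAG_SECONDARY = 0x100
FLAG_DUPLICATE = 0x400


def tally_reads(lines):
    """Count reads by category from an iterable of SAM text lines.

    Hypothetical helper: header lines (starting with '@') and blank
    lines are skipped. 'mapq0' counts only mapped reads with MAPQ 0,
    so unmapped reads are reported separately.
    """
    counts = Counter()
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # column 2: bitwise FLAG
        mapq = int(fields[4])  # column 5: mapping quality
        counts["total"] += 1
        if flag & FLAG_UNMAPPED:
            counts["unmapped"] += 1
        elif mapq == 0:
            counts["mapq0"] += 1
        if flag & FLAG_SECONDARY:
            counts["secondary"] += 1
        if flag & FLAG_DUPLICATE:
            counts["duplicate"] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (assumed file name): python tally_reads.py recal.sam
    with open(sys.argv[1]) as handle:
        result = tally_reads(handle)
    total = result["total"] or 1  # avoid division by zero on empty input
    for key in ("total", "mapq0", "duplicate", "secondary", "unmapped"):
        print(f"{key}\t{result[key]}\t{100.0 * result[key] / total:.2f}%")
```

Running this on the BAM before and after PrintReads would show whether the duplicate and MAPQ 0 reads are really absent from the output file or were only skipped while the tool was traversing the reads.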