I am using GATK through the Galaxy main server to analyze variation from whole-genome re-sequencing of several samples of a non-model species (nematode worms). I would like to know whether Galaxy's GATK tools can produce a kind of pileup of the genome (base by base or by interval, like a .bed file) indicating specifically which bases were callable or not by Unified Genotyper (UG), as "CallableLoci" does. The log and metrics files generated by UG in Galaxy give general statistics on callable loci, but there is no file giving detailed information on the eligibility of each base.
Along the same lines, I would like to get a per-locus depth of coverage (which can partially help answer my previous question, although it does not take into account all the filters used by UG, such as base quality, mapping quality, etc.). This tool is available on Galaxy. However, I am performing 3 rounds of BQSR to get my final VCF file. Should I calculate the depth of coverage using the first BAM file, before BQSR, or the last recalibrated BAM file obtained in the 3rd round of BQSR? I don't think BQSR alters the coverage, so I would say this shouldn't matter. Am I right?
Thanks in advance for your help and advice, Fabrice
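To make the desired output concrete, here is a minimal sketch of how per-locus depth values could be collapsed into BED-style intervals. It is only an approximation under stated assumptions: the function name `depth_to_intervals`, the `min_depth` threshold of 10, and the two labels are hypothetical choices for illustration, and a depth cutoff alone does not reproduce the base-quality and mapping-quality filters that UG applies.

```python
def depth_to_intervals(records, min_depth=10):
    """Collapse per-locus depth into BED-style intervals.

    records: iterable of (contig, pos, depth), with 1-based positions,
             sorted by contig and then by position.
    min_depth: hypothetical cutoff, not a GATK default.
    Yields (contig, start, end, label) with 0-based, half-open coordinates.
    """
    current = None  # [contig, start, end, label] of the interval being built
    for contig, pos, depth in records:
        label = "CALLABLE" if depth >= min_depth else "LOW_COVERAGE"
        extends = (
            current is not None
            and current[0] == contig
            and current[2] == pos - 1  # position is adjacent to the open interval
            and current[3] == label
        )
        if extends:
            current[2] = pos
        else:
            if current is not None:
                yield tuple(current)
            current = [contig, pos - 1, pos, label]
    if current is not None:
        yield tuple(current)


if __name__ == "__main__":
    # Toy data, not real coverage values.
    toy = [("I", 1, 12), ("I", 2, 15), ("I", 3, 3), ("I", 4, 2), ("I", 6, 20)]
    for interval in depth_to_intervals(toy):
        print("\t".join(str(field) for field in interval))
```

Positions missing from the input (position 5 in the toy data) simply break the interval, so gaps in the depth table are not silently labeled.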
I have a set of CNV regions, and I would like to see how much my samples overlap with those regions. So I used DepthOfCoverage from GATK and filled the intervals parameter with my CNVs, but I get the following error:
ERROR MESSAGE: Badly formed genome loc: Contig 3 given as location, but this contig isn't present in the Fasta sequence dictionary
I am sure the reference file is correct. I think it could be because these CNVs are large, so parts of them could fall outside the reference contig. Is there any way to convert my CNVs into a proper, hg19-compatible BED file, or is there another tool that can help me with that?
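One way to check both possibilities is to compare each interval against the reference's sequence dictionary. The sketch below is a hedged illustration, not a GATK feature: the helper names `load_contig_lengths` and `fix_interval` are hypothetical, and it assumes the mismatch is either a naming difference (for example "3" versus "chr3") or an interval running past the end of a contig. It reads the `@SQ` lines of a `.dict` file or SAM header, which carry the contig name (`SN`) and length (`LN`).

```python
def load_contig_lengths(dict_lines):
    """Parse @SQ lines of a sequence dictionary into {contig_name: length}."""
    lengths = {}
    for line in dict_lines:
        if not line.startswith("@SQ"):
            continue
        # Fields look like "SN:chr3" or "LN:1000"; split on the first colon only.
        fields = dict(f.split(":", 1) for f in line.rstrip("\n").split("\t")[1:])
        lengths[fields["SN"]] = int(fields["LN"])
    return lengths


def fix_interval(contig, start, end, lengths):
    """Return (contig, start, end) matched to the dictionary, or None.

    Assumption: an unknown contig may differ only by a "chr" prefix.
    Coordinates are BED-style (0-based start, exclusive end).
    """
    if contig not in lengths:
        candidate = contig[3:] if contig.startswith("chr") else "chr" + contig
        if candidate not in lengths:
            return None  # contig cannot be matched to the reference
        contig = candidate
    start = max(0, start)
    end = min(end, lengths[contig])  # clip intervals that overrun the contig
    if start >= end:
        return None  # nothing left after clipping
    return contig, start, end


if __name__ == "__main__":
    # Toy dictionary with a made-up length, not real hg19 values.
    lengths = load_contig_lengths(["@HD\tVN:1.5\n", "@SQ\tSN:chr3\tLN:1000\n"])
    print(fix_interval("3", 900, 1200, lengths))
```

Intervals for which `fix_interval` returns `None` are worth inspecting by hand rather than dropping silently, since they point to a contig that is genuinely absent from the reference.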