I want to find SNPs that are unique to my case samples and do not appear in my control samples. More precisely, their enrichment in the case samples should be statistically significant, as assessed by something like Fisher's exact test.
I have many BAM files for control samples and many for case samples. How can I run this?
This is bacterial DNA, so as far as we know, there is a single copy of each chromosome.
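To make the test I have in mind concrete, here is a minimal sketch of a one-sided Fisher's exact test for a single SNP site. It assumes each haploid sample has already been reduced to "carries the alt allele" or "does not", so the input is just four counts. The function name `fisher_enrichment_p` and the example counts are made up for illustration and are not part of any tool.

```python
from math import comb

def fisher_enrichment_p(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """One-sided Fisher's exact test p-value for enrichment of the
    alt allele in cases, from per-sample presence/absence counts."""
    n_case = case_alt + case_ref          # number of case samples
    n_alt = case_alt + ctrl_alt           # samples carrying the alt allele
    total = n_case + ctrl_alt + ctrl_ref  # all samples
    denom = comb(total, n_case)
    max_k = min(n_case, n_alt)
    # Sum the hypergeometric probabilities of every table with at least
    # as many alt-carrying case samples as observed.
    numer = sum(
        comb(n_alt, k) * comb(total - n_alt, n_case - k)
        for k in range(case_alt, max_k + 1)
    )
    return numer / denom

# Hypothetical site: alt allele in 5 of 5 cases and 0 of 5 controls.
p = fisher_enrichment_p(5, 0, 0, 5)
print(f"p = {p:.5f}")
```

Running this once per site over the whole genome would give many p-values, so some multiple-testing correction would be needed before calling any site significant.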
Hi, I was wondering if you have any advice on the effects of including samples of different ancestry in multi-sample variant calling. We run PCA after calling variants to identify potential outliers. If we identify one, do we have to redo the SNP and indel calling step?
Any help is most appreciated.
I ran UG in both haploid mode (-ploidy flag set to 1) and the default diploid mode to call SNPs between a reference assembly and sequence reads that come from the same haploid genome.
In haploid mode I get about 125 hom-alt calls; in diploid mode I get about 1,140 het and 120 hom-alt SNPs. Since the reads and the reference come from the same haplotype, I don't expect any variation at all. What could explain the noise I observe?
Thank you for your input!