I notice that although UnifiedGenotyper and HaplotypeCaller identify indels, none are reported with a length greater than around 50bp. Obviously integrating this ability into the already complicated algorithms is not easy, nor are such variants especially common - but they can easily be biologically important, and they can interfere with SNP calls, e.g. if a SNP is in a heterozygous deletion.
Do you know of any callers for larger indels/CNVs, specifically for multi-sample NGS projects? I've looked at Pindel and BreakDancer (and probably a few others) but they seem to be mostly for single samples.
I am interested in finding copy number variation in my samples. I have looked for SNPs and indels with GATK UnifiedGenotyper (I still have to try HaplotypeCaller). Is there a walker in GATK to find CNVs (duplications or deletions)?
Hope to hear from you soon.
Hello Geraldine et al,
I have a question about CNV calling, which you might or might not have an answer for. We're doing a case/control analysis on two cohorts, and one of the analyses we'd like to carry out is an examination of CNV length - one thing we want to do is analyse by genome (average site above/below average, say), and by region (same general idea).
While calling SNPs and indels seems straightforward enough with UG, I wonder if you have a best practice for calling CNVs - or rather, a candidate for what might become an integrated best practice? Maybe I can even help with maturation/integration.
Thanks for your relentless work on the site, tools and community; this is the place to be.