Tagged with #optimization

Created 2016-01-28 13:26:14 | Updated | Tags: haplotypecaller optimization faster

Comments (1)

Hello everyone,

I am using GATK in a clinical context for NGS diagnosis. The problem is that HaplotypeCaller takes too long (about 2 h per patient). I have tried the following:

  • Reducing the BAM file size by keeping only the genomic regions of my diagnostic genes, but it looks like it still runs over the whole hg19 genome.
  • Requesting "only variants" with the output_mode option, but the output file is exactly the same as the default one.
  • Using several CPU threads: 1 CPU = 147 min, 2 CPUs = 89 min, 3 CPUs = 80 min. I don't have that many CPUs available, so going above 2 CPUs is not worthwhile, and it is still not fast enough.
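
To put numbers on the thread scaling, the timings above can be converted into speedup and parallel efficiency relative to the single-CPU run. This is only a minimal sketch in Python; the function name `scaling_report` is illustrative and not part of GATK.

```python
def scaling_report(minutes_by_threads):
    """Return {threads: (speedup, efficiency)} relative to the 1-thread run.

    speedup    = time with 1 thread / time with N threads
    efficiency = speedup / N (1.0 would mean perfect scaling)
    """
    baseline = minutes_by_threads[1]
    report = {}
    for threads, minutes in sorted(minutes_by_threads.items()):
        speedup = baseline / minutes
        report[threads] = (speedup, speedup / threads)
    return report


# Wall-clock times in minutes, taken from the runs listed above.
timings = {1: 147, 2: 89, 3: 80}

for threads, (speedup, efficiency) in scaling_report(timings).items():
    print(f"{threads} CPU: speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

The third CPU only brings the run from 89 min down to 80 min, which is why I stopped at 2.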

I can't use the data thread option right now; would it save more time than the CPU thread option? There is also the interval option, but I don't think it would save enough time, since I have genes of interest on almost all chromosomes.

I would appreciate your guidance on this problem. How would you make the HaplotypeCaller step faster?

Many thanks in advance.