Tagged with #python
0 documentation articles | 0 announcements | 2 forum discussions

No articles to display.

Created 2014-09-29 15:28:12 | Updated | Tags: parallelism python

Comments (3)

Hi, I have 190 samples that I am running through the GATK DNAseq pipeline following the Best Practices. Since only a few genes were sequenced for each sample, the alignment files are very small (0.5 GB BAM files), but even so, processing each sample takes about 3 hours. Is there a way to parallelize the processing of the individual samples on a multi-core machine? Since each sample is processed independently of the others, running them in parallel should not change the results. Python's multiprocessing module has a Pool class that could be used for this. I tried it, but it does not seem to work for me. Does the GATK team have any guidance or information on this issue? Thanks,

  • Pankaj
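
For reference, the Pool approach mentioned above could look roughly like the sketch below. It is only an illustration under stated assumptions, not GATK guidance: `build_command`, `run_sample`, `run_all` and the sample names are hypothetical, and the command is a harmless Python one-liner standing in for the real per-sample pipeline call. It uses `multiprocessing.pool.ThreadPool` rather than the process-based `Pool`, since the heavy work already runs in separate external processes and the worker threads only wait on them.

```python
import subprocess
import sys
from multiprocessing.pool import ThreadPool


def build_command(sample):
    # Hypothetical placeholder: replace with the real per-sample pipeline command.
    # A harmless Python one-liner is used here so the sketch runs anywhere.
    return [sys.executable, "-c", "import sys; print('processed', sys.argv[1])", sample]


def run_sample(sample):
    # Launch one sample's command as an external process and wait for it.
    result = subprocess.run(build_command(sample), capture_output=True, text=True)
    return sample, result.returncode, result.stdout.strip()


def run_all(samples, workers=4):
    # At most `workers` samples run at the same time; results keep input order.
    with ThreadPool(processes=workers) as pool:
        return pool.map(run_sample, samples)


if __name__ == "__main__":
    for sample, code, output in run_all(["sample_001", "sample_002", "sample_003"]):
        print(sample, code, output)
```

The `workers` value would normally be chosen from the number of cores and the memory each job needs. Checking the return code per sample makes it easier to see which jobs failed instead of losing the error inside the pool.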

Created 2012-12-28 07:08:02 | Updated 2013-01-07 19:14:29 | Tags: developer python wrappers

Comments (5)

Hi all,

Wouldn't it be useful to have GATK wrapped in a Python API, say pygatk, the way pysam wraps samtools and pybedtools wraps bedtools? Is anybody developing this?

Regards, Pablo.