Tagged with #speed
0 documentation articles | 1 announcement | 1 forum discussion


No posts found with the requested search criteria.
Comments (2)

We all know how HaplotypeCaller analyses can take a long time. IBM is now providing a native implementation of the PairHMM algorithm that leverages the new hardware available in their POWER8 systems. This optimization currently work on the following systems: Ubuntu14 and RHEL7 with POWER8.

To take advantage of this optimization, you need to do the following:

Here is an example for running on a P8 system with Ubuntu:

java -Xmx32g -Djava.library.path=/path/to/PairHMM_P8_Ubuntu -jar $GATK_PATH/GenomeAnalysisTK.jar \
-T HaplotypeCaller \
-R $REFERENCE -I $INPUT_BAM --dbsnp $SNP_VCF \
-stand_emit_conf 10 -stand_call_conf 50 \
--pair_hmm_implementation VECTOR_LOGLESS_CACHING \
-o $OUTPUT_VCF

You can expect a speedup in the range of 1-1.7x depending on the hardware, OS and test cases.

If you have any questions or issues (aside from downloading the file), please contact Yinhue Cheng at IBM (ycheng@us.ibm.com).

Comments (4)

Hi,

I'm running BaseRecalibrator on a full lane of HiSeq. I set the -nct switch to 8, but I see that the average load is <1. Why is this so? Doesn't the BaseRecalibrator use threads properly?

I've seen this thread http://gatkforums.broadinstitute.org/discussion/1263/baserecalibrator-trade-off-between-runtime-and-accuracy on how to speed up the process by downsampling. Is this a better option considering I have >200M reads?

BTW I'm using version 2.2-8 - do newer versions implement -num_threads?

I would really love to reduce the runtime from 36hrs.

Many thanks.