Scatter/Gather utilizing multiple cores in the same node/job
Posted in Ask the team | Last updated on 2013-04-22 21:03:16

Comments (2)

At the Minnesota Supercomputing Institute, our environment requires that jobs on our high performance clusters reserve an entire node. I have implemented my own Torque Manager/Runner for our environment based on the Grid Engine Manager/Runner. The way I have gotten this to work in our environment is to set the nCoresRequest for the scatter/gather method to the minimum required of eight. My understanding is that for the InDelRealigner, for example, the job reserves a node with eight cores, but only uses one. That means our users would have their compute time allocation consumed eight times faster than is necessary.

What I am wondering is are there options that I am missing where some number of the scatter/gather requests can be grouped into a single job submission? If I were writing this as a PBS script for our environment and I wanted to use 16 cores in a scatter/gather implementation, I would write two jobs, each with eight commands. They would look something like the following:

#PBS Job Configuration stuff
pbsdsh -n 0 java -jar ... &
pbsdsh -n 1 java -jar ... &
pbsdsh -n 2 java -jar ... &
pbsdsh -n 3 java -jar ... &
pbsdsh -n 4 java -jar ... &
pbsdsh -n 5 java -jar ... &
pbsdsh -n 6 java -jar ... &
pbsdsh -n 7 java -jar ... &

Has anyone done something similar in Queue? Any pointers? Thanks in advance!

Return to top Comment on this article in the forum