


Created 2014-01-23 10:42:02 | Tags: exome ug ulimit
Comments (3)

Hi there

I am trying to run UnifiedGenotyper (UG) across just over 2,200 individuals (exome sequencing). I have already done this successfully on our computing cluster with just over 1,000 samples; the only issue was that the limit on the number of open files (ulimit) had to be increased.

I have since had the ulimit raised again so that I can run UG on the larger set. However, the 2,200 input samples are pushing our I/O over the edge. I have two questions:

  • Does UG open all of the input BAM files at the same time? It seems so, since a ulimit of 2048 was not sufficient for 2,200 input files.
  • Is there a way to optimise this, possibly by getting UG to open files sequentially, or do they all have to be open at the same time? I suspect this will become more of a problem as the available datasets grow.
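
For anyone wanting to check the same thing on their own cluster, here is a minimal sketch (Python, standard library only, Unix) of comparing the open-file limit with the number of inputs. The helper names, the `files_per_input` parameter and the `reserve` of 64 descriptors are my own assumptions for illustration, not anything UG reports. It also assumes every input is held open at once, which is exactly what the first question asks.

```python
import resource


def descriptors_needed(n_inputs, files_per_input=1, reserve=64):
    """Rough estimate of file descriptors needed if all inputs are open at once.

    files_per_input and reserve are illustrative assumptions: a tool may also
    hold an index per input, plus a few descriptors for logs, references, etc.
    """
    return n_inputs * files_per_input + reserve


def fits_limit(n_inputs, soft_limit, files_per_input=1, reserve=64):
    """Return True if the estimate fits under the given soft limit."""
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return descriptors_needed(n_inputs, files_per_input, reserve) <= soft_limit


if __name__ == "__main__":
    # Soft and hard limits for open files in the current process,
    # i.e. what `ulimit -n` and `ulimit -Hn` report in the shell.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("soft limit:", soft, "hard limit:", hard)
    print("2,200 inputs fit under soft limit:", fits_limit(2200, soft))
```

With these assumptions, 2,200 inputs do not fit under a limit of 2048, while 1,000 inputs do, which matches what I saw.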

I would appreciate any advice on getting this to run on data of this size. Thanks!