[mvapich-discuss] cannot allocate CQ

Joshua Bernstein jbernstein at penguincomputing.com
Wed Jun 25 14:25:31 EDT 2008



Amit H Kumar wrote:
> Hi Joshua,
> 
> Thank you for the detailed explanation. More comments below.
> 
>> I know that Christian already answered your question to show you how to
>> get it to work, but I thought it would be valuable to explain WHY adding
>> the ulimit setting to the SGE script works.
>>
>> The difference is that when you run the job outside of SGE, sshd is
>> the daemon that actually forks off your executable. The ulimit setting
>> is inherited from sshd, and thus the program is able to execute.
>>
>> When you launch the job from inside of SGE, however, sge_execd is
>> actually responsible for forking off your executable, so it must also
>> have a proper ulimit setting. This idea applies to most schedulers, not
>> just SGE, so for TORQUE (or even PBSPro), you'd have to apply the ulimit
>> setting to the pbs_mom daemon.
>>
> That's interesting. I was under the impression that, since SGE uses SSH/RSH
> to log on to the compute nodes, it would pick up the settings modified
> in /etc/init.d/sshd. But now I believe that's not how it works. Since the
> sge_execd daemon is the one that forks the SSH process, it is this
> sge_execd process that has to have the unlimited memlock capability.

Yes. Honestly, I'm less familiar with the way SGE works, but in the 
TORQUE world, if you are running an MPI library with tm (task manager) 
support, like OpenMPI, or if you use mpiexec for MPICH, SSH isn't 
actually used by the scheduler. Instead, a message is sent to the 
remote node's scheduler daemon asking it to fork the process, rather 
than running SSH. I imagine something similar is happening with SGE. 
You can check whether SSH is being used by looking at the output of ps 
while a job is running. If SSH is a child process of sge_execd, then 
you know that SSH inherits the ulimit from sge_execd. If you see that 
your application is a direct child of the sge_execd daemon, then you 
know that some mechanism is telling the remote sge_execd processes to 
fork the child processes for you.
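
As a rough sketch of that ps check, assuming a Linux compute node with 
/proc mounted, the Python below walks from a process up through its 
parents and prints each ancestor's name and locked-memory soft limit. 
The helper names parent_chain and memlock_limit are made up for this 
illustration and are not part of SGE, TORQUE, or MVAPICH. The demo 
inspects the script itself; on a real node you would pass the PID of a 
running MPI rank instead.

```python
import os


def parent_chain(pid):
    """Return [(pid, name), ...] from pid up through its ancestors (Linux /proc only)."""
    chain = []
    while pid > 0:
        fields = {}
        try:
            with open(f"/proc/{pid}/status") as status:
                for line in status:
                    key, _, value = line.partition(":")
                    fields[key] = value.strip()
        except OSError:
            break  # process has exited or is not visible to us
        chain.append((pid, fields.get("Name", "?")))
        pid = int(fields.get("PPid", "0"))  # PPid is 0 at the top of the tree
    return chain


def memlock_limit(pid):
    """Return the 'Max locked memory' soft limit of pid as text, or None if unreadable."""
    try:
        with open(f"/proc/{pid}/limits") as limits:
            for line in limits:
                if line.startswith("Max locked memory"):
                    return line.split()[3]  # columns: name (3 words), soft, hard, units
    except OSError:
        pass
    return None


if __name__ == "__main__":
    # Hypothetical usage: replace os.getpid() with the PID of an MPI rank on the node.
    for pid, name in parent_chain(os.getpid()):
        print(f"{pid:>7}  {name:<16}  memlock={memlock_limit(pid)}")
```

If sshd shows up between your application and sge_execd, SSH is in the 
launch path. Either way, the memlock column shows which daemon in the 
chain the limit came from.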

-Joshua Bernstein

> Joshua, Is my reasoning correct?
> 
> Also have one more question:
> 
> Why didn't we have to do this before, for MVAPICH2 versions lower than 1.0?
> What architectural changes or interfaces (e.g. VAPI) mandate this?
> 
> Thanks!
> «Amit»
> 
> 

