[mvapich-discuss] mvapich2 and MPI subjobs

James R. Leek leek2 at llnl.gov
Fri Apr 15 14:54:00 EDT 2011


In a previous email Jonathan Perkins suggested I upgrade from 
mvapich-1.2rc1 to mvapich2-1.6 in order to get a useful feature.  I had 
problems in the past building mvapich2 on the machine in question due to 
my own inexperience.  However, this time, with what I learned from building 
mvapich-1.2rc1, I was able to get mvapich2-1.6 to build.

However, I have an issue.  What I really need is to be able to launch 
MPI jobs from inside an MPI job in the same allocation.  This is why I 
can't use the officially supported copy of OpenMPI on the system.  (You 
can't launch MPI subjobs from inside an MPI job with OpenMPI.)  This 
works for me with mvapich-1.2rc1, but it is not working with mvapich2-1.6.

So, if I just run a regular MPI job, it works:

[leek2 mpitest2]$ mpiexec -np 1 ./mpi-helloworld
Hello World from 0!
COOP_name: hi

However, it fails when I go through my mpilaunch program, which starts 
MPI, then forks and execs a copy of mpi-helloworld, as above:


[leek2 mpitest2]$ mpiexec -np 1 ./mpilaunch $PBS_NODEFILE
PBS_NODEFILE: /var/spool/PBS/aux/385337.madm2
forking
Waiting on pid 31473
execing mpi-helloworld
[proxy:0:0 at m0327.mana] Ctrl-C caught... cleaning up processes
[leek2 mpitest2]$ $?

It just fails and claims "Ctrl-C caught."  I sure didn't hit Ctrl-C.  
Does anyone know what might be causing this?
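
For reference, the launch step that prints "forking", "Waiting on pid" 
and "execing" above follows the usual fork/exec/wait pattern.  Below is a 
rough Python sketch of that pattern only.  It is not the actual mpilaunch 
source, it leaves out MPI initialization entirely, and the function name 
launch_and_wait is made up for illustration.  It also reports whether the 
child exited normally or was killed by a signal, which may help narrow 
down where an unexpected interrupt is coming from.

```python
import os


def launch_and_wait(argv):
    """Fork, exec argv in the child, and wait for it in the parent.

    Returns ("exit", code) if the child exited normally, or
    ("signal", signum) if it was terminated by a signal.
    Illustrative only: the real mpilaunch would call MPI_Init first.
    """
    print("forking", flush=True)
    pid = os.fork()
    if pid == 0:
        # Child: replace this process image with the target program.
        try:
            print("execing", argv[0], flush=True)
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec failed; never return into parent code.
            os._exit(127)

    # Parent: block until the child finishes and decode its status.
    print("Waiting on pid", pid, flush=True)
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        return ("signal", os.WTERMSIG(status))
    return ("exit", os.WEXITSTATUS(status))


# Hypothetical usage: print(launch_and_wait(["./mpi-helloworld"]))
```

A result of ("signal", 2) would mean the child itself was killed by 
SIGINT, while ("exit", N) would mean it ended on its own with status N.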

Thanks,

-- 
Jim Leek
leek2 at llnl.gov


