[mvapich-discuss] Problem: MPI process (rank: 0, pid: 3109) exited with status 1...

Jonathan Perkins perkinjo at cse.ohio-state.edu
Mon Jul 22 15:17:35 EDT 2013


Hello.  Can you try a debug build to see if we can get more output
from this failure?  Add the following to your configure line and
rebuild.

    --disable-fast --enable-g=dbg
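
For example, reusing the configure and make commands from your message
below (paths taken from your mail), the debug rebuild would look roughly
like this:

    ./configure --prefix=/opt/mvapich2-1.9-gnu --enable-shared \
        --enable-cuda --with-cuda=/home/liluo/lib/cuda_5.0/ \
        --disable-mcast --disable-fast --enable-g=dbg
    make -j4 && make install

With these options the library should report more detail about where
MPI_Init is failing.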

Just some things to think about.  Do you have your locked memory limit
set high enough?  You can check your current value via `ulimit -l`.  We
suggest setting this to unlimited.  Also, do you have an active firewall
between the two nodes?  Both mpirun_rsh and mpiexec need to be able to
connect to each of the machines used by the MPI application using ports
other than those used by ssh.
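
For reference, a rough way to check both of these on each node (the
limits.conf entries below are the usual ones on RHEL-style systems, and
the iptables command assumes that is the firewall in use, so adjust for
your distribution):

    # as the user that runs the MPI job; should print "unlimited"
    # (or at least a large value)
    ulimit -l

    # to raise the limit persistently, add these lines (as root) to
    # /etc/security/limits.conf and log in again:
    #   *   soft   memlock   unlimited
    #   *   hard   memlock   unlimited

    # check whether a firewall is active on either node (as root);
    # this assumes an iptables-based firewall
    iptables -L -n

If rules other than ACCEPT show up, make sure traffic between gpu1-ib
and gpu2-ib is allowed on ports beyond the one used by ssh.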

On Mon, Jul 22, 2013 at 09:08:00PM +0800, li.luo at siat.ac.cn wrote:
> Hi,
> 
> I want to use MVAPICH2 for GPU-GPU communication. I have installed MVAPICH2 1.9 (as root) on my two nodes with the following configuration:
> 
> ./configure --prefix=/opt/mvapich2-1.9-gnu --enable-shared --enable-cuda --with-cuda=/home/liluo/lib/cuda_5.0/ --disable-mcast
> 
> 
> and built it with:
> 
> make -j4
> make install
> 
> Now I want to run the cpi example under my personal account, liluo.
> 
> With np=2 on a single node, it works.
> 
> But it doesn't work across 2 nodes with the following hostfile:
> 
> gpu1-ib
> 
> gpu2-ib
> 
> The error output is the following:
> 
> 
> [liluo@gpu1 programs]$ mpirun_rsh -n 2 -hostfile hostfile ./cpi
> [cli_0]: aborting job:
> Fatal error in MPI_Init:
> Other MPI error
> 
> [gpu1:mpispawn_0][child_handler] MPI process (rank: 0, pid: 3109) exited with status 1
> [gpu1:mpispawn_0][readline] Unexpected End-Of-File on file descriptor 5. MPI process died?
> [gpu1:mpispawn_0][mtpmi_processops] Error while reading PMI socket. MPI process died?
> [cli_1]: aborting job:
> Fatal error in MPI_Init:
> Other MPI error
> 
> [gpu2:mpispawn_1][readline] Unexpected End-Of-File on file descriptor 5. MPI process died?
> [gpu2:mpispawn_1][mtpmi_processops] Error while reading PMI socket. MPI process died?
> [gpu2:mpispawn_1][child_handler] MPI process (rank: 1, pid: 3144) exited with status 1
> 
> 
> //////////
> I use node gpu2-ib as the host node.
> I can successfully ping gpu1-ib from gpu2-ib.
> 
> Both the installation folder /opt/mvapich2-1.9-gnu and the current folder (where ./cpi is located) on node gpu2-ib have been exported to node gpu1-ib.
> 
> What can I do?
> --
> Li Luo
> Shenzhen Institutes of Advanced Technology
> Address: 1068 Xueyuan Avenue, Shenzhen University Town, Shenzhen, P.R.China
> Tel: +86-755-86392312,+86-15899753087
> Email: li.luo at siat.ac.cn
> 
> 
> 

> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss


-- 
Jonathan Perkins
http://www.cse.ohio-state.edu/~perkinjo


