Re: [mvapich-discuss] "PMI Lookup name failed" when RDMA CM is used

Xie Min xmxmxie at gmail.com
Wed Mar 25 07:21:44 EDT 2009


We ran into the same problem, so we set PMI_DEBUG=1 to see the output of
Slurm's PMI. mvapich2 passes "ip<rank> " as the key to PMI_KVS_Get(); note
the trailing blank in that parameter.
After we deleted the blank in rdma_cm.c so that the key is "ip<rank>",
Slurm's PMI seems to work.
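
To make the difference concrete, here is a minimal sketch of the two key
formats. It is not the actual code from rdma_cm.c: the helper names
make_ip_key() and keys_match() are made up for illustration, and the exact
string comparison is only an assumption about how a lookup might treat the
key. It shows only that "ip<rank> " and "ip<rank>" are different strings.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from rdma_cm.c): format the KVS key for a rank.
 * A non-zero with_blank reproduces the key with the trailing blank. */
static int make_ip_key(char *buf, size_t len, int rank, int with_blank)
{
    return snprintf(buf, len, with_blank ? "ip%d " : "ip%d", rank);
}

/* Assumption for illustration: a lookup that compares keys exactly.
 * Returns 1 when the two keys are identical strings, 0 otherwise. */
static int keys_match(const char *stored, const char *requested)
{
    return strcmp(stored, requested) == 0;
}
```

Calling make_ip_key(buf, sizeof buf, 1, 1) fills buf with "ip1 ", while
passing 0 as the last argument gives "ip1"; keys_match() treats the two as
different keys.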

2009/3/25 Marcus R. Epperson <mrepper at sandia.gov>:
> We have an Infiniband cluster which will require the use of RDMA CM, and
> which uses the Slurm resource manager for job launch.  I'm trying to
> verify that mvapich2-1.2p1 will work with this combination but I'm not
> having much luck so far.
>
> I am able to run successfully when I don't enable mvapich2's RDMA CM
> option (this won't be possible long-term though):
>
> $ srun --mpi=none -w 'c1,c3' ./mpi_hello
>   Hello, I am node c1 with rank 0
>   Hello, I am node c3 with rank 1
>
> But when I enable it I get this:
>
> $ export MV2_USE_RDMA_CM=1
> $ srun --mpi=none -w 'c1,c3' ./mpi_hello
>   [1] Abort: PMI Lookup name failed
>    at line 810 in file rdma_cm.c
>   [0] Abort: PMI Lookup name failed
>    at line 810 in file rdma_cm.c
>   srun: error: c1: task 0: Exited with exit code 253
>   srun: error: c3: task 1: Exited with exit code 253
>
> I believe these nodes are configured correctly according to #6.4 here:
>
> http://mvapich.cse.ohio-state.edu/support/user_guide_mvapich2-1.2.html#x1-300006.4
>
> i.e. IPoIB is set up:
>
> # pdsh -w 'c[1,3]' "ifconfig ib0 | grep inet.addr"
> c1:   inet addr:192.168.2.1  Bcast:192.168.2.255  Mask:255.255.255.0
> c3:   inet addr:192.168.2.3  Bcast:192.168.2.255  Mask:255.255.255.0
>
> and mv2.conf is present on each node:
>
> # pdsh -w 'c[1,3]' "cat /etc/mv2.conf"
> c1: 192.168.2.1
> c3: 192.168.2.3
>
> Have I missed something, or is this a bug?  If it's a bug, is it with
> mvapich2 or should I be looking elsewhere?
>
> Thanks for any help,
> -Marcus Epperson
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>


