[mvapich-discuss] mvapich2 problems on node with active mlx4_0 and nes0 hcas

Michael Wang mwang at fnal.gov
Thu Sep 5 15:42:34 EDT 2013


Hi,

We are using mvapich2-1.7 and are having issues with one node on our 
cluster that has both a Mellanox MT27500 IB adapter and a NetEffect 
NE020 10Gb Ethernet adapter.  The problem goes away when the iw_nes 
driver is disabled.  Here is the ibv_devinfo output for this node:


hca_id:	nes0
	transport:			iWARP (1)
	fw_ver:				3.21
	node_guid:			0012:5503:5cf0:0000
	sys_image_guid:			0012:5503:5cf0:0000
	vendor_id:			0x1255
	vendor_part_id:			256
	hw_ver:				0x5
	board_id:			NES020 Board ID
	phys_port_cnt:			1
		port:	1
			state:			PORT_ACTIVE (4)
			max_mtu:		4096 (5)
			active_mtu:		1024 (3)
			sm_lid:			0
			port_lid:		1
			port_lmc:		0x00
			link_layer:		Ethernet

hca_id:	mlx4_0
	transport:			InfiniBand (0)
	fw_ver:				2.10.700
	node_guid:			0002:c903:00fd:ace0
	sys_image_guid:			0002:c903:00fd:ace3
	vendor_id:			0x02c9
	vendor_part_id:			4099
	hw_ver:				0x0
	board_id:			MT_1060110018
	phys_port_cnt:			1
		port:	1
			state:			PORT_ACTIVE (4)
			max_mtu:		2048 (4)
			active_mtu:		2048 (4)
			sm_lid:			1
			port_lid:		5
			port_lmc:		0x00
			link_layer:		IB
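
As a quick cross-check of the listing above, the transport and link 
layer of each HCA can be pulled out with a short filter.  This is only a 
sketch: "summarize_hcas" is a hypothetical helper name, and it assumes 
the ibv_devinfo field labels shown above (hca_id, transport, link_layer).

```shell
# Condense ibv_devinfo output to "hca_id transport link_layer".
# Reads the ibv_devinfo text from stdin, so it also works on a saved copy.
# One line is printed per port, because link_layer is a per-port field.
summarize_hcas() {
    awk '
        $1 == "hca_id:"     { hca = $2 }        # remember the current device
        $1 == "transport:"  { transport = $2 }  # e.g. iWARP or InfiniBand
        $1 == "link_layer:" { print hca, transport, $2 }
    '
}

# Typical use on the node in question:
#   ibv_devinfo | summarize_hcas
```

On a node with both adapters active, this gives one line per device, 
which makes it easy to see at a glance that an iWARP device and an 
InfiniBand device are both present.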


To demonstrate the problem, I use the utility program "osu_bw" to run a 
simple test between two nodes on the IB cluster:


   $ mpiexec -launcher rsh -hosts dseb2,dsag -n 2 \
     /usr/mpi/gcc/mvapich2-1.7/tests/osu_benchmarks-3.1.1/osu_bw


which results in the following error:


[ring_startup.c:184]: PMI_KVS_Get error

[1] Abort: PMI Lookup name failed
  at line 951 in file 
/var/tmp/OFED_topdir/BUILD/mvapich2-1.7-r5140/src/mpid/ch3/channels/common/src/rdma_cm/rdma_cm.c


The node with both the IB and 10GbE adapters is "dsag".  If I replace 
this node in the command above with another node that has only the 
Mellanox HCA and no NetEffect 10GbE adapter, everything runs fine and 
the bandwidth results are printed.

I am not an expert, but when I re-run the above command with "-v" for 
verbose output, I see the following PMI-related messages, which may 
help the experts on this list troubleshoot the problem:


[proxy:0:1 at dsag] PMI response: cmd=get_result rc=0 msg=success 
value=(vector,(0,2,1))
[proxy:0:1 at dsag] [mpiexec at dsfr1] [pgid: 0] got PMI command: cmd=put 
kvsname=kvs_19955_0 key=HOST-1 value=-32873218
	.
	.
	.
	.
[proxy:0:0 at dseb2] got pmi command (from 4): get
kvsname=kvs_19955_0 key=MVAPICH2_0001
[proxy:0:1 at dsag] [mpiexec at dsfr1] [pgid: 0] got PMI command: cmd=get 
kvsname=kvs_19955_0 key=MVAPICH2_0001
[mpiexec at dsfr1] PMI response to fd 12 pid 4: cmd=get_result rc=-1 
msg=key_MVAPICH2_0001_not_found value=unknown


This is in contrast to a successful run where the corresponding lines 
would look like:


[proxy:0:1 at dseb3] PMI response: cmd=get_result rc=0 msg=success 
value=(vector,(0,2,1))
[mpiexec at dsfr1] [pgid: 0] got PMI command: cmd=put kvsname=kvs_19934_0 
key=MVAPICH2_0001 value=00000008:0048004a:0048004b:
	.
	.
	.
	.
[proxy:0:0 at dseb2] got pmi command (from 4): get
kvsname=kvs_19934_0 key=MVAPICH2_0001
[mpiexec at dsfr1] [pgid: 0] got PMI command: cmd=get kvsname=kvs_19934_0 
key=MVAPICH2_0001
[mpiexec at dsfr1] PMI response to fd 12 pid 4: cmd=get_result rc=0 
msg=success value=00000008:0048004a:0048004b:
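
In the lines quoted, the two runs differ in which keys are published: 
the successful run shows a put of MVAPICH2_0001, while the failing 
excerpt shows a put of HOST-1 from dsag before the get of MVAPICH2_0001 
comes back as not found.  A small filter over a saved verbose log makes 
that comparison easier.  This is a minimal sketch, assuming the log is 
saved unwrapped (one PMI message per line); "pmi_put_keys" is a 
hypothetical helper name.

```shell
# Print the key of every PMI "put" found in an "mpiexec -v" log read from stdin.
# Assumes each PMI message sits on one line (not re-wrapped by a mail client).
pmi_put_keys() {
    grep 'cmd=put' | sed -n 's/.*[[:space:]]key=\([^[:space:]]*\).*/\1/p'
}

# Typical use, counting how often each key was published:
#   mpiexec -v -launcher rsh -hosts dseb2,dsag -n 2 ./osu_bw 2>&1 | pmi_put_keys | sort | uniq -c
```

If a key that is later requested with cmd=get never appears in this 
list, the rank that should have published it did not get that far.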

I have tried passing environment variables like MV2_IBA_HCA=mlx4_0 to 
mpirun_rsh or mpiexec, and even using a hostfile with node:rank:hca 
lines to force usage of the IB HCA, but to no avail.
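
For reference, the two launchers take the variable in different places.  
The sketch below only builds and prints the command lines so they can be 
checked before use; it is not a fix, since settings of this kind did not 
help here.  The host names and benchmark path are the ones used earlier 
in this message, and the variable names hydra_cmd and rsh_cmd are made 
up for the illustration.

```shell
# Path of the benchmark used earlier in this message.
OSU_BW=/usr/mpi/gcc/mvapich2-1.7/tests/osu_benchmarks-3.1.1/osu_bw

# Hydra mpiexec: "-genv NAME VALUE" sets the variable for every rank.
hydra_cmd="mpiexec -launcher rsh -genv MV2_IBA_HCA mlx4_0 -hosts dseb2,dsag -n 2 $OSU_BW"

# mpirun_rsh: NAME=VALUE pairs go after the host list, before the executable.
rsh_cmd="mpirun_rsh -rsh -np 2 dseb2 dsag MV2_IBA_HCA=mlx4_0 $OSU_BW"

# Print the commands instead of running them.
printf '%s\n' "$hydra_cmd" "$rsh_cmd"
```

Running either printed line with "-v" added (mpiexec) shows whether the 
variable actually reaches the rank on dsag.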

I would greatly appreciate any help or insight I can get on this from 
the experts on this list.

Thanks in advance,

Mike Wang

