[mvapich-discuss] Master-slave configuration with MVAPICH2 - idb

Tony Ladd tladd at che.ufl.edu
Fri Jul 22 19:00:03 EDT 2011


I have an application with a master-slave setup. The master node is the
login server and has Gigabit Ethernet connections to the slaves. The
slaves also have an InfiniBand fabric connecting them to each other but
not to the master.

I have tested the InfiniBand interconnect between 2 slaves using the
default configuration of mvapich_1.6 and the performance is as expected:
a bandwidth of 5.5GB/s on a SendRecv (it is a 4x QDR connection). The
problems appear after configuring MVAPICH2 to use the Gigabit
interconnect as well. I built a new version of MVAPICH2 with gen2 and
ch3:sock; the output from mpiname -a for this version is:

---------------------------------------------------------------------
MVAPICH2 1.6rc1 2010-11-12 ch3:sock

Compilation
CC: gcc  -DNDEBUG -O2
CXX: g++  -DNDEBUG -O2
F77: g77  -DNDEBUG -O2
F90: gfortran  -DNDEBUG -O2

Configuration
--prefix=/global/lib/checs/mvapich2-gnu --with-rdma=gen2 
--with-device=ch3:sock
----------------------------------------------------------------------

When I run the communication test I get only about 1.8GB/s. I suspect it
is using the IPoIB protocol. If so, how can I force it to use the RDMA
protocol? The run setup is as follows:

mpdboot --totalnum=2 --file=hosts
mpiexec -machinefile procs -n 3 prog2

hosts
f1
f2

procs
f1:2 ifhn=f1g
f2:1 ifhn=f2g

I used mpiexec, since with mpirun_rsh I was not able to get it to use 
the IB connection at all - it always used ethernet even if the hostnames 
pointed to the ib0 interface: for example "mpirun_rsh -np 3 f1g f1g f2g 
prog2" would get about 0.2GB/s.
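
As a rough sanity check on the three figures quoted so far (5.5, 1.8 and
0.2GB/s), the sketch below compares them with nominal link rates. It is
only an illustration in Python, not part of the test program: it assumes
the SendRecv figures count traffic in both directions, that 4x QDR
signals at 10 Gbit/s per lane with 8b/10b encoding, and it ignores all
protocol overhead. The function names are made up for the example.

```python
# Nominal peak rates, ignoring protocol overhead (assumption).

def qdr_4x_gbytes_per_s():
    """One-way payload rate of a 4x QDR InfiniBand link, in GB/s."""
    lanes = 4                # "4x" link width
    gbit_per_lane = 10.0     # QDR signalling rate per lane
    # 8b/10b coding: 8 payload bits per 10 bits on the wire.
    payload_gbit = lanes * gbit_per_lane * 8 / 10
    return payload_gbit / 8  # bits -> bytes


def gige_gbytes_per_s():
    """One-way rate of Gigabit Ethernet, in GB/s."""
    return 1.0 / 8


def fraction_of_peak(measured, one_way_peak, directions=2):
    """Measured rate as a fraction of peak; directions=2 assumes the
    measurement counts both send and receive traffic."""
    return measured / (one_way_peak * directions)


if __name__ == "__main__":
    ib = qdr_4x_gbytes_per_s()
    eth = gige_gbytes_per_s()
    print(f"4x QDR peak: {ib:.3f} GB/s each way")
    print(f"GigE peak:   {eth:.3f} GB/s each way")
    # Figures quoted in this message, in GB/s.
    for label, rate, peak in [
        ("default build over IB", 5.5, ib),
        ("ch3:sock via mpiexec", 1.8, ib),
        ("ch3:sock via mpirun_rsh", 0.2, eth),
    ]:
        print(f"{label}: {fraction_of_peak(rate, peak):.0%} of two-way peak")
```

On these assumptions the 5.5GB/s figure is above the one-way 4x QDR
rate, which is why a two-way count is assumed, and 0.2GB/s is within
what GigE alone could deliver.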

In this case I am using one of the IB nodes as the master but in general 
I want to use a login node that only has GigE. Here f1g and f2g point to 
ib0 while f1 and f2 point to eth0.
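
One way to confirm what each of these names resolves to before
launching a job is a small lookup script. The following is a minimal
Python sketch and not part of the original setup; the resolve helper is
a made-up name. It only reports what the resolver returns, which can
then be compared by hand with the addresses assigned to ib0 and eth0. It
does not show which interface the MPI traffic actually uses.

```python
import socket
import sys


def resolve(name):
    """Return the IPv4 address a hostname resolves to, or None."""
    try:
        return socket.gethostbyname(name)
    except OSError:
        # socket.gaierror (lookup failure) is a subclass of OSError.
        return None


if __name__ == "__main__":
    # Hostnames are given on the command line, one or more.
    for name in sys.argv[1:]:
        print(f"{name} -> {resolve(name)}")
```

It could be run on a node as, for example,
"python resolve.py f1 f2 f1g f2g", where resolve.py is whatever file
name the script is saved under.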

Can someone help me out here? Does it look like a run-time problem (i.e.
a switch to tell it to use ibverbs), or is it in the build of MVAPICH2?


Thanks

Tony

-- 
Tony Ladd

Chemical Engineering Department
University of Florida
Gainesville, Florida 32611-6005
USA

Email: tladd-"(AT)"-che.ufl.edu
Web:   http://ladd.che.ufl.edu

Tel:   (352)-392-6509
FAX:   (352)-392-9514


