[mvapich-discuss] Master-slave configuration with MVAPICH2 - idb

Jonathan Perkins perkinjo at cse.ohio-state.edu
Fri Jul 22 22:31:16 EDT 2011


Hello Tony.  Unfortunately, you will not be able to use mvapich2 in
the way that you intend.  If you want to include your login node in
the MPI communication with the compute nodes, you will only be able to
use the gigabit network.  If you do not include the login node, you may
choose to use one of the following: gigabit, ipoib, or native
infiniband, depending on how you configure mvapich2 and which hostfiles
you provide.

In order to use the gigabit network, you will configure using the
--with-device=ch3:sock option and specify the gigabit interface in
your hostfiles.
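
For example, something along these lines should work (the install
prefix is just a placeholder, and f1/f2 are the ethernet hostnames
from your setup):

    # install prefix below is just a placeholder
    ./configure --prefix=/path/to/mvapich2-sock --with-device=ch3:sock
    make && make install

with a machinefile in the same format as your procs file, but using
the eth0 names:

    f1:2 ifhn=f1
    f2:1 ifhn=f2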

In order to use ipoib, you will again configure using the
--with-device=ch3:sock option, but specify the ipoib interface in your
hostfiles.
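
The build is the same ch3:sock build as above; only the machinefile
changes, listing the ib0 names instead (this mirrors your existing
procs file):

    f1:2 ifhn=f1g
    f2:1 ifhn=f2g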

In order to use infiniband, you will configure using
--with-device=ch3:mrail (the default).  In this case you do not need
to specify a particular interface in the hostfile.
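
A sketch of that build (again, the prefix is only a placeholder):

    # install prefix below is just a placeholder
    ./configure --prefix=/path/to/mvapich2-mrail --with-device=ch3:mrail
    make && make install

and a machinefile that just uses the regular hostnames:

    f1:2
    f2:1

Process launch will go over whatever network those names resolve to,
while the MPI traffic itself uses the native infiniband verbs.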

I would also like to mention that the --with-rdma option is only valid
if you use the ch3:mrail device.  Here this option is used to select
between the gen2 and udapl APIs for communicating over infiniband.  The
--with-rdma option is ignored when using the ch3:sock device.
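
To make that concrete, a gen2 build would be configured roughly like
this (the udapl variant just swaps the value of --with-rdma):

    # --with-rdma only has an effect together with ch3:mrail
    ./configure --with-device=ch3:mrail --with-rdma=gen2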

Please let me know if this answers your questions about the issue
you're facing, and whether there is anything more that you'd like to
know.

On Fri, Jul 22, 2011 at 7:00 PM, Tony Ladd <tladd at che.ufl.edu> wrote:
> I have an application with a master-slave setup. The master node is the
> login server and has Gigabit ethernet connections to the slaves. The slaves
> also have an infiniband fabric connecting them to each other but not to the
> master.
>
> I have tested the infiniband interconnect between 2 slaves using the default
> configuration of mvapich_1.6 and the performance is as expected - a
> bandwidth of 5.5GB/s on a SendRecv (it is a 4x QDR connection). The problems
> arise after configuring MVAPICH2 with the gigabit interconnect as well. I made
> a new version of mvapich with gen2 and ch3:sock - the output from mpiname -a
> for this version is:
>
> ---------------------------------------------------------------------
> MVAPICH2 1.6rc1 2010-11-12 ch3:sock
>
> Compilation
> CC: gcc  -DNDEBUG -O2
> CXX: g++  -DNDEBUG -O2
> F77: g77  -DNDEBUG -O2
> F90: gfortran  -DNDEBUG -O2
>
> Configuration
> --prefix=/global/lib/checs/mvapich2-gnu --with-rdma=gen2
> --with-device=ch3:sock
> ----------------------------------------------------------------------
>
> When I run the communication test I get only about 1.8GB/s. I suspect it is
> using the IPoIB protocol. If so, how can I force it to use the rdma protocol?
> The run setup is as follows:
>
> mpdboot --totalnum=2 --file=hosts
> mpiexec -machinefile procs -n 3 prog2
>
> hosts
> f1
> f2
>
> procs
> f1:2 ifhn=f1g
> f2:1 ifhn=f2g
>
> I used mpiexec, since with mpirun_rsh I was not able to get it to use the IB
> connection at all - it always used ethernet even if the hostnames pointed to
> the ib0 interface: for example "mpirun_rsh -np 3 f1g f1g f2g prog2" would
> get about 0.2GB/s.
>
> In this case I am using one of the IB nodes as the master but in general I
> want to use a login node that only has GigE. Here f1g and f2g point to ib0
> while f1 and f2 point to eth0.
>
> Can someone help me out here? Does it look like a run-time problem (i.e. a
> switch to tell it to use ibverbs) or is it in the build of MVAPICH2?
>
>
> Thanks
>
> Tony
>
> --
> Tony Ladd
>
> Chemical Engineering Department
> University of Florida
> Gainesville, Florida 32611-6005
> USA
>
> Email: tladd-"(AT)"-che.ufl.edu
> Web:   http://ladd.che.ufl.edu
>
> Tel:   (352)-392-6509
> FAX:   (352)-392-9514
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>
>



-- 
Jonathan Perkins
http://www.cse.ohio-state.edu/~perkinjo


