[mvapich-discuss] Master-slave configuration with MVAPICH2 - idb
Tony Ladd
tladd at che.ufl.edu
Fri Jul 22 23:00:46 EDT 2011
Hi Jonathan
Thanks for the info. However, I think I am using gigabit and IPoIB
together - otherwise I could not get the 1.8GB/s bandwidth. It may be
that the bandwidth of IPoIB is sufficient for my application - I will
need to think about that.
It seems to me that what I am trying to do is not so unreasonable. The
master holds a fair bit of data (on the order of 10-20GB in my case)
which it occasionally distributes to the slaves (which have 12GB of RAM
each). The slaves talk to each other much more frequently, and now and
then return results to the master. It is possible to program this
entirely in data-parallel mode, but that is a less attractive option.
Is there a technical reason why mvapich2 cannot support multiple
interface types? Is it that only one communication device (like
ch3:mrail or ch3:sock) can be supported within a single MPI job?
But thanks for the reply - it was very informative.
Tony
On 07/22/2011 10:31 PM, Jonathan Perkins wrote:
> Hello Tony. Unfortunately, you will not be able to use mvapich2 in
> the way that you intend. If you want to include your login node in
> the mpi communication with the compute nodes you will only be able to
> use the gigabit network. If you do not include the login node you may
> choose to use one of the following: gigabit, ipoib, or native
> infiniband depending on how you configure mvapich2 and which hostfiles
> you provide.
>
> In order to use the gigabit network, you will configure using the
> --with-device=ch3:sock option and specify the gigabit interface in
> your hostfiles.
>
> In order to use ipoib, you will again configure using the
> --with-device=ch3:sock but specify the ipoib interface in your
> hostfiles.
>
> In order to use infiniband, you will configure using
> --with-device=ch3:mrail (the default). In this case you do not need
> to specify a particular interface in the hostfile.
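> As a rough sketch, the three build variants above would look like the
> following (the install prefixes here are placeholders, not MVAPICH2
> defaults - adjust them for your site):

```shell
# Sockets channel: used for both gigabit and IPoIB; which network is
# actually used depends on which interface the hostnames in your
# hostfile resolve to.
./configure --prefix=/opt/mvapich2-sock --with-device=ch3:sock
make && make install

# Native InfiniBand: the default ch3:mrail device; --with-rdma=gen2
# selects the gen2 (verbs) API rather than uDAPL.
./configure --prefix=/opt/mvapich2-mrail --with-device=ch3:mrail --with-rdma=gen2
make && make install
```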
>
> I would also like to mention that the --with-rdma option is only valid
> if you use the ch3:mrail device. There, this option selects between
> the gen2 and uDAPL APIs for communicating over InfiniBand. The
> --with-rdma option is ignored when using the ch3:sock device.
>
> Please let me know whether this answers your questions about the issue
> you're facing, and whether there is anything more that you'd like to
> know.
>
> On Fri, Jul 22, 2011 at 7:00 PM, Tony Ladd<tladd at che.ufl.edu> wrote:
>
>> I have an application with a master-slave setup. The master node is the
>> login server and has Gigabit ethernet connections to the slaves. The slaves
>> also have an infiniband fabric connecting them to each other but not to the
>> master.
>>
>> I have tested the infiniband interconnect between 2 slaves using the default
>> configuration of mvapich_1.6 and the performance is as expected - a
>> bandwidth of 5.5GB/s on a SendRecv (it is a 4x QDR connection). The problems
>> arise after configuring MVAPICH2 with the gigabit interconnect as well. I made
>> a new version of mvapich with gen2 and ch3:sock - the output from mpiname -a
>> for this version is:
>>
>> ---------------------------------------------------------------------
>> MVAPICH2 1.6rc1 2010-11-12 ch3:sock
>>
>> Compilation
>> CC: gcc -DNDEBUG -O2
>> CXX: g++ -DNDEBUG -O2
>> F77: g77 -DNDEBUG -O2
>> F90: gfortran -DNDEBUG -O2
>>
>> Configuration
>> --prefix=/global/lib/checs/mvapich2-gnu --with-rdma=gen2
>> --with-device=ch3:sock
>> ----------------------------------------------------------------------
>>
>> When I run the communication test I get only about 1.8GB/s. I suspect it is
>> using the IPoIB protocol. If so, how can I force it to use the RDMA protocol?
>> The run setup is as follows:
>>
>> mpdboot --totalnum=2 --file=hosts
>> mpiexec -machinefile procs -n 3 prog2
>>
>> hosts:
>> f1
>> f2
>>
>> procs:
>> f1:2 ifhn=f1g
>> f2:1 ifhn=f2g
>>
>> I used mpiexec, since with mpirun_rsh I was not able to get it to use the IB
>> connection at all - it always used ethernet even if the hostnames pointed to
>> the ib0 interface: for example "mpirun_rsh -np 3 f1g f1g f2g prog2" would
>> get about 0.2GB/s.
>>
>> In this case I am using one of the IB nodes as the master but in general I
>> want to use a login node that only has GigE. Here f1g and f2g point to ib0
>> while f1 and f2 point to eth0.
>>
>> Can someone help me out here? Does it look like a run-time problem (i.e. a
>> switch to tell it to use ibverbs) or is it in the build of MVAPICH2?
>>
>>
>> Thanks
>>
>> Tony
>>
>> --
>> Tony Ladd
>>
>> Chemical Engineering Department
>> University of Florida
>> Gainesville, Florida 32611-6005
>> USA
>>
>> Email: tladd-"(AT)"-che.ufl.edu
>> Web: http://ladd.che.ufl.edu
>>
>> Tel: (352)-392-6509
>> FAX: (352)-392-9514
>>
>> _______________________________________________
>> mvapich-discuss mailing list
>> mvapich-discuss at cse.ohio-state.edu
>> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss