[mvapich-discuss] dynamic process connections (accept/connect or MPI_Comm_join) and Infiniband...

leon zadorin leonleon77 at gmail.com
Tue Apr 15 20:06:19 EDT 2008


Hello everyone,

It occurred to me that my original post of these questions (sent to
mvapich-commit) would be better emailed to this list instead... so
here it is...

I am relatively new to the whole MPI/Infiniband scene, so my apologies
if some of my questions/thoughts are naive...

I am currently experiencing difficulties with dynamic process
connections (MPI_Comm_join) between 2 hosts (each with Infiniband and
Ethernet card).

The setup is:
2 hosts, each with a Gigabit Ethernet card and a PCI-e Infiniband card,
running 32-bit Linux (AMD arch).
Hosts are connected via an Infiniband switch (w.r.t. the Infiniband cards) and
via an Ethernet/IP network (w.r.t. the Ethernet cards).
mvapich2 has been built with "make.mvapich2.ofa"
mpdboot has been executed and mpd daemons are running on both hosts

I would like to know if it is currently possible to achieve the following:

1) start 1 app on 1 host (without using mpirun);
2) then later, after some time, start another app on 2nd host (without
using mpirun);
3) make the app in step 2 automatically connect to the app started in step 1

I was able to achieve the above when running with the mpich2 library,
using sock channels and only when using the 'MPI_Comm_join' call
(using MPI_Publish_name, etc. did not work when starting apps without
mpirun, even with all mpds being active).
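
For concreteness, the pattern that worked for me looks roughly like the
sketch below (heavily simplified, no error checking; the program name
"join_test" and port 5555 are placeholders of mine, not anything mandated
by the libraries). Each side first sets up an ordinary TCP connection over
Ethernet, then hands its end of the socket to MPI_Comm_join:

#include <mpi.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <netdb.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#define JOIN_PORT 5555   /* arbitrary TCP port, used only for bootstrapping */

int main(int argc, char *argv[])
{
    int fd, rank;
    MPI_Comm intercomm;

    MPI_Init(&argc, &argv);   /* "singleton" init - no mpirun involved */

    if (argc > 1 && strcmp(argv[1], "server") == 0) {
        /* app started first (step 1): wait for a plain TCP connection */
        int listener = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(JOIN_PORT);
        bind(listener, (struct sockaddr *)&addr, sizeof(addr));
        listen(listener, 1);
        fd = accept(listener, NULL, NULL);
    } else {
        /* app started later (step 2): connect to the first host over TCP */
        struct hostent *he = gethostbyname(argv[1]);
        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        memcpy(&addr.sin_addr, he->h_addr_list[0], he->h_length);
        addr.sin_port = htons(JOIN_PORT);
        fd = socket(AF_INET, SOCK_STREAM, 0);
        connect(fd, (struct sockaddr *)&addr, sizeof(addr));
    }

    /* step 3: MPI uses the socket only to bootstrap the intercommunicator */
    MPI_Comm_join(fd, &intercomm);
    close(fd);

    MPI_Comm_rank(intercomm, &rank);
    printf("joined; local rank in the intercommunicator is %d\n", rank);

    MPI_Comm_disconnect(&intercomm);
    MPI_Finalize();
    return 0;
}

i.e. host 1 runs "./join_test server" at some point, and host 2 runs
"./join_test <host1-hostname>" at some later time, with neither invocation
going through mpirun.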

However, the MPI_Comm_join tactic fails when attempting to use
mvapich2 (mvapich2-1.0-2008-04-10) over Infiniband... I wonder if the
following has something to do with it:
http://lists.openfabrics.org/pipermail/commits/2006-January/004707.html
"
--------------------------------------------------------------------------------
-                              Known Deficiencies
--------------------------------------------------------------------------------
-
-- The sock channel is the only channel that implements dynamic process support
-  (i.e., MPI_COMM_SPAWN, MPI_COMM_CONNECT, MPI_COMM_ACCEPT, etc.).  All other
-  channels will experience failures for tests exercising dynamic process
-  functionality.
"
and in http://lists.openfabrics.org/pipermail/commits/2006-May/007209.html
we have:
"
-- MPI_COMM_JOIN has been implemented; although like the other dynamic process
-  routines, it is only supported by the Sock channel.
"

Given that the above quotes mention both MPI_Comm_join and
MPI_Comm_connect... is there any way at all to currently achieve the
above 3 steps when using Infiniband cards (and perhaps having Ethernet
cards in all of the hosts as well)?

I would imagine that, at least in theory, it is plausible to use the
sock channel to 'bootstrap' the Infiniband channel?
http://www.mpi-forum.org/docs/mpi-20-html/node115.htm
"
MPI uses the socket to bootstrap creation of the intercommunicator,
and for nothing else.
"

Perhaps I need to build mvapich2 not via "make.mvapich2.ofa" but via
something else, so that both the socket and Infiniband channels are
supported?

Of course the same aforementioned link
(http://www.mpi-forum.org/docs/mpi-20-html/node115.htm)
says:
"
 Advice to users. An MPI implementation may require a specific
communication medium for MPI communication, such as a shared memory
segment or a special switch. In this case, it may not be possible for
two processes to successfully join even if there is a socket
connecting them and they are using the same MPI implementation. ( End
of advice to users.)
"

If this is the case here, and there is no way to use MPI_Comm_join to
achieve the originally described 3 steps (connecting apps started at
different times, without the use of mpirun) - is the goal then achievable
at all by other means (e.g. using MPI's open port, publish name, lookup
name, accept/connect calls)? Are the limitations purely theoretical or more
of a practical nature?
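
For reference, the name-publishing variant that I could not get to work
without mpirun looks roughly like this (again a simplified sketch with no
error checking; the service name "join_test_service" is just a placeholder
of mine):

#include <mpi.h>
#include <string.h>

int main(int argc, char *argv[])
{
    char port[MPI_MAX_PORT_NAME];
    MPI_Comm intercomm;

    MPI_Init(&argc, &argv);

    if (argc > 1 && strcmp(argv[1], "server") == 0) {
        /* started first, without mpirun */
        MPI_Open_port(MPI_INFO_NULL, port);
        MPI_Publish_name("join_test_service", MPI_INFO_NULL, port);
        MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &intercomm);
        MPI_Unpublish_name("join_test_service", MPI_INFO_NULL, port);
        MPI_Close_port(port);
    } else {
        /* started later, on the second host, also without mpirun */
        MPI_Lookup_name("join_test_service", MPI_INFO_NULL, port);
        MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &intercomm);
    }

    MPI_Comm_disconnect(&intercomm);
    MPI_Finalize();
    return 0;
}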

Ideally, for asynchronous server design purposes, and given that
MPI_Comm_accept is blocking and there is no 'test'/'poll' for it, it
would be good to be able to use the sockets channel to coordinate
Infiniband channel bootstrapping with MPI_Comm_join (even if
MPI_Comm_join itself is blocking, one can at least 'poll' the
TCP socket's fd before calling 'accept' and subsequently
MPI_Comm_join)...
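
To make that polling idea concrete, what I have in mind is roughly the
following (a sketch only; it assumes a bound, listening TCP socket
'listener' set up as in the first sketch above, and the 100 ms timeout is
arbitrary):

#include <mpi.h>
#include <poll.h>
#include <unistd.h>
#include <sys/socket.h>

int wait_for_join_request(int listener, MPI_Comm *intercomm)
{
    struct pollfd pfd;
    pfd.fd = listener;
    pfd.events = POLLIN;

    /* poll with a finite timeout so the server can keep doing other
       asynchronous work; accept()/MPI_Comm_join are only entered once
       a peer has actually initiated a TCP connection */
    while (poll(&pfd, 1, 100 /* ms */) == 0) {
        /* ... service other asynchronous work here ... */
    }

    int fd = accept(listener, NULL, NULL);
    MPI_Comm_join(fd, intercomm);   /* still blocking, but a peer is present */
    close(fd);
    return 0;
}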

If mvapich2 is unable to provide dynamic process connectivity over
Infiniband... are there any other libs that could do that?

Kind regards
Leon.

