[mvapich-discuss] mvapich & mpiexec

Jimmy Tang jtang at tchpc.tcd.ie
Mon May 28 12:50:29 EDT 2007


Hi All,

> koop at cse.ohio-state.edu wrote on Wed, 09 May 2007 14:56 -0400:
> > > Does mvapich work with mpiexec? I use MVAPICH 0.9.9-beta and mpiexec 0.82.
> > > When I try to run tasks I see messages:
> > >      mpiexec: Warning: read_ib_one: protocol version 5 not known, but
> > > might still work. But nothing happens. Is the latest mvapich version not
> > > supported by mpiexec? Or are there other methods to run MPI tasks with a
> > > batch system? I use the latest torque batch system. Batch scripts with
> > > mpirun work, but that requires installing the mvapich distribution on
> > > all nodes.
> > 
> > Currently there is no way to enable the old startup protocol in 0.9.9.
> > 
> > mpiexec will need to be updated to accommodate the new startup protocol
> > that is used in 0.9.9.
> 
> Matt,
> 
> With Jan's encouragement and testing, we may have working support
> for mvapich 0.9.9 in mpiexec.  It is a more complex change than we
> had imagined.
> 
> Like in previous mvapich versions, each task in the parallel job
> contacts the job launcher and gives it some information; then after
> they have all connected, the job launcher sends back the entire set
> of information to all processes.  That is typical for MPI job
> startup.  What's new in this version is an entire second phase of
> connections from each task for more information, with a global
> scatter.
> 
> Before I check it in, can you provide some commentary on the reason
> for the two sets of accept(), read(), write() for every task?  On
> the surface this would seem like a major barrier to scalability.
> There must be a good reason to get the hostids out first, then have
> the tasks come back for the addresses.
> 
> The comment so far is:
> 
> * Version 5:
> *   Added another phase, with socket close/reaccept between the two.  No
> *   clue why this is necessary.  First phase distributes hostids:
> *   ...
> 
> Egor, 
> 
> The current working patch is below.  It applies against SVN, but
> should probably work okay against 0.82 too.  If you would be willing
> to verify that this works on your system, I can check it in with
> confidence and put a notice on the web page for others.
> 

I've just applied the patch to the 0.82 release of mpiexec and it happily 
works with the mvapich 0.9.9 release.
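
For anyone else following along, the two-phase startup described above looks
roughly like the sketch below from the launcher's side. This is only an
illustration with assumed names, message sizes and a fixed task count -- not
the actual mpiexec or mvapich code -- and for simplicity both phases just
gather one item per task and send the full table back to everyone, whereas
the real second phase apparently does the per-task scatter mentioned above.

/*
 * Rough sketch of the two-phase startup: phase 1 collects and redistributes
 * the hostids, the sockets are closed, then every task reconnects and the
 * same gather/redistribute happens again for the addresses.  Error handling
 * and partial-read loops are omitted for brevity.
 */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Accept one connection per task, read one item from each, and only after
 * everyone has checked in write the whole table back to every task. */
static void gather_and_broadcast(int listenfd, int ntasks, size_t itemlen)
{
    int *fds = malloc(ntasks * sizeof(*fds));
    char *table = malloc(ntasks * itemlen);
    int i, rank;

    for (i = 0; i < ntasks; i++) {
        int fd = accept(listenfd, NULL, NULL);
        read(fd, &rank, sizeof(rank));              /* task says which rank it is */
        read(fd, table + rank * itemlen, itemlen);  /* then sends its item */
        fds[rank] = fd;
    }
    for (i = 0; i < ntasks; i++) {
        write(fds[i], table, ntasks * itemlen);     /* full table goes back out */
        close(fds[i]);                              /* sockets close between phases */
    }
    free(table);
    free(fds);
}

int main(void)
{
    int ntasks = 4;                                 /* would come from the batch system */
    int listenfd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = { .sin_family = AF_INET };
    socklen_t len = sizeof(addr);

    bind(listenfd, (struct sockaddr *) &addr, sizeof(addr));
    listen(listenfd, ntasks);
    getsockname(listenfd, (struct sockaddr *) &addr, &len);
    printf("listening on port %d for %d tasks\n", ntohs(addr.sin_port), ntasks);

    gather_and_broadcast(listenfd, ntasks, sizeof(int));  /* phase 1: hostids */
    gather_and_broadcast(listenfd, ntasks, 64);           /* phase 2: addresses */

    close(listenfd);
    return 0;
}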


Jimmy.

-- 
Jimmy Tang
Trinity Centre for High Performance Computing,
Lloyd Building, Trinity College Dublin, Dublin 2, Ireland.
http://www.tchpc.tcd.ie/ | http://www.tchpc.tcd.ie/~jtang



