[mvapich-discuss] Rank Placement in MVAPICH

Hari Subramoni subramon at cse.ohio-state.edu
Mon Apr 11 09:16:40 EDT 2011


Hi Vaibhav,

Which job launcher are you using? I'm assuming that you are using
mpirun_rsh. If so, your hostfile ordering seems proper to launch the job
with a block ordering of ranks.
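
As a rough sketch (not part of MVAPICH itself), a block hostfile like yours
can be generated and sanity-checked with a few lines of Python. The helper
names below are hypothetical, and the script assumes that with one hostname
per line, rank i is placed on the host named on line i. Counting entries per
distinct name also makes it easy to spot a hostname that appears an
unexpected number of times.

```python
from collections import Counter


def block_hostfile(hosts, procs_per_node):
    """Return hostfile lines with each host repeated procs_per_node times in a row."""
    return [host for host in hosts for _ in range(procs_per_node)]


def ranks_per_host(hostfile_lines):
    """Count how many entries (ranks) each distinct hostname has, ignoring blank lines."""
    names = [line.strip() for line in hostfile_lines if line.strip()]
    return Counter(names)


# Hypothetical host names following the pattern used in this thread.
hosts = [f"compute-0-{i}.local" for i in range(4)]
lines = block_hostfile(hosts, 8)

# Assumption: rank i lands on the host listed on line i of the hostfile.
for rank in (0, 7, 8, 31):
    print(rank, lines[rank])

# Every host should show up with the same count (8 here).
print(dict(ranks_per_host(lines)))
```

Writing "\n".join(lines) to a file gives a hostfile in the one-name-per-line
form discussed here.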

Do you have password-less ssh set up between the node you are launching
your MPI job from and all the nodes specified in your hostfile?
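
One way to check this, sketched in Python under the assumption that the
standard OpenSSH client is installed on the launch node: BatchMode=yes makes
ssh fail instead of prompting for a password, so a non-zero exit status
points at a host that is not set up for password-less login (or is otherwise
unreachable). The function names are hypothetical.

```python
import subprocess


def ssh_check_command(host, timeout_s=5):
    """Build an ssh command that runs 'true' on host without ever prompting."""
    return [
        "ssh",
        "-o", "BatchMode=yes",                # fail rather than ask for a password
        "-o", f"ConnectTimeout={timeout_s}",  # do not hang on dead hosts
        host,
        "true",
    ]


def unreachable_hosts(hosts, run=subprocess.run):
    """Return the distinct hosts for which the non-interactive ssh check fails."""
    failed = []
    for host in sorted(set(hosts)):
        result = run(ssh_check_command(host), capture_output=True)
        if result.returncode != 0:
            failed.append(host)
    return failed
```

Calling unreachable_hosts() with the names read from your hostfile should
return an empty list if every node accepts key-based logins.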

What application are you trying to run here? Can you try running a simple
application such as 'cpi', available under the examples sub-directory of
the MVAPICH2 source files? This will allow us to rule out application
errors.

Also, I just saw that you'd mentioned 'MVAPICH' in the subject line. This
is an older release which lacks many of the latest bug fixes and
performance enhancements we have been making. We recommend that you switch
to MVAPICH2 to get the best performance and the latest bug fixes. You can
download the tarball containing the latest release of MVAPICH2 from the
following site:

http://mvapich.cse.ohio-state.edu/download/mvapich2/

Thx,
Hari.


On Fri, 8 Apr 2011, vaibhav dutt wrote:

> Hi,
>
> I tried to change the hostfile in this manner but I am getting an error like
>
>
>  child_handler: Error in init phase...wait for cleanup! (0/5mpispawn connections)
> Failed in initilization phase, cleaned up all the mpispawn!
>
>
> I am trying to run 32 processes across 4 nodes and my hostfile looks like
>
>
> compute-0-0.local
> compute-0-0.local
> compute-0-0.local
> compute-0-0.local
> compute-0-0.local
> compute-0-0.local
> compute-0-0.local
> compute-0-0.local
> compute-0-1.local
> compute-0-1.local
> compute-0-1.local
> compute-0-1.local
> compute-0-1.local
> compute-0-1.local
> compute-0-1.local
> compute-0-1.local
> compute-0-2.local
> compute-0-2.local
> compute-0-2.local
> compute-0-2.local
> compute-0-2.local
> compute-0-2.local
> compute-0-2.local
> compute-0-2.local
> compute-0-3.local
> compute-0-3-local
> compute-0-3.local
> compute-0-3.local
> compute-0-3.local
> compute-0-3.local
> compute-0-3.local
> compute-0-3.local
>
> Thanks
> On Thu, Apr 7, 2011 at 12:56 PM, Hari Subramoni <subramon at cse.ohio-state.edu> wrote:
>
> > Hi Vaibhav,
> >
> > To get a block allocation just repeat n1 8 times then n2 8 times in your
> > hostfile. This will change the allocation to block.
> >
> > 1: n1
> > 2: n1
> > ..
> > 8: n1
> > 1: n2
> > 2: n2
> > ..
> > 8: n2
> >
> > Please refer to the MVAPICH2 userguide at the following link for more
> > information
> >
> >
> > http://mvapich.cse.ohio-state.edu/support/user_guide_mvapich2-1.6.html#x1-240005.2.1
> >
> > Thx,
> > Hari.
> >
> >
> > On Thu, 7 Apr 2011, vaibhav dutt wrote:
> >
> > > Hi,
> > >
> > > I am trying to execute a job  of 16 processes across two nodes and my
> > > hostfile looks like
> > >
> > > n1:8
> > > n2:8.
> > >
> > > I notice that the ranks 0,2,4,6,8,10,12,14 are on node 1 and
> > > 1,3,5,7,9,11,13,15 are on node 2 which seems to be
> > > cyclic distribution. Is this the default rank placement strategy in
> > > MVAPICH as I did not give any special option on runtime
> > > for the rank distribution to be cyclic. How to change it to block?
> > >
> > > Thanks
> > >
> >
> >
>


