[mvapich-discuss] Rank Placement in MVAPICH

vaibhav dutt vaibhavsupersaiyan9 at gmail.com
Mon Apr 11 10:29:46 EDT 2011


Hi,

Thanks for your help. Actually, there was an error in my hostfile, as Dan
pointed out.
After fixing that, the job started working.
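
A quick way to catch this kind of hostfile mistake is to count how often each
hostname appears. The sketch below is not part of MVAPICH; the function names
are hypothetical, and it assumes one hostname per line. The sample reproduces
the hostfile quoted further down, where one entry reads `compute-0-3-local`
instead of `compute-0-3.local`. That typo yields five distinct hostnames
rather than four, which may be related to the "0/5" in the quoted mpispawn
error, though the thread does not confirm this.

```python
from collections import Counter


def ranks_per_host(hostfile_text):
    """Count how many times each hostname appears (one non-empty line = one slot)."""
    hosts = [line.strip() for line in hostfile_text.splitlines() if line.strip()]
    return Counter(hosts)


def suspicious_hosts(counts):
    """Return hostnames whose slot count differs from the most common slot count."""
    if not counts:
        return []
    expected = Counter(counts.values()).most_common(1)[0][0]
    return sorted(host for host, n in counts.items() if n != expected)


# Same layout as the hostfile quoted in this thread, including the odd entry.
sample = "\n".join(
    ["compute-0-0.local"] * 8
    + ["compute-0-1.local"] * 8
    + ["compute-0-2.local"] * 8
    + ["compute-0-3.local", "compute-0-3-local"]
    + ["compute-0-3.local"] * 6
)

counts = ranks_per_host(sample)
print(len(counts), "distinct hosts")
print(suspicious_hosts(counts))
```

Any host that does not carry the usual number of slots is flagged, so both the
misspelled name and the host that ends up one slot short show up in the output.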

Thanks

On Mon, Apr 11, 2011 at 8:16 AM, Hari Subramoni <subramon at cse.ohio-state.edu> wrote:

> Hi Vaibhav,
>
> Which job launcher are you using? I'm assuming that you are using
> mpirun_rsh. If so, your hostfile ordering seems proper to launch the job
> with a block ordering of ranks.
>
> Do you have password-less ssh set up between the node you are launching
> your MPI job from and all the nodes specified in your hostfile?
>
> What application are you trying to run here? Can you try running a simple
> application such as 'cpi', available under the examples sub-directory of
> the MVAPICH2 source files? This will allow us to eliminate application
> errors.
>
> Also, I just saw that you'd mentioned 'MVAPICH' in the subject line. This
> is an older release that lacks many of the latest bug fixes and
> performance enhancements we have made. We recommend that you switch
> to MVAPICH2 to get the best performance and the latest bug fixes. You can
> download the tarball containing the latest release of MVAPICH2 from the
> following site:
>
> http://mvapich.cse.ohio-state.edu/download/mvapich2/
>
> Thx,
> Hari.
>
>
> On Fri, 8 Apr 2011, vaibhav dutt wrote:
>
> > Hi,
> >
> > I tried changing the hostfile in this manner, but I am getting an error
> > like:
> >
> >
> >  child_handler: Error in init phase...wait for cleanup! (0/5mpispawn
> > connections)
> > Failed in initilization phase, cleaned up all the mpispawn!
> >
> >
> > I am trying to run 32 processes across 4 nodes and my hostfile looks like
> >
> >
> > compute-0-0.local
> > compute-0-0.local
> > compute-0-0.local
> > compute-0-0.local
> > compute-0-0.local
> > compute-0-0.local
> > compute-0-0.local
> > compute-0-0.local
> > compute-0-1.local
> > compute-0-1.local
> > compute-0-1.local
> > compute-0-1.local
> > compute-0-1.local
> > compute-0-1.local
> > compute-0-1.local
> > compute-0-1.local
> > compute-0-2.local
> > compute-0-2.local
> > compute-0-2.local
> > compute-0-2.local
> > compute-0-2.local
> > compute-0-2.local
> > compute-0-2.local
> > compute-0-2.local
> > compute-0-3.local
> > compute-0-3-local
> > compute-0-3.local
> > compute-0-3.local
> > compute-0-3.local
> > compute-0-3.local
> > compute-0-3.local
> > compute-0-3.local
> >
> > Thanks
> > On Thu, Apr 7, 2011 at 12:56 PM, Hari Subramoni <subramon at cse.ohio-state.edu> wrote:
> >
> > > Hi Vaibhav,
> > >
> > > To get a block allocation, just repeat n1 8 times, then n2 8 times, in
> > > your hostfile. This will change the allocation to block.
> > >
> > > 1: n1
> > > 2: n1
> > > ..
> > > 8: n1
> > > 1: n2
> > > 2: n2
> > > ..
> > > 8: n2
> > >
> > > Please refer to the MVAPICH2 user guide at the following link for more
> > > information:
> > >
> > > http://mvapich.cse.ohio-state.edu/support/user_guide_mvapich2-1.6.html#x1-240005.2.1
> > >
> > > Thx,
> > > Hari.
> > >
> > >
> > > On Thu, 7 Apr 2011, vaibhav dutt wrote:
> > >
> > > > Hi,
> > > >
> > > > I am trying to execute a job of 16 processes across two nodes and my
> > > > hostfile looks like
> > > >
> > > > n1:8
> > > > n2:8
> > > >
> > > > I notice that ranks 0,2,4,6,8,10,12,14 are on node 1 and
> > > > 1,3,5,7,9,11,13,15 are on node 2, which seems to be a cyclic
> > > > distribution. Is this the default rank placement strategy in MVAPICH?
> > > > I did not give any special runtime option to make the rank
> > > > distribution cyclic. How can I change it to block?
> > > >
> > > > Thanks
> > > >
> > >
> > >
> >
>
>
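
The difference between the block and cyclic layouts discussed in this thread
can be illustrated with a short sketch. It is not MVAPICH code: the helper
names are hypothetical, and it assumes (as the replies above suggest for
mpirun_rsh) that rank i is placed on the host named on line i of the hostfile.
The block layout matches the "repeat n1 8 times then n2 8 times" advice; the
cyclic layout reproduces the even/odd split reported in the original question.

```python
def block_hostfile(spec):
    """Expand [(host, slots), ...] into one hostname per line, grouped by host."""
    return [host for host, slots in spec for _ in range(slots)]


def cyclic_hostfile(spec):
    """Round-robin over the hosts until every host's slots are used up."""
    remaining = dict(spec)
    order = []
    while any(remaining.values()):
        for host, _ in spec:
            if remaining[host] > 0:
                order.append(host)
                remaining[host] -= 1
    return order


def ranks_on(host, hostfile_lines):
    """Ranks that land on `host`, assuming rank i uses hostfile line i."""
    return [rank for rank, name in enumerate(hostfile_lines) if name == host]


spec = [("n1", 8), ("n2", 8)]
block = block_hostfile(spec)
cyclic = cyclic_hostfile(spec)

print("block  n1:", ranks_on("n1", block))
print("cyclic n1:", ranks_on("n1", cyclic))

# Writing the block layout out gives a hostfile in the repeated-name form.
print("\n".join(block))
```

Under the stated assumption, the block list puts ranks 0 through 7 on n1 and
8 through 15 on n2, while the cyclic list alternates between the two nodes.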
