[mvapich-discuss] Re: problem run mvapich1.0 on AMD quad core cluster
LEI CHAI
chai.15 at osu.edu
Thu May 29 13:10:50 EDT 2008
For everybody's information, the problem was solved off-line. It was due to IP setup issues on the machines. I'm attaching the latest correspondence below:
On Thu, 29 May 2008, Lei Chai wrote:
> Glad to know mvapich works for you with this workaround. I cannot think of
> any place where this change would break anything. I think it should work fine.
>
> Lei
>
>
> On Thu, 29 May 2008 Terrence.LIAO at total.com wrote:
>
> > Hi, Lei,
> >
> > I have one final question on get_host_id(). Our problem comes from the
> > fact that the hostname on our compute nodes, "nod200" for example, has a
> > local IP according to the /etc/hosts file. However, the IB interface is
> > named "nib200"; since "nib200" is not listed in /etc/hosts, it gets its
> > IP from NIS. I have modified get_host_id() by setting myhostname[1]='i';
> > myhostname[2]='b' to convert myhostname from nodXXX to nibXXX, since
> > gethostname() returns nodXXX. With this small change, the MPI code runs.
> > My question is: could this modification break mvapich somewhere else?
> >
> > Thank you very much.
> >
> >
> > P.S. We avoid relying on /etc/hosts so that we do not have to edit it
> > manually, which could introduce human error.
> >
> > -- Terrence
> > --------------------------------------------------------
> > Terrence Liao, Ph.D.
> > Research Computer Scientist
> > TOTAL E&P RESEARCH & TECHNOLOGY USA, LLC
> > 1201 Louisiana, Suite 1800, Houston, TX 77002
> > Tel: 713.647.3498 Fax: 713.647.3638
> > Email: terrence.liao at total.com
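The hostname rewrite described in the message above can be sketched in C as follows. This is a minimal illustration, not the actual MVAPICH source: the helper name to_ib_hostname() and the check for the "nod" prefix are assumptions added here; only the two character assignments and the nodXXX/nibXXX naming come from the message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of MVAPICH): rewrite a hostname of the
 * form "nodXXX" to the IB-side name "nibXXX" in place.
 * Returns 0 if the name was rewritten, -1 if it does not start with "nod".
 */
int to_ib_hostname(char *name)
{
    assert(name != NULL);

    /* Guard added for this sketch: only touch names with the expected prefix. */
    if (strncmp(name, "nod", 3) != 0)
        return -1;

    /* The two assignments quoted in the message: "nod" -> "nib". */
    name[1] = 'i';
    name[2] = 'b';
    return 0;
}
```

In a patched get_host_id(), a helper like this would presumably be applied to the buffer filled by gethostname(). The prefix check means that a name not following the nodXXX scheme is left untouched rather than silently corrupted.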
----- Original Message -----
From: Terrence.LIAO at total.com
Date: Friday, May 23, 2008 8:22 am
Subject: [mvapich-discuss] Re: problem run mvapich1.0 on AMD quad core cluster
To: Terrence.LIAO at total.com
Cc: mvapich-discuss at cse.ohio-state.edu
> Hi, mvapich-discuss
> I found a workaround for my mvapich problem. A build with the -D_SMP_ and -D_SMP_RNDV flags removed from make.mvapich.gen2 works on my cluster.
> Thank you very much.
> -- Terrence
> Terrence LIAO/HOU/US/EP/Corp
> 05/21/2008 03:34 PM
> To: mvapich-discuss at cse.ohio-state.edu
> cc: Terrence LIAO/HOU/US/EP/Corp at E&P
> Subject: problem run mvapich1.0 on AMD quad core cluster
> Hi, mvapich-discuss,
> We have an AMD quad-core cluster running RHEL 4.5 and OFED 1.2.5 with the PGI compiler. I built mvapich1.0 as usual, but an MPI hello world hangs at MPI_INIT(). I can run it with an OpenMPI build. Also, ib_write_bw and ibv_rc_pingpong ran fine. Any clue what I did wrong?
> Thank you very much.
> -- Terrence