[mvapich-discuss] Re: problem run mvapich1.0 on AMD quad core cluster

LEI CHAI chai.15 at osu.edu
Thu May 29 13:10:50 EDT 2008


For everybody's information, the problem was solved off-line. It was due to IP setup issues on the machines. I'm attaching the latest correspondence below:

On Thu, 29 May 2008, Lei Chai wrote:

> Glad to know mvapich works for you with this workaround. I cannot think of
> any place where this change would break anything. I think it should work fine.
>
> Lei
>
>
> On Thu, 29 May 2008 Terrence.LIAO at total.com wrote:
>
> > Hi, Lei,
> >
> > I have one final question on get_host_id(). Our problem comes from the
> > fact that a hostname such as "nod200" on our compute nodes resolves to a
> > local IP according to the /etc/hosts file. However, the IB interface has
> > the name "nib200"; since "nib200" is not listed in /etc/hosts, it gets
> > its IP from NIS. I have modified get_host_id() by setting
> > myhostname[1]='i'; myhostname[2]='b' to convert myhostname from nodXXX
> > to nibXXX, since gethostname() returns nodXXX. With this small change,
> > the MPI code can run. My question is: will this modification break
> > mvapich anywhere else?
> >
> > Thank you very much.
> >
> >
> > P.S. We stay away from /etc/hosts to avoid editing it manually, which
> > could introduce human error.
> >
> > -- Terrence
> > --------------------------------------------------------
> > Terrence Liao, Ph.D.
> > Research Computer Scientist
> > TOTAL E&P RESEARCH & TECHNOLOGY USA, LLC
> > 1201 Louisiana, Suite 1800, Houston, TX 77002
> > Tel: 713.647.3498  Fax: 713.647.3638
> > Email: terrence.liao at total.com
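
The rename described above overwrites two characters of the host name unconditionally. A slightly more defensive variant could look like the sketch below. It is only an illustration: to_ib_hostname() is a hypothetical helper, not an MVAPICH function, and the "nod"/"nib" naming scheme is an assumption taken from this particular cluster.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not part of MVAPICH: map an Ethernet host name of
 * the form "nodXXX" to the matching InfiniBand name "nibXXX" in place.
 * Returns 1 if the name was rewritten, 0 if it was left untouched. */
int to_ib_hostname(char *name)
{
    assert(name != NULL);

    /* Only touch names that really follow the assumed "nod" prefix, so
     * that hosts with other names (e.g. login nodes) are not mangled. */
    if (strncmp(name, "nod", 3) != 0)
        return 0;

    name[1] = 'i';   /* "nod200" -> "nid200" */
    name[2] = 'b';   /* "nid200" -> "nib200" */
    return 1;
}
```

Called on the buffer filled in by gethostname(), this turns "nod200" into "nib200" and leaves any name without the "nod" prefix unchanged, so the same build would still behave as before on hosts that do not follow the naming scheme.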


----- Original Message -----
From: Terrence.LIAO at total.com
Date: Friday, May 23, 2008 8:22 am
Subject: [mvapich-discuss] Re: problem run mvapich1.0 on AMD quad core cluster
To: Terrence.LIAO at total.com
Cc: mvapich-discuss at cse.ohio-state.edu

 
> Hi, mvapich-discuss 
 
> I found a workaround for my mvapich problem. A build with the -D_SMP_ and -D_SMP_RNDV flags removed from make.mvapich.gen2 works on my cluster.
 
> Thank you very much.
 
>  -- Terrence
 
 
  
 
 
> Terrence LIAO/HOU/US/EP/Corp
> 05/21/2008 03:34 PM
>
> To: mvapich-discuss at cse.ohio-state.edu
> cc: Terrence LIAO/HOU/US/EP/Corp at E&P
> Subject: problem run mvapich1.0 on AMD quad core cluster
     
 
 
> Hi, mvapich-discuss, 
 
> We have an AMD quad-core cluster running RHEL 4.5 and OFED 1.2.5, with the PGI compiler. I built mvapich1.0 as usual, but an MPI hello world hangs in MPI_INIT(). The same test runs with an OpenMPI build, and ib_write_bw and ibv_rc_pingpong also run fine. Any clue what I did wrong?
 
> Thank you very much.
 
>  -- Terrence
 
 
  
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss