[mvapich-discuss] Query related to running MVAPICH over OFED

Dhabaleswar Panda panda at cse.ohio-state.edu
Sun Aug 19 09:52:54 EDT 2007


Hi, 

> I have tested mvapich2-0.9.8p2 with OFED1.1-gen2. It's working fine.
> I have tested mvapich2-0.9.8p2 with OFED1.2-gen2. It's also working fine.
> I have tested mvapich2-1.0-beta with OFED1.2-gen2. It's also working fine.
> But mvapich2-1.0-beta with OFED1.2-uDAPL (OpenIB-cma) is failing
> with this error:
> 
> [rdma_udapl_priv.c:837] error(-2147287038): Could not create EP
> [rdma_udapl_priv.c:837] error(-2147287038): Could not create EP
> rank 1 in job 1  in06_32868   caused collective abort of all ranks
>   exit status of rank 1: return code 1
> rank 0 in job 1  in06_32868   caused collective abort of all ranks
>   exit status of rank 0: return code 1
> 
> Until now, I was not using the MVAPICH2 that is shipped with OFED1.2;
> I was compiling MVAPICH2 externally. But then I tried
> MVAPICH2-0.9.8-12, which is shipped with OFED1.2, and it's working fine.
> The problem seems to occur only when I compile MVAPICH2 externally
> (I think). Any comments?

Thanks for letting us know the status of your testing for different
MVAPICH2 versions. Good to know that MVAPICH2-0.9.8-12 shipped with
OFED 1.2 is working fine for you. I am assuming this is working with
the uDAPL interface.

We will look into why the problem occurs when you compile MVAPICH2
externally and get back to you soon.
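
In the meantime, it may help to read the number in the "Could not create
EP" message as a DAT return code. The following is only a minimal sketch:
it assumes the usual bit layout from the DAT headers (dat_error.h), with an
error class bit, a type field, and a subtype field, and the mask values
below are written from that assumption. Please check them against the
header installed on your system.

```python
# Split a DAT return code, as printed by MVAPICH2, into its bit fields.
# Assumption: the layout follows the DAT headers (dat_error.h):
# error class bit 0x80000000, type mask 0x3FFF0000, subtype mask 0x0000FFFF.
DAT_CLASS_ERROR = 0x80000000
DAT_TYPE_MASK = 0x3FFF0000
DAT_SUBTYPE_MASK = 0x0000FFFF


def decode_dat_return(code: int) -> dict:
    """Reinterpret a signed 32-bit value as unsigned and split the fields."""
    raw = code & 0xFFFFFFFF
    return {
        "hex": f"0x{raw:08X}",
        "is_error": bool(raw & DAT_CLASS_ERROR),
        "type": raw & DAT_TYPE_MASK,
        "subtype": raw & DAT_SUBTYPE_MASK,
    }


# The value reported in the error message above.
fields = decode_dat_return(-2147287038)
print(fields["hex"], hex(fields["type"]), hex(fields["subtype"]))
# prints: 0x80030002 0x30000 0x2
```

The type and subtype values can then be looked up by name in dat_error.h,
which should narrow down why the endpoint could not be created.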

> What is "-12" from MVAPICH2-0.9.8-12 ? Some patched version?

As you might have noticed, OFED releases typically go through multiple
RC versions. As this testing continues and problems come up, we have
been updating MVAPICH2-0.9.8 with the corresponding fixes and
incrementing the suffix (-1, -2, ..., -12).

From the MPI code perspective, MVAPICH2-0.9.8-12 is equivalent to the
latest MVAPICH2-0.9.8p3 (available from the mvapich web site). The OFED
version contains additional pieces for building it in an integrated
manner with other components.

> I have not tried the things with libdaplscm.so till now.
> 
> Yogeshwar
> P.S. As per your suggestion, I am now using OFED1.2 & MVAPICH2-1.0-beta

Thanks. 
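
Before trying libdaplscm.so, it may be worth confirming which library each
provider name in /etc/dat.conf actually points to. Below is a rough sketch
only. It assumes the usual whitespace-separated dat.conf layout, with the
provider name in the first field and the library path in the fifth. The
sample entries, including the "OpenIB-scm" name, the paths, and the device
strings, are hypothetical placeholders and not taken from a real system.

```python
import shlex

# Hypothetical sample in the usual /etc/dat.conf layout; the names, library
# paths, versions, and device strings are placeholders for illustration.
SAMPLE_DAT_CONF = '''
# ia_name api threadsafety default library provider_version ia_params platform_params
OpenIB-cma u1.2 nonthreadsafe default /usr/lib/libdaplcma.so mv_dapl.1.2 "ib0 0" ""
OpenIB-scm u1.2 nonthreadsafe default /usr/lib/libdaplscm.so mv_dapl.1.2 "mthca0 1" ""
'''


def list_providers(text):
    """Return (provider name, library path) for each non-comment entry."""
    providers = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = shlex.split(line)  # keeps quoted parameters together
        if len(fields) >= 5:
            providers.append((fields[0], fields[4]))
    return providers


for name, library in list_providers(SAMPLE_DAT_CONF):
    print(f"{name}: {library}")
```

To check a real installation, the text of /etc/dat.conf can be read from
disk and passed to list_providers() in place of the sample string.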

DK


> On 8/18/07, LEI CHAI <chai.15 at osu.edu> wrote:
> > Hi,
> >
> > Could you try the following things:
> >
> > Could you check if gen2 works fine. You can use make.mvapich2.ofa script to build mvapich2 with gen2.
> >
> > After making sure gen2 works, if you want to upgrade your OFED version to 1.2 or the latest 1.2.5 release, we are sure mvapich2 will work with OpenIB-cma provider. (recommended)
> >
> > If you want to stick to OFED1.1, could you use libdaplscm.so instead of libdaplcma.so in your /etc/dat.conf file.
> >
> > And finally (not related to this problem), please try our latest MVAPICH2-1.0 beta release if you are interested :-)
> >
> > Lei
> >
> >
> > ----- Original Message -----
> > From: yogeshwar sonawane <yogyas at gmail.com>
> > Date: Friday, August 17, 2007 5:56 am
> > Subject: Re: [mvapich-discuss] Query related to running MVAPICH over OFED
> >
> > > Yes, the uDAPL-level tests with OFED 1.1-uDAPL installation are
> > > working fine.
> > > I am able to create EPs, transfer data, etc.
> > > But with MPI, i am getting this error.
> > >
> > > I am using MVAPICH2-0.9.8p2 with OpenIB-cma as uDAPL provider name.
> > > I have OFED1.1 installed.
> > >
> > > On 8/16/07, Dhabaleswar Panda <panda at cse.ohio-state.edu> wrote:
> > > > On your system, are you able to run basic uDAPL-level tests with
> > > > OFED 1.1-uDAPL installation? It will be good if you try this first
> > > > to make sure that uDAPL installation is correct. Then you can put
> > > > MPI on top of this and carry out MPI-level tests and performance
> > > > evaluation.
> > > >
> > > > DK
> > > >
> > > >
> > > > > Hello,
> > > > > I tried it. but i am getting following error when i run cpi
> > > > > application with 2 processes:-
> > > > >
> > > > > [rdma_udapl_priv.c:833] error(-2147287038): Could not create EP
> > > > > [rdma_udapl_priv.c:830] error(-2147287038): Could not create EP
> > > > > rank 1 in job 2  in06_32882   caused collective abort of all ranks
> > > > >   exit status of rank 1: return code 1
> > > > > rank 0 in job 2  in06_32882   caused collective abort of all ranks
> > > > >   exit status of rank 0: return code 1
> > > > >
> > > > > any help?
> > > > > Thanks,
> > > > > Yogeshwar
> > > > >
> > > > > On 8/16/07, yogeshwar sonawane <yogyas at gmail.com> wrote:
> > > > > > Thanks for help.
> > > > > > I will try it.
> > > > > >
> > > > > > Yogeshwar
> > > > > >
> > > > > > On 8/16/07, Dhabaleswar Panda <panda at cse.ohio-state.edu> wrote:
> > > > > > > > Hi all,
> > > > > > > > > Usually, to run MVAPICH over OFED, make.mvapich2.ofa is used.
> > > > > > > > > After successful compilation, MVAPICH will use "OpenFabrics
> > > > > > > > > Gen2-IB" as underlying transport interfaces.
> > > > > > > > This i have tried & is running fine.
> > > > > > > >
> > > > > > > > > Now as OFED contains dapl component, so can uDAPL interfaces
> > > > > > > > > be used to run MVAPICH over OFED ?
> > > > > > > > > OR
> > > > > > > > > After compiling MVAPICH with make.mvapich2.udapl, will it
> > > > > > > > > work using "uDAPL" as underlying transport interfaces
> > > > > > > > > provided by OFED ?
> > > > > > >
> > > > > > > > Yes, this will work. The uDAPL support in MVAPICH/MVAPICH2
> > > > > > > > works well with any uDAPL layer (including that of OFED). In
> > > > > > > > fact, during every release, we carry out extensive test of the
> > > > > > > > uDAPL interface over OFED uDAPL.
> > > > > > >
> > > > > > > > You can also find this information in the user guides
> > > > > > > > (available from mvapich web site).
> > > > > > >
> > > > > > > > If anybody has tried this before, can help me.
> > > > > > > >
> > > > > > > > > For info:- I am using MVAPICH2-0.9.8/ MVAPICH2-1.0 with OFED
> > > > > > > > > 1.1 on a infiniband card.
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > DK
> > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Yogeshwar
> > > > > > > > _______________________________________________
> > > > > > > > mvapich-discuss mailing list
> > > > > > > > mvapich-discuss at cse.ohio-state.edu
> > > > > > > > > http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > >
> >
> >
> 


