[mvapich-discuss] MVAPICH2 not working with latest Intel 12 compiler on Mellanox IB Gen2

Dhabaleswar Panda panda at cse.ohio-state.edu
Tue Apr 24 10:20:28 EDT 2012


Hi Mohammad - Glad to know that the problem got resolved with the latest
version of the Intel compiler. Let us know if you encounter any additional
issues.
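
For anyone else checking which compiler build they have, here is a minimal
shell sketch. It assumes the banner printed by "icc -V" contains a
"Build YYYYMMDD" field, as in the build numbers quoted in this thread; the
function names and the sample banner string are illustrative only. The thread
only establishes that Build 20110811 is affected and Build 20120212 is not, so
the sketch conservatively treats anything older than 20120212 as suspect.

```shell
# Extract the 8-digit build date from a compiler banner line.
# Prints nothing if no "Build NNNNNNNN" field is present.
build_date() {
    printf '%s\n' "$1" | sed -n 's/.*Build \([0-9]\{8\}\).*/\1/p'
}

# Succeeds when the build date is at or after the build reported as fixed.
is_fixed_build() {
    d=$(build_date "$1")
    [ -n "$d" ] && [ "$d" -ge 20120212 ]
}

# Sample banner text (an assumption about the format). On a real system
# one might use: banner="$(icc -V 2>&1 | head -n 1)"
banner='Version 12.1.0 Build 20110811'

if is_fixed_build "$banner"; then
    echo "build $(build_date "$banner"): at or after 20120212"
else
    echo "build $(build_date "$banner"): older than 20120212, consider upgrading"
fi
```

Comparing the build date rather than the version string matters here because
both the affected and the fixed compilers report themselves as 12.1.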

Hi Dan - Thanks for your pointer also. You may also upgrade to the latest
Intel compiler (as Mohammad has indicated) and things should be working
without any special options.
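
For those who cannot upgrade right away, the workaround Mohammad mentions
(--disable-fast together with MPICH2LIB_CFLAGS="-O0") could be combined with
the configure options from the original report roughly as sketched here. This
is a dry run that only prints the command; PREFIX is a hypothetical install
location, and passing MPICH2LIB_CFLAGS through the environment is an
assumption about how it was set. --enable-threads and --enable-smpcoll are
left out since they are not needed.

```shell
# PREFIX is a hypothetical install location; adjust for your site.
PREFIX="${PREFIX:-$HOME/mvapich2-1.7-intel12}"

# Print the configure command line with the workaround applied.
# Options come from the original report, minus --enable-threads and
# --enable-smpcoll, plus --disable-fast and MPICH2LIB_CFLAGS=-O0.
build_configure_cmd() {
    printf '%s ' \
        'MPICH2LIB_CFLAGS=-O0' \
        ./configure \
        "--prefix=${PREFIX}" \
        --disable-fast \
        --enable-g=dbg \
        --enable-debuginfo \
        --enable-romio \
        --with-file-system=panfs+nfs+ufs \
        --with-rdma=gen2 \
        --with-ib-libpath=/usr/lib64
    printf '\n'
}

# Dry run: show the command instead of executing it.
build_configure_cmd
```

Building the library at -O0 may cost some performance, so this is best treated
as a stopgap until the compiler can be upgraded.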

Thanks,

DK

On Tue, 24 Apr 2012, sindimo at gmail.com wrote:

> Dear Dr. Panda,
>
> Thank you for your quick response; we truly appreciate it.
>
> Yes, it seems that the issue is related to the link you provided; we were
> running Intel 12.1.0 Build 20110811, which has the bug.
>
> I tried using '--disable-fast' and MPICH2LIB_CFLAGS="-O0" and that did
> work. However, I am not sure how much of an impact this will have on
> performance.
>
> To be on the safe side we upgraded to the latest Intel 12.1 Build 20120212
> and that fixed the problem.
>
> Just a side note for others who might have this problem: as of today, if you
> download the latest Intel Cluster Studio XE 2012, the compiler version
> bundled in that suite is out of date and has the bug in it (Build 20110811),
> even though the Intel website says it is the latest.
>
> To get the latest compiler we had to download the Intel C++ Composer XE and
> Intel Fortran Composer XE individually; these are up to date and have the
> fix (Build 20120212).
>
> I hope that helps, and thank you again for your valuable feedback and
> support.
>
> Mohammad Sindi
> HPC Group
> EXPEC Computer Center
> Saudi Aramco
>
>
>
> On Tue, Apr 24, 2012 at 6:33 AM, Dhabaleswar Panda <panda at cse.ohio-state.edu> wrote:
>
> > Hi Mohammad,
> >
> > Thanks for your note. Glad to know that you are moving from MVAPICH1 to
> > MVAPICH2.
> >
> > Regarding the Intel compiler issue, which exact version of the Intel 12
> > compiler are you using?
> >
> > A few months back, a similar posting was made on the mvapich-discuss list.
> > The poster also indicated that later versions of the Intel 12.1 compiler
> > didn't have this issue.
> >
> >
> > http://mail.cse.ohio-state.edu/pipermail/mvapich-discuss/2011-November/003642.html
> >
> > Please take a look at this posting and see if any later version of the
> > compiler helps to resolve the issue you are seeing.
> >
> > Regarding the configuration flags, there is no need to use
> > --enable-threads and --enable-smpcoll. You can drop these.
> >
> > Let us know if these suggestions help to resolve the issues you are
> > seeing.
> >
> > Thanks,
> >
> > DK
> >
> >
> > On Mon, 23 Apr 2012, sindimo at gmail.com wrote:
> >
> > > Dear Support,
> > >
> > > We are trying to get MVAPICH2 to work with the Intel 12 compiler on an
> > > older cluster with Mellanox IB Gen2. This works fine with the older
> > > Intel 10 compiler but not with Intel 12: it builds, but the application
> > > fails at run time.
> > >
> > > The configuration we are using:
> > > export CC=${CC:-/usr/local/intel/ics12/ics12/bin/icc}
> > > export CXX=${CXX:-/usr/local/intel/ics12/ics12/bin/icpc}
> > > export F77=${F77:-/usr/local/intel/ics12/ics12/bin/ifort}
> > > export FC=${FC:-/usr/local/intel/ics12/ics12/bin/ifort}
> > >
> > > ./configure --prefix=/usr/local/mpi/mvapich2/intel10/1.7 --enable-g=dbg
> > > --enable-debuginfo --enable-romio  --with-file-system=panfs+nfs+ufs
> > > --with-rdma=gen2 --with-ib-libpath=/usr/lib64 --enable-threads
> > > --enable-smpcoll
> > >
> > > make
> > > make install
> > >
> > > PATH and LD_LIBRARY_PATH are clean and set up properly.
> > >
> > > We use the below to launch the job:
> > > mpirun_rsh -n 4 -hostfile ./nodes myapp.exe
> > >
> > > With MVAPICH2 1.7 + Intel 10 it runs fine.
> > >
> > > With MVAPICH2 1.7 + Intel 12 we get the below error immediately after we
> > > launch the job:
> > >
> > > [plcf023:mpispawn_0][child_handler] MPI process (rank: 0, pid: 15976) terminated with signal 11 -> abort job
> > > [plcf:mpirun_rsh][process_mpispawn_connection] mpispawn_0 from node plcf023 aborted: MPI process error (1)
> > >
> > >
> > > We tried setting MV2_DEBUG_SHOW_BACKTRACE but it's not showing anything
> > > more than the above error.
> > >
> > > We also tried MVAPICH2 1.8rc1 with Intel 12 and it's showing the same
> > > problem.
> > >
> > > We have run out of ideas for troubleshooting this and would appreciate
> > > any help with it.
> > >
> > > Also, for reference, we have newer clusters with Qlogic IB with PSM, and
> > > the MVAPICH2 + Intel 12 combination runs fine there, so it seems to be
> > > more of an issue with the Mellanox Gen2 IB. On the PSM clusters we
> > > configure with "--with-device=ch3:psm --with-psm-include=/usr/include
> > > --with-psm=/usr/lib64".
> > >
> > > This is part of an effort in our HPC computer center to move from
> > > MVAPICH1 to MVAPICH2, and currently this is the only showstopper we
> > > have.
> > >
> > > Thank you for your help.
> > >
> > > Mohammad Sindi
> > > HPC Group
> > > EXPEC Computer Center
> > > Saudi Aramco
> > >
> >
> >
>


