[mvapich-discuss] mvapich2 1.8.1 and gcc 4.7.2 problem

Carmelo Ponti (CSCS) cponti at cscs.ch
Tue Feb 12 08:55:06 EST 2013


Hello

I compiled mvapich2 1.8.1 with gcc 4.7.2 and slurm 2.3.4 as follows:

./configure --prefix=/apps/pilatus/mvapich2/1.8.1/gcc-4.7.2 \
    --enable-threads=default --enable-shared --enable-sharedlibs=gcc \
    --enable-fc --with-mpe --enable-rsh --enable-rdma-cm --enable-fast \
    --enable-smpcoll --with-hwloc --enable-xrc --with-device=ch3:mrail \
    --with-rdma=gen2 --enable-g=dbg --enable-debuginfo --with-limic2 \
    CC=gcc CXX=g++ FC=gfortran F77=gfortran \
    --with-pmi=slurm --with-pm=no \
    --with-slurm=/apps/pilatus/slurm/default/ \
    CPPFLAGS=-I/apps/pilatus/slurm/default/include \
    LDFLAGS=-L/apps/pilatus/slurm/default/lib

but when I run a simple hello world MPI program I get:

In: PMI_Abort(1, Fatal error in MPI_Init:
Other MPI error
)
In: PMI_Abort(1, Fatal error in MPI_Init:
Other MPI error
)
In: PMI_Abort(1, Fatal error in MPI_Init:
Other MPI error
)
In: PMI_Abort(1, Fatal error in MPI_Init:
Other MPI error
)
slurmd[pilatus19]: *** STEP 40910.0 KILLED AT 12:01:02 WITH SIGNAL 9 ***
slurmd[pilatus21]: *** STEP 40910.0 KILLED AT 12:01:02 WITH SIGNAL 9 ***
slurmd[pilatus20]: *** STEP 40910.0 KILLED AT 12:01:02 WITH SIGNAL 9 ***
...

The problem appears only if I use more than 2 nodes.

I compiled the same version of mvapich2 with Intel 13.0.1 and PGI 13.1
and everything works fine.

I recompiled mvapich2 1.8.1/gcc 4.7.2 with --disable-fast and
--enable-g=dbg, and then the problem disappears.

I recompiled it with --enable-g=dbg, but I did not get any more
information than this:

In: PMI_Abort(1, Fatal error in MPI_Init:
Other MPI error
)
In: PMI_Abort(1, Fatal error in MPI_Init:
Other MPI error
)
slurmd[pilatus21]: *** STEP 40936.0 KILLED AT 14:49:01 WITH SIGNAL 9 ***

Please let me know if you need more information.

Thank you in advance for your help
Carmelo Ponti

-- 
----------------------------------------------------------------------
Carmelo Ponti           System Engineer                             
CSCS                    Swiss Center for Scientific Computing 
Via Trevano 131         Email: cponti at cscs.ch                  
CH-6900 Lugano          http://www.cscs.ch              
                        Phone: +41 91 610 82 15/Fax: +41 91 610 82 82
----------------------------------------------------------------------


