[mvapich-discuss] Deadlock when using derived datatypes

Moritz Hanke hanke at dkrz.de
Mon Mar 25 07:00:19 EDT 2019


Hello,

Based on an issue observed in the ICON-ESM [1], I generated a test 
program [2] that exhibits a deadlock every couple of runs. The deadlock 
could be reproduced with MVAPICH2 2.1, 2.3.0, and 2.3.1, but not with 
version 1.9b. The issue seems to be caused by extensive use of MPI 
derived data types.

The issue does not occur with Intel MPI or OpenMPI. The compiler used 
to build the test program and MVAPICH has no influence on the issue.

If the respective environment variables are set or the Makefile is 
adjusted appropriately, the test can be run as follows:
 > make
 > mpirun -n 24 ./test_mvapich_redist
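
Because the deadlock only shows up every couple of runs, it can help to 
repeat the run under a time limit. The following is a minimal sketch, 
assuming GNU coreutils `timeout` is available. The helper name 
`run_until_hang`, the run count, and the time limit are hypothetical 
choices and are not part of the test package:

```shell
#!/bin/sh
# Hypothetical helper (not part of the test package):
# run a command up to <runs> times and stop at the first run that does
# not finish within <limit> seconds.
# Usage: run_until_hang <runs> <limit_seconds> <command> [args...]
run_until_hang() {
    runs=$1
    limit=$2
    shift 2
    i=1
    while [ "$i" -le "$runs" ]; do
        # GNU timeout exits with status 124 when the time limit is hit.
        timeout "$limit" "$@" > /dev/null 2>&1
        status=$?
        if [ "$status" -eq 124 ]; then
            echo "run $i: no exit after ${limit}s (possible deadlock)"
            return 1
        fi
        i=$((i + 1))
    done
    echo "all $runs runs finished"
    return 0
}
```

For example, `run_until_hang 20 60 mpirun -n 24 ./test_mvapich_redist` 
would repeat the test up to 20 times with a 60 second limit per run. A 
run that exceeds the limit only suggests a deadlock; a slow run looks 
the same, so the limit should be well above the normal run time.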

Best regards,
Moritz Hanke

[1]: https://www.mpimet.mpg.de/en/science/models/icon-esm/
[2]: https://www.dropbox.com/s/cjlt33vr5kjztmb/mvapich_redist.tar.gz?dl=1

System information:
* two Intel Xeon E5-2680v3 12C 2.5GHz
* Red Hat Enterprise Linux Server release 6.10 (Santiago)
* Linux 2.6.32-696.18.7.el6.x86_64

MPI information:
MVAPICH2 1.9b Wed Feb 27 19:12:19 EST 2013 ch3:mrail

Compilation
CC: /sw/rhel6-x64/intel/intel-14.0.3/bin/icc  -O2 -xHost -DNDEBUG 
-DNVALGRIND -O2
CXX: /sw/rhel6-x64/intel/intel-14.0.3/bin/icpc  -O2 -xHost -DNDEBUG 
-DNVALGRIND -O2
F77: /sw/rhel6-x64/intel/intel-14.0.3/bin/ifort -L/lib -L/lib  -O2 
-xHost -O2
FC: /sw/rhel6-x64/intel/intel-14.0.3/bin/ifort  -O2 -xHost -O2

Configuration
--prefix=/sw/rhel6-x64/mpi/mvapich2-1.9b-intel14 --with-device=ch3:mrail 
--with-rdma=gen2 --enable-f77 --enable-fc --enable-cxx --with-pmi=simple

---------------------------------------------------------------------------------

MVAPICH2 2.1 Fri Apr 03 20:00:00 EDT 2015 ch3:mrail

Compilation
CC: /sw/rhel6-x64/intel/intel-14.0.3/bin/icc  -O2 -xHost -DNDEBUG 
-DNVALGRIND -O2
CXX: /sw/rhel6-x64/intel/intel-14.0.3/bin/icpc  -O2 -xHost -DNDEBUG 
-DNVALGRIND -O2
F77: /sw/rhel6-x64/intel/intel-14.0.3/bin/ifort -L/lib -L/lib  -O2 
-xHost -O2
FC: /sw/rhel6-x64/intel/intel-14.0.3/bin/ifort  -O2 -xHost -O2

Configuration
--prefix=/sw/rhel6-x64/mpi/mvapich2-2.1-intel14 --with-device=ch3:mrail 
--with-rdma=gen2 --enable-f77 --enable-fc --enable-cxx --with-pmi=simple

---------------------------------------------------------------------------------

MVAPICH2 2.3.1 Fri Mar 1 22:00:00 EST 2019 ch3:mrail

Compilation
CC: gcc    -DNDEBUG -DNVALGRIND -O2
CXX: g++   -DNDEBUG -DNVALGRIND -O2
F77: gfortran -L/lib -L/lib   -O2
FC: gfortran -cpp  -O2

Configuration
--enable-wrapper-rpath --disable-shared --enable-fortran=all 
--prefix=/work/k20200/k202077/bin/mvapich2-2.3.1-static-nag60 CC=gcc 
CXX=g++ FC=gfortran FCFLAGS=-cpp


