[mvapich-discuss] mpirun hang on simple cpi job with mvapich 2.2 and Ubuntu 16.04 with MOFED 4.0
Rick Warner
rick at microway.com
Tue May 23 17:26:22 EDT 2017
Hi all,
I'm seeing some strange behavior with mvapich 2.2 on a small Ubuntu
16.04 cluster. The cluster has ConnectX-4 EDR IB HCAs in every node.
The compute nodes have (9) GeForce 1080s each. The nodes are named
master and node2 through node5.
I installed MOFED 4.0 on the cluster to begin with; the OpenMPI from
that works fine. CUDA 8 is also installed.
I first installed mvapich2-gdr, but when I tried running an example job
(the basic cpi test) it hung. Some reading then indicated that
mvapich2-gdr is only for Tesla/Quadro cards, not GeForce, so I
removed mvapich2-gdr and built regular mvapich2 from source instead. Is
that true? Should I be using the gdr build with GeForce cards?
With the copy I built from source, I reproduced the same hang running a
basic 2-process job on 2 of the compute nodes. However, I found that if
I use the master as one of the 2 systems, the job works fine (I hadn't
tried this with the gdr build before removing it; the behavior might
have been the same there). It only fails if I use 2 (or more) different
compute nodes together. It also works if I send both processes to the
same node.
microway at master:~$ mpirun -np 2 --host master,node2 -env MV2_USE_CUDA 0 ./cpi-mvapich2
NVIDIA: no NVIDIA devices found
Process 0 of 2 on master
Process 1 of 2 on node2
pi is approximately 3.1415926544231318, Error is 0.0000000008333387
wall clock time = 1.092004
*******WORKED*******
microway at master:~$ mpirun -np 2 --host master,node3 -env MV2_USE_CUDA 0 ./cpi-mvapich2
NVIDIA: no NVIDIA devices found
Process 0 of 2 on master
Process 1 of 2 on node3
pi is approximately 3.1415926544231318, Error is 0.0000000008333387
wall clock time = 0.820147
*******WORKED*******
microway at master:~$ mpirun -np 2 --host node2,node2 -env MV2_USE_CUDA 0 ./cpi-mvapich2
Process 0 of 2 on node2
Process 1 of 2 on node2
pi is approximately 3.1415926544231318, Error is 0.0000000008333387
wall clock time = 0.005124
*******WORKED*******
microway at master:~$ mpirun -np 2 --host node2,node3 -env MV2_USE_CUDA 0 ./cpi-mvapich2
*******HANGS HERE - NEVER RETURNS UNTIL CTRL-C*******
I'm setting the MV2_USE_CUDA environment variable to 0 because the
master does not have any CUDA devices.
However, mpirun_rsh works:
microway at master:~$ mpirun_rsh -np 2 node2 node3 MV2_USE_CUDA=0 ./cpi-mvapich2
Process 0 of 2 on node2
Process 1 of 2 on node3
pi is approximately 3.1415926544231318, Error is 0.0000000008333387
wall clock time = 0.128403
This isn't making sense to me. The debugging I've done so far with
strace and gdb shows that rank 0 is waiting around line 1630 of
src/mpid/ch3/channels/mrail/src/rdma/ch3_smp_progress.c, in the function
MPIDI_CH3I_CM_SHMEM_Sync. Here is a backtrace I created by sending a
SIGSEGV to the process:
microway at master:~$ mpirun -np 2 --host node2,node3 -env MV2_USE_CUDA 0 ./cpi-mvapich2
[node2:9777 :0] Caught signal 11 (Segmentation fault)
==== backtrace ====
0 /opt/mellanox/mxm/lib/libmxm.so.2(+0x3c69c) [0x7fab0802f69c]
1 /lib/x86_64-linux-gnu/libc.so.6(+0x354b0) [0x7fab0ad944b0]
2 /usr/local/mpi/gcc/mvapich2-2.2/lib64/libmpi.so.12(MPIDI_CH3I_CM_SHMEM_Sync+0x86) [0x7fab0b5c6e7b]
3 /usr/local/mpi/gcc/mvapich2-2.2/lib64/libmpi.so.12(MPIDI_CH3I_CM_Create_region+0x280) [0x7fab0b5c73ff]
4 /usr/local/mpi/gcc/mvapich2-2.2/lib64/libmpi.so.12(MPIDI_CH3I_MRAIL_CM_Alloc+0x2c) [0x7fab0b5e3883]
5 /usr/local/mpi/gcc/mvapich2-2.2/lib64/libmpi.so.12(MPIDI_CH3_Init+0x638) [0x7fab0b5b2c3d]
6 /usr/local/mpi/gcc/mvapich2-2.2/lib64/libmpi.so.12(MPID_Init+0x323) [0x7fab0b59abf0]
7 /usr/local/mpi/gcc/mvapich2-2.2/lib64/libmpi.so.12(MPIR_Init_thread+0x411) [0x7fab0b48fb01]
8 /usr/local/mpi/gcc/mvapich2-2.2/lib64/libmpi.so.12(MPI_Init+0x19a) [0x7fab0b48ea49]
9 ./cpi-mvapich2() [0x400aed]
10 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7fab0ad7f830]
11 ./cpi-mvapich2() [0x400989]
===================
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 9777 RUNNING AT node2
= EXIT CODE: 139
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
[proxy:0:1 at node3] HYD_pmcd_pmip_control_cmd_cb (pm/pmiserv/pmip_cb.c:909): assert (!closed) failed
[proxy:0:1 at node3] HYDT_dmxu_poll_wait_for_event (tools/demux/demux_poll.c:76): callback returned error status
[proxy:0:1 at node3] main (pm/pmiserv/pmip.c:206): demux engine error waiting for event
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault
(signal 11)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions
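For anyone wanting to reproduce the signal trick above: I just sent SIGSEGV to the stuck rank's PID, which trips the segfault handler (mxm's, in my case) and prints the trace on the way down. A minimal sketch, with a background sleep standing in for the hung cpi-mvapich2 rank:

```shell
# In practice, $pid is the PID of the hung rank on node2, found with ps/pidof.
sleep 30 &
pid=$!
kill -SEGV "$pid"       # deliver SIGSEGV; a handler (or the default action) fires
wait "$pid"
echo "exit status: $?"  # 128 + 11 (SIGSEGV) = 139, matching the EXIT CODE above
```

That 139 is exactly the "EXIT CODE: 139" hydra reports, so the BAD TERMINATION block is just a side effect of my signal, not a second problem.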
Here is the output of mpiexec -info:
microway at master:~$ mpiexec -info
HYDRA build details:
Version: 3.1.4
Release Date: Wed Sep 7 14:33:43 EDT 2016
CC: gcc
CXX: g++
F77: gfortran
F90: gfortran
Configure options: '--disable-option-checking'
'--prefix=/usr/local/mpi/gcc/mvapich2-2.2' '--localstatedir=/var'
'--disable-static' '--enable-shared' '--with-mxm=/opt/mellanox/mxm'
'--with-hcoll=/opt/mellanox/hcoll' '--with-knem=/opt/knem-1.1.2.90mlnx1'
'--without-slurm' '--disable-mcast' '--without-cma'
'--without-hydra-ckpointlib' '--enable-g=dbg' '--enable-cuda'
'--with-cuda=/usr/local/cuda' '--enable-fast=ndebug'
'--cache-file=/dev/null' '--srcdir=.' 'CC=gcc' 'CFLAGS= -DNDEBUG
-DNVALGRIND -g' 'LDFLAGS=-L/usr/local/cuda/lib64 -L/usr/local/cuda/lib
-L/lib -L/lib -L/opt/mellanox/hcoll/lib64 -L/opt/mellanox/hcoll/lib
-L/lib -Wl,-rpath,/lib -L/lib -Wl,-rpath,/lib -L/lib -L/lib'
'LIBS=-lcudart -lcuda -lrdmacm -libumad -libverbs -ldl -lrt -lm
-lpthread ' 'CPPFLAGS=-I/usr/local/cuda/include
-I/opt/mellanox/hcoll/include
-I/mcms/build/mvapich/source/mvapich2-2.2/src/mpl/include
-I/mcms/build/mvapich/source/mvapich2-2.2/src/mpl/include
-I/mcms/build/mvapich/source/mvapich2-2.2/src/openpa/src
-I/mcms/build/mvapich/source/mvapich2-2.2/src/openpa/src -D_REENTRANT
-I/mcms/build/mvapich/source/mvapich2-2.2/src/mpi/romio/include
-I/include -I/include -I/include -I/include'
Process Manager: pmi
Launchers available: ssh rsh fork slurm ll lsf
sge manual persist
Topology libraries available: hwloc
Resource management kernels available: user slurm ll lsf sge pbs
cobalt
Checkpointing libraries available:
Demux engines available: poll select
If there is any other info needed, please let me know.
Thanks,
Rick