[mvapich-discuss] mpirun hang on simple cpi job with mvapich 2.2 and Ubuntu 16.04 with MOFED 4.0

Rick Warner rick at microway.com
Thu May 25 17:33:33 EDT 2017


FYI - I tried removing all the GPUs from node2 to eliminate them as a 
possible cause, since master (no GPUs) plus any compute node was working.

The job still hangs with node2+node3 despite the GPUs being removed 
from node2.

Thanks,
Rick

On 05/24/17 10:51, Rick Warner wrote:
> Thanks for the response, Hari. I appreciate the help.
>
>
> I extracted the 2.2 tarball and ran configure with the options listed 
> in the mpiname -a output below (a sketch of the corresponding build 
> commands follows the configuration listing):
>
> root at master:/mcms/build/mvapich/source/mvapich2-2.2# mpiname -a
> MVAPICH2 2.2 Thu Sep 08 22:00:00 EST 2016 ch3:mrail
>
> Compilation
> CC: gcc    -DNDEBUG -DNVALGRIND -g
> CXX: g++   -DNDEBUG -DNVALGRIND -g
> F77: gfortran -L/lib -L/lib   -g
> FC: gfortran   -g
>
> Configuration
> --prefix=/usr/local/mpi/gcc/mvapich2-2.2 --localstatedir=/var 
> --disable-static --enable-shared --with-mxm=/opt/mellanox/mxm 
> --with-hcoll=/opt/mellanox/hcoll --with-knem=/opt/knem-1.1.2.90mlnx1 
> --without-slurm --disable-mcast --without-cma 
> --without-hydra-ckpointlib --enable-g=dbg --enable-cuda 
> --with-cuda=/usr/local/cuda --enable-fast=ndebug
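>
> For reference, a minimal sketch of the build commands that would 
> reproduce the configuration shown above (the paths and options are the 
> ones listed here; make/make install is the usual MVAPICH2 source build 
> procedure):
>
>     ./configure --prefix=/usr/local/mpi/gcc/mvapich2-2.2 \
>         --localstatedir=/var --disable-static --enable-shared \
>         --with-mxm=/opt/mellanox/mxm --with-hcoll=/opt/mellanox/hcoll \
>         --with-knem=/opt/knem-1.1.2.90mlnx1 --without-slurm \
>         --disable-mcast --without-cma --without-hydra-ckpointlib \
>         --enable-g=dbg --enable-cuda --with-cuda=/usr/local/cuda \
>         --enable-fast=ndebug
>     make -j"$(nproc)" && make install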
>
>
> ibstat from every system:
> master: CA 'mlx4_0'
> master:         CA type: MT4103
> master:         Number of ports: 2
> master:         Firmware version: 2.40.7000
> master:         Hardware version: 0
> master:         Node GUID: 0x248a070300e883f0
> master:         System image GUID: 0x248a070300e883f0
> master:         Port 1:
> master:                 State: Down
> master:                 Physical state: Disabled
> master:                 Rate: 10
> master:                 Base lid: 0
> master:                 LMC: 0
> master:                 SM lid: 0
> master:                 Capability mask: 0x04010000
> master:                 Port GUID: 0x268a07fffee883f0
> master:                 Link layer: Ethernet
> master:         Port 2:
> master:                 State: Down
> master:                 Physical state: Disabled
> master:                 Rate: 10
> master:                 Base lid: 0
> master:                 LMC: 0
> master:                 SM lid: 0
> master:                 Capability mask: 0x04010000
> master:                 Port GUID: 0x268a07fffee883f1
> master:                 Link layer: Ethernet
> master: CA 'mlx5_0'
> master:         CA type: MT4115
> master:         Number of ports: 1
> master:         Firmware version: 12.18.2000
> master:         Hardware version: 0
> master:         Node GUID: 0x248a070300a2eff0
> master:         System image GUID: 0x248a070300a2eff0
> master:         Port 1:
> master:                 State: Active
> master:                 Physical state: LinkUp
> master:                 Rate: 100
> master:                 Base lid: 1
> master:                 LMC: 0
> master:                 SM lid: 1
> master:                 Capability mask: 0x2651e84a
> master:                 Port GUID: 0x248a070300a2eff0
> master:                 Link layer: InfiniBand
> node2 : CA 'mlx5_0'
> node2 :         CA type: MT4115
> node2 :         Number of ports: 1
> node2 :         Firmware version: 12.18.2000
> node2 :         Hardware version: 0
> node2 :         Node GUID: 0x248a070300a2f0f0
> node2 :         System image GUID: 0x248a070300a2f0f0
> node2 :         Port 1:
> node2 :                 State: Active
> node2 :                 Physical state: LinkUp
> node2 :                 Rate: 100
> node2 :                 Base lid: 5
> node2 :                 LMC: 0
> node2 :                 SM lid: 1
> node2 :                 Capability mask: 0x2651e848
> node2 :                 Port GUID: 0x248a070300a2f0f0
> node2 :                 Link layer: InfiniBand
> node3 : CA 'mlx5_0'
> node3 :         CA type: MT4115
> node3 :         Number of ports: 1
> node3 :         Firmware version: 12.18.2000
> node3 :         Hardware version: 0
> node3 :         Node GUID: 0x248a070300a09ad0
> node3 :         System image GUID: 0x248a070300a09ad0
> node3 :         Port 1:
> node3 :                 State: Active
> node3 :                 Physical state: LinkUp
> node3 :                 Rate: 100
> node3 :                 Base lid: 3
> node3 :                 LMC: 0
> node3 :                 SM lid: 1
> node3 :                 Capability mask: 0x2651e848
> node3 :                 Port GUID: 0x248a070300a09ad0
> node3 :                 Link layer: InfiniBand
> node4 : CA 'mlx5_0'
> node4 :         CA type: MT4115
> node4 :         Number of ports: 1
> node4 :         Firmware version: 12.18.2000
> node4 :         Hardware version: 0
> node4 :         Node GUID: 0x248a070300a2efc8
> node4 :         System image GUID: 0x248a070300a2efc8
> node4 :         Port 1:
> node4 :                 State: Active
> node4 :                 Physical state: LinkUp
> node4 :                 Rate: 100
> node4 :                 Base lid: 2
> node4 :                 LMC: 0
> node4 :                 SM lid: 1
> node4 :                 Capability mask: 0x2651e848
> node4 :                 Port GUID: 0x248a070300a2efc8
> node4 :                 Link layer: InfiniBand
> node5 : CA 'mlx5_0'
> node5 :         CA type: MT4115
> node5 :         Number of ports: 1
> node5 :         Firmware version: 12.18.2000
> node5 :         Hardware version: 0
> node5 :         Node GUID: 0x248a070300a2f0e4
> node5 :         System image GUID: 0x248a070300a2f0e4
> node5 :         Port 1:
> node5 :                 State: Active
> node5 :                 Physical state: LinkUp
> node5 :                 Rate: 100
> node5 :                 Base lid: 6
> node5 :                 LMC: 0
> node5 :                 SM lid: 1
> node5 :                 Capability mask: 0x2651e848
> node5 :                 Port GUID: 0x248a070300a2f0e4
> node5 :                 Link layer: InfiniBand
>
>
> ibv_devinfo from all (FYI - master has a dual-port Mellanox Ethernet 
> card, currently unused):
> root at master:/mcms/build/mvapich/source/mvapich2-2.2# scom -a ibv_devinfo
> master: hca_id: mlx5_0
> master:         transport:                      InfiniBand (0)
> master:         fw_ver:                         12.18.2000
> master:         node_guid: 248a:0703:00a2:eff0
> master:         sys_image_guid: 248a:0703:00a2:eff0
> master:         vendor_id:                      0x02c9
> master:         vendor_part_id:                 4115
> master:         hw_ver:                         0x0
> master:         board_id:                       MT_2180110032
> master:         phys_port_cnt:                  1
> master:         Device ports:
> master:                 port:   1
> master:                         state: PORT_ACTIVE (4)
> master:                         max_mtu:                4096 (5)
> master:                         active_mtu:             4096 (5)
> master:                         sm_lid:                 1
> master:                         port_lid:               1
> master:                         port_lmc:               0x00
> master:                         link_layer:             InfiniBand
> master:
> master: hca_id: mlx4_0
> master:         transport:                      InfiniBand (0)
> master:         fw_ver:                         2.40.7000
> master:         node_guid: 248a:0703:00e8:83f0
> master:         sys_image_guid: 248a:0703:00e8:83f0
> master:         vendor_id:                      0x02c9
> master:         vendor_part_id:                 4103
> master:         hw_ver:                         0x0
> master:         board_id:                       MT_1200111023
> master:         phys_port_cnt:                  2
> master:         Device ports:
> master:                 port:   1
> master:                         state:                  PORT_DOWN (1)
> master:                         max_mtu:                4096 (5)
> master:                         active_mtu:             1024 (3)
> master:                         sm_lid:                 0
> master:                         port_lid:               0
> master:                         port_lmc:               0x00
> master:                         link_layer:             Ethernet
> master:
> master:                 port:   2
> master:                         state:                  PORT_DOWN (1)
> master:                         max_mtu:                4096 (5)
> master:                         active_mtu:             1024 (3)
> master:                         sm_lid:                 0
> master:                         port_lid:               0
> master:                         port_lmc:               0x00
> master:                         link_layer:             Ethernet
> master:
> node2 : hca_id: mlx5_0
> node2 :         transport:                      InfiniBand (0)
> node2 :         fw_ver:                         12.18.2000
> node2 :         node_guid: 248a:0703:00a2:f0f0
> node2 :         sys_image_guid: 248a:0703:00a2:f0f0
> node2 :         vendor_id:                      0x02c9
> node2 :         vendor_part_id:                 4115
> node2 :         hw_ver:                         0x0
> node2 :         board_id:                       MT_2180110032
> node2 :         phys_port_cnt:                  1
> node2 :         Device ports:
> node2 :                 port:   1
> node2 :                         state: PORT_ACTIVE (4)
> node2 :                         max_mtu:                4096 (5)
> node2 :                         active_mtu:             4096 (5)
> node2 :                         sm_lid:                 1
> node2 :                         port_lid:               5
> node2 :                         port_lmc:               0x00
> node2 :                         link_layer:             InfiniBand
> node2 :
> node3 : hca_id: mlx5_0
> node3 :         transport:                      InfiniBand (0)
> node3 :         fw_ver:                         12.18.2000
> node3 :         node_guid: 248a:0703:00a0:9ad0
> node3 :         sys_image_guid: 248a:0703:00a0:9ad0
> node3 :         vendor_id:                      0x02c9
> node3 :         vendor_part_id:                 4115
> node3 :         hw_ver:                         0x0
> node3 :         board_id:                       MT_2180110032
> node3 :         phys_port_cnt:                  1
> node3 :         Device ports:
> node3 :                 port:   1
> node3 :                         state: PORT_ACTIVE (4)
> node3 :                         max_mtu:                4096 (5)
> node3 :                         active_mtu:             4096 (5)
> node3 :                         sm_lid:                 1
> node3 :                         port_lid:               3
> node3 :                         port_lmc:               0x00
> node3 :                         link_layer:             InfiniBand
> node3 :
> node4 : hca_id: mlx5_0
> node4 :         transport:                      InfiniBand (0)
> node4 :         fw_ver:                         12.18.2000
> node4 :         node_guid: 248a:0703:00a2:efc8
> node4 :         sys_image_guid: 248a:0703:00a2:efc8
> node4 :         vendor_id:                      0x02c9
> node4 :         vendor_part_id:                 4115
> node4 :         hw_ver:                         0x0
> node4 :         board_id:                       MT_2180110032
> node4 :         phys_port_cnt:                  1
> node4 :         Device ports:
> node4 :                 port:   1
> node4 :                         state: PORT_ACTIVE (4)
> node4 :                         max_mtu:                4096 (5)
> node4 :                         active_mtu:             4096 (5)
> node4 :                         sm_lid:                 1
> node4 :                         port_lid:               2
> node4 :                         port_lmc:               0x00
> node4 :                         link_layer:             InfiniBand
> node4 :
> node5 : hca_id: mlx5_0
> node5 :         transport:                      InfiniBand (0)
> node5 :         fw_ver:                         12.18.2000
> node5 :         node_guid: 248a:0703:00a2:f0e4
> node5 :         sys_image_guid: 248a:0703:00a2:f0e4
> node5 :         vendor_id:                      0x02c9
> node5 :         vendor_part_id:                 4115
> node5 :         hw_ver:                         0x0
> node5 :         board_id:                       MT_2180110032
> node5 :         phys_port_cnt:                  1
> node5 :         Device ports:
> node5 :                 port:   1
> node5 :                         state: PORT_ACTIVE (4)
> node5 :                         max_mtu:                4096 (5)
> node5 :                         active_mtu:             4096 (5)
> node5 :                         sm_lid:                 1
> node5 :                         port_lid:               6
> node5 :                         port_lmc:               0x00
> node5 :                         link_layer:             InfiniBand
>
>
> Thanks!
> Rick
>
> On 05/24/2017 08:47 AM, Hari Subramoni wrote:
>
>> Hi Rick,
>>
>>
>> Sorry to hear that you are facing issues. Although we have not tested 
>> with GeForce cards internally, we believe that it will work.
>>
>>
>> We're taking a look at the hang issue. Could you please let us know 
>> how you built mvapich2? The output of mpiname -a will help. Could you 
>> please send us the output of ibstat and ibv_devinfo from the nodes?
>>
>>
>> Thx,
>>
>> Hari.
>>
>>
>> On May 23, 2017 4:26 PM, "Rick Warner" <rick at microway.com> wrote:
>>
>>     Hi all,
>>
>>     I'm seeing some strange behavior with MVAPICH2 2.2 on a small
>>     Ubuntu 16.04 cluster.  The cluster has ConnectX-4 EDR IB HCAs in
>>     every node.  The compute nodes have nine GeForce 1080s each.  The
>>     nodes are named master and node2 through node5.
>>
>>     I've installed MOFED 4.0 on the cluster to begin with; the OpenMPI
>>     that ships with it works fine.  CUDA 8 is also installed.
>>
>>     I first installed mvapich2-gdr, but when I tried running an
>>     example job (basic cpi test) it hung.  I then did some reading
>>     that indicated mvapich2-gdr is just for Tesla/Quadro, not for
>>     GeForce, so I removed mvapich2-gdr and built regular mvapich2
>>     from source instead.  Is that true?  Should I be using the gdr
>>     build with GeForce cards?
>>
>>     With the copy I built from source, I reproduced the same hang
>>     running a basic 2-process job on 2 of the compute nodes.  However,
>>     I found that if I use the master as 1 of the 2 systems, the job
>>     works fine (I hadn't tried this with gdr before removing it; it
>>     might have been the same there).  It only fails if I use 2 (or
>>     more) different compute nodes together.  It also works if I send
>>     2 processes to the same node.
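>>
>>     (A quick sketch of how all host pairs could be exercised
>>     automatically, using the hostnames above and GNU coreutils
>>     timeout so the hanging cases don't block the loop; the 30-second
>>     limit is an arbitrary choice:)
>>
>>     hosts="master node2 node3 node4 node5"
>>     for a in $hosts; do
>>       for b in $hosts; do
>>         [ "$a" = "$b" ] && continue
>>         echo "== $a,$b =="
>>         # any non-zero exit (including timeout's 124) flags the pair
>>         timeout 30 mpirun -np 2 --host "$a,$b" -env MV2_USE_CUDA 0 \
>>           ./cpi-mvapich2 || echo "FAILED/HUNG: $a,$b"
>>       done
>>     done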
>>
>>
>>     microway at master:~$ mpirun -np 2 --host master,node2 -env
>>     MV2_USE_CUDA 0 ./cpi-mvapich2
>>     NVIDIA: no NVIDIA devices found
>>     Process 0 of 2 on master
>>     Process 1 of 2 on node2
>>     pi is approximately 3.1415926544231318, Error is 0.0000000008333387
>>     wall clock time = 1.092004
>>     *******WORKED*******
>>
>>     microway at master:~$ mpirun -np 2 --host master,node3 -env
>>     MV2_USE_CUDA 0 ./cpi-mvapich2
>>     NVIDIA: no NVIDIA devices found
>>     Process 0 of 2 on master
>>     Process 1 of 2 on node3
>>     pi is approximately 3.1415926544231318, Error is 0.0000000008333387
>>     wall clock time = 0.820147
>>     *******WORKED*******
>>
>>     microway at master:~$ mpirun -np 2 --host node2,node2 -env
>>     MV2_USE_CUDA 0 ./cpi-mvapich2
>>     Process 0 of 2 on node2
>>     Process 1 of 2 on node2
>>     pi is approximately 3.1415926544231318, Error is 0.0000000008333387
>>     wall clock time = 0.005124
>>     *******WORKED*******
>>
>>     microway at master:~$ mpirun -np 2 --host node2,node3 -env
>>     MV2_USE_CUDA 0 ./cpi-mvapich2
>>     *******HANGS HERE - NEVER RETURNS UNTIL CTRL-C*******
>>
>>
>>     I'm setting the MV2_USE_CUDA environment variable to 0 because
>>     the master does not have any CUDA devices.
>>
>>     However, mpirun_rsh works:
>>     microway at master:~$ mpirun_rsh -np 2 node2 node3 MV2_USE_CUDA=0
>>     ./cpi-mvapich2
>>     Process 0 of 2 on node2
>>     Process 1 of 2 on node3
>>     pi is approximately 3.1415926544231318, Error is 0.0000000008333387
>>     wall clock time = 0.128403
>>
>>
>>
>>     This isn't making sense to me.  The debugging I've done so far
>>     with strace and gdb has revealed that rank 0 is waiting around
>>     line 1630 of src/mpid/ch3/channels/mrail/src/rdma/ch3_smp_progress.c
>>     in the function MPIDI_CH3I_CM_SHMEM_Sync.  Here is a backtrace I
>>     created by sending a SIGSEGV to the process (a sketch of attaching
>>     gdb directly follows the run output below):
>>     microway at master:~$ mpirun -np 2 --host node2,node3 -env
>>     MV2_USE_CUDA 0 ./cpi-mvapich2
>>     [node2:9777 :0] Caught signal 11 (Segmentation fault)
>>     ==== backtrace ====
>>         0  /opt/mellanox/mxm/lib/libmxm.so.2(+0x3c69c) [0x7fab0802f69c]
>>         1  /lib/x86_64-linux-gnu/libc.so.6(+0x354b0) [0x7fab0ad944b0]
>>         2
>>     /usr/local/mpi/gcc/mvapich2-2.2/lib64/libmpi.so.12(MPIDI_CH3I_CM_SHMEM_Sync+0x86)
>>     [0x7fab0b5c6e7b]
>>         3
>>     /usr/local/mpi/gcc/mvapich2-2.2/lib64/libmpi.so.12(MPIDI_CH3I_CM_Create_region+0x280)
>>     [0x7fab0b5c73ff]
>>         4
>>     /usr/local/mpi/gcc/mvapich2-2.2/lib64/libmpi.so.12(MPIDI_CH3I_MRAIL_CM_Alloc+0x2c)
>>     [0x7fab0b5e3883]
>>         5
>>     /usr/local/mpi/gcc/mvapich2-2.2/lib64/libmpi.so.12(MPIDI_CH3_Init+0x638)
>>     [0x7fab0b5b2c3d]
>>         6
>>     /usr/local/mpi/gcc/mvapich2-2.2/lib64/libmpi.so.12(MPID_Init+0x323)
>>     [0x7fab0b59abf0]
>>         7
>>     /usr/local/mpi/gcc/mvapich2-2.2/lib64/libmpi.so.12(MPIR_Init_thread+0x411)
>>     [0x7fab0b48fb01]
>>         8
>>     /usr/local/mpi/gcc/mvapich2-2.2/lib64/libmpi.so.12(MPI_Init+0x19a)
>>     [0x7fab0b48ea49]
>>         9 ./cpi-mvapich2() [0x400aed]
>>        10 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)
>>     [0x7fab0ad7f830]
>>        11 ./cpi-mvapich2() [0x400989]
>>     ===================
>>
>>     ===================================================================================
>>     =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>     =   PID 9777 RUNNING AT node2
>>     =   EXIT CODE: 139
>>     =   CLEANING UP REMAINING PROCESSES
>>     =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>>     ===================================================================================
>>     [proxy:0:1 at node3] HYD_pmcd_pmip_control_cmd_cb
>>     (pm/pmiserv/pmip_cb.c:909): assert (!closed) failed
>>     [proxy:0:1 at node3] HYDT_dmxu_poll_wait_for_event
>>     (tools/demux/demux_poll.c:76): callback returned error status
>>     [proxy:0:1 at node3] main (pm/pmiserv/pmip.c:206): demux engine
>>     error waiting for event
>>     YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation
>>     fault (signal 11)
>>     This typically refers to a problem with your application.
>>     Please see the FAQ page for debugging suggestions
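>>
>>     (A minimal sketch of inspecting a hung rank with gdb instead of
>>     killing it; this assumes the stuck process can be found by name
>>     with pgrep on the compute node:)
>>
>>     # on node2: attach to the hung rank and dump all thread backtraces
>>     pid=$(pgrep -f cpi-mvapich2 | head -n1)
>>     gdb -p "$pid" -batch -ex "thread apply all bt"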
>>
>>     Here is mpiexec -info:
>>     microway at master:~$ mpiexec -info
>>     HYDRA build details:
>>         Version:    3.1.4
>>         Release Date:     Wed Sep  7 14:33:43 EDT 2016
>>         CC:         gcc
>>         CXX:          g++
>>         F77:  gfortran
>>         F90:  gfortran
>>         Configure options: '--disable-option-checking'
>>     '--prefix=/usr/local/mpi/gcc/mvapich2-2.2' '--localstatedir=/var'
>>     '--disable-static' '--enable-shared'
>>     '--with-mxm=/opt/mellanox/mxm' '--with-hcoll=/opt/mellanox/hcoll'
>>     '--with-knem=/opt/knem-1.1.2.90mlnx1' '--without-slurm'
>>     '--disable-mcast' '--without-cma' '--without-hydra-ckpointlib'
>>     '--enable-g=dbg' '--enable-cuda' '--with-cuda=/usr/local/cuda'
>>     '--enable-fast=ndebug' '--cache-file=/dev/null' '--srcdir=.'
>>     'CC=gcc' 'CFLAGS= -DNDEBUG -DNVALGRIND -g'
>>     'LDFLAGS=-L/usr/local/cuda/lib64 -L/usr/local/cuda/lib -L/lib
>>     -L/lib -L/opt/mellanox/hcoll/lib64 -L/opt/mellanox/hcoll/lib
>>     -L/lib -Wl,-rpath,/lib -L/lib -Wl,-rpath,/lib -L/lib -L/lib'
>>     'LIBS=-lcudart -lcuda -lrdmacm -libumad -libverbs -ldl -lrt -lm
>>     -lpthread ' 'CPPFLAGS=-I/usr/local/cuda/include
>>     -I/opt/mellanox/hcoll/include
>>     -I/mcms/build/mvapich/source/mvapich2-2.2/src/mpl/include
>>     -I/mcms/build/mvapich/source/mvapich2-2.2/src/mpl/include
>>     -I/mcms/build/mvapich/source/mvapich2-2.2/src/openpa/src
>>     -I/mcms/build/mvapich/source/mvapich2-2.2/src/openpa/src
>>     -D_REENTRANT
>>     -I/mcms/build/mvapich/source/mvapich2-2.2/src/mpi/romio/include
>>     -I/include -I/include -I/include -I/include'
>>         Process Manager:      pmi
>>         Launchers available:    ssh rsh fork slurm ll lsf sge manual
>>     persist
>>         Topology libraries available:         hwloc
>>         Resource management kernels available:  user slurm ll lsf sge
>>     pbs cobalt
>>     Checkpointing libraries available:
>>         Demux engines available:  poll select
>>
>>
>>
>>     If any other info is needed, please let me know.
>>
>>     Thanks,
>>     Rick
>>
>>
>>
>>
>>
>
>
>


