[mvapich-discuss] mpirun hang on simple cpi job with mvapich 2.2 and Ubuntu 16.04 with MOFED 4.0
Rick Warner
rick at microway.com
Wed May 24 10:51:14 EDT 2017
Thanks for the response, Hari. I appreciate the help.
I extracted the 2.2 tarball and ran configure with the options shown in the
mpiname -a output below (a reconstructed configure line follows that output):
root at master:/mcms/build/mvapich/source/mvapich2-2.2# mpiname -a
MVAPICH2 2.2 Thu Sep 08 22:00:00 EST 2016 ch3:mrail
Compilation
CC: gcc -DNDEBUG -DNVALGRIND -g
CXX: g++ -DNDEBUG -DNVALGRIND -g
F77: gfortran -L/lib -L/lib -g
FC: gfortran -g
Configuration
--prefix=/usr/local/mpi/gcc/mvapich2-2.2 --localstatedir=/var
--disable-static --enable-shared --with-mxm=/opt/mellanox/mxm
--with-hcoll=/opt/mellanox/hcoll --with-knem=/opt/knem-1.1.2.90mlnx1
--without-slurm --disable-mcast --without-cma --without-hydra-ckpointlib
--enable-g=dbg --enable-cuda --with-cuda=/usr/local/cuda
--enable-fast=ndebug
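For reference, a configure invocation that should reproduce this build, reconstructed
from the options listed above and followed by the usual make/install steps, would look
roughly like:

  ./configure --prefix=/usr/local/mpi/gcc/mvapich2-2.2 --localstatedir=/var \
      --disable-static --enable-shared --with-mxm=/opt/mellanox/mxm \
      --with-hcoll=/opt/mellanox/hcoll --with-knem=/opt/knem-1.1.2.90mlnx1 \
      --without-slurm --disable-mcast --without-cma --without-hydra-ckpointlib \
      --enable-g=dbg --enable-cuda --with-cuda=/usr/local/cuda --enable-fast=ndebug
  make -j && make install   # standard MPICH-style build and install steps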
ibstat from every system:
master: CA 'mlx4_0'
master: CA type: MT4103
master: Number of ports: 2
master: Firmware version: 2.40.7000
master: Hardware version: 0
master: Node GUID: 0x248a070300e883f0
master: System image GUID: 0x248a070300e883f0
master: Port 1:
master: State: Down
master: Physical state: Disabled
master: Rate: 10
master: Base lid: 0
master: LMC: 0
master: SM lid: 0
master: Capability mask: 0x04010000
master: Port GUID: 0x268a07fffee883f0
master: Link layer: Ethernet
master: Port 2:
master: State: Down
master: Physical state: Disabled
master: Rate: 10
master: Base lid: 0
master: LMC: 0
master: SM lid: 0
master: Capability mask: 0x04010000
master: Port GUID: 0x268a07fffee883f1
master: Link layer: Ethernet
master: CA 'mlx5_0'
master: CA type: MT4115
master: Number of ports: 1
master: Firmware version: 12.18.2000
master: Hardware version: 0
master: Node GUID: 0x248a070300a2eff0
master: System image GUID: 0x248a070300a2eff0
master: Port 1:
master: State: Active
master: Physical state: LinkUp
master: Rate: 100
master: Base lid: 1
master: LMC: 0
master: SM lid: 1
master: Capability mask: 0x2651e84a
master: Port GUID: 0x248a070300a2eff0
master: Link layer: InfiniBand
node2 : CA 'mlx5_0'
node2 : CA type: MT4115
node2 : Number of ports: 1
node2 : Firmware version: 12.18.2000
node2 : Hardware version: 0
node2 : Node GUID: 0x248a070300a2f0f0
node2 : System image GUID: 0x248a070300a2f0f0
node2 : Port 1:
node2 : State: Active
node2 : Physical state: LinkUp
node2 : Rate: 100
node2 : Base lid: 5
node2 : LMC: 0
node2 : SM lid: 1
node2 : Capability mask: 0x2651e848
node2 : Port GUID: 0x248a070300a2f0f0
node2 : Link layer: InfiniBand
node3 : CA 'mlx5_0'
node3 : CA type: MT4115
node3 : Number of ports: 1
node3 : Firmware version: 12.18.2000
node3 : Hardware version: 0
node3 : Node GUID: 0x248a070300a09ad0
node3 : System image GUID: 0x248a070300a09ad0
node3 : Port 1:
node3 : State: Active
node3 : Physical state: LinkUp
node3 : Rate: 100
node3 : Base lid: 3
node3 : LMC: 0
node3 : SM lid: 1
node3 : Capability mask: 0x2651e848
node3 : Port GUID: 0x248a070300a09ad0
node3 : Link layer: InfiniBand
node4 : CA 'mlx5_0'
node4 : CA type: MT4115
node4 : Number of ports: 1
node4 : Firmware version: 12.18.2000
node4 : Hardware version: 0
node4 : Node GUID: 0x248a070300a2efc8
node4 : System image GUID: 0x248a070300a2efc8
node4 : Port 1:
node4 : State: Active
node4 : Physical state: LinkUp
node4 : Rate: 100
node4 : Base lid: 2
node4 : LMC: 0
node4 : SM lid: 1
node4 : Capability mask: 0x2651e848
node4 : Port GUID: 0x248a070300a2efc8
node4 : Link layer: InfiniBand
node5 : CA 'mlx5_0'
node5 : CA type: MT4115
node5 : Number of ports: 1
node5 : Firmware version: 12.18.2000
node5 : Hardware version: 0
node5 : Node GUID: 0x248a070300a2f0e4
node5 : System image GUID: 0x248a070300a2f0e4
node5 : Port 1:
node5 : State: Active
node5 : Physical state: LinkUp
node5 : Rate: 100
node5 : Base lid: 6
node5 : LMC: 0
node5 : SM lid: 1
node5 : Capability mask: 0x2651e848
node5 : Port GUID: 0x248a070300a2f0e4
node5 : Link layer: InfiniBand
ibv_devinfo from all nodes (FYI: the master also has a dual-port Mellanox
Ethernet card, currently unused):
root at master:/mcms/build/mvapich/source/mvapich2-2.2# scom -a ibv_devinfo
master: hca_id: mlx5_0
master: transport: InfiniBand (0)
master: fw_ver: 12.18.2000
master: node_guid: 248a:0703:00a2:eff0
master: sys_image_guid: 248a:0703:00a2:eff0
master: vendor_id: 0x02c9
master: vendor_part_id: 4115
master: hw_ver: 0x0
master: board_id: MT_2180110032
master: phys_port_cnt: 1
master: Device ports:
master: port: 1
master: state: PORT_ACTIVE (4)
master: max_mtu: 4096 (5)
master: active_mtu: 4096 (5)
master: sm_lid: 1
master: port_lid: 1
master: port_lmc: 0x00
master: link_layer: InfiniBand
master:
master: hca_id: mlx4_0
master: transport: InfiniBand (0)
master: fw_ver: 2.40.7000
master: node_guid: 248a:0703:00e8:83f0
master: sys_image_guid: 248a:0703:00e8:83f0
master: vendor_id: 0x02c9
master: vendor_part_id: 4103
master: hw_ver: 0x0
master: board_id: MT_1200111023
master: phys_port_cnt: 2
master: Device ports:
master: port: 1
master: state: PORT_DOWN (1)
master: max_mtu: 4096 (5)
master: active_mtu: 1024 (3)
master: sm_lid: 0
master: port_lid: 0
master: port_lmc: 0x00
master: link_layer: Ethernet
master:
master: port: 2
master: state: PORT_DOWN (1)
master: max_mtu: 4096 (5)
master: active_mtu: 1024 (3)
master: sm_lid: 0
master: port_lid: 0
master: port_lmc: 0x00
master: link_layer: Ethernet
master:
node2 : hca_id: mlx5_0
node2 : transport: InfiniBand (0)
node2 : fw_ver: 12.18.2000
node2 : node_guid: 248a:0703:00a2:f0f0
node2 : sys_image_guid: 248a:0703:00a2:f0f0
node2 : vendor_id: 0x02c9
node2 : vendor_part_id: 4115
node2 : hw_ver: 0x0
node2 : board_id: MT_2180110032
node2 : phys_port_cnt: 1
node2 : Device ports:
node2 : port: 1
node2 : state: PORT_ACTIVE (4)
node2 : max_mtu: 4096 (5)
node2 : active_mtu: 4096 (5)
node2 : sm_lid: 1
node2 : port_lid: 5
node2 : port_lmc: 0x00
node2 : link_layer: InfiniBand
node2 :
node3 : hca_id: mlx5_0
node3 : transport: InfiniBand (0)
node3 : fw_ver: 12.18.2000
node3 : node_guid: 248a:0703:00a0:9ad0
node3 : sys_image_guid: 248a:0703:00a0:9ad0
node3 : vendor_id: 0x02c9
node3 : vendor_part_id: 4115
node3 : hw_ver: 0x0
node3 : board_id: MT_2180110032
node3 : phys_port_cnt: 1
node3 : Device ports:
node3 : port: 1
node3 : state: PORT_ACTIVE (4)
node3 : max_mtu: 4096 (5)
node3 : active_mtu: 4096 (5)
node3 : sm_lid: 1
node3 : port_lid: 3
node3 : port_lmc: 0x00
node3 : link_layer: InfiniBand
node3 :
node4 : hca_id: mlx5_0
node4 : transport: InfiniBand (0)
node4 : fw_ver: 12.18.2000
node4 : node_guid: 248a:0703:00a2:efc8
node4 : sys_image_guid: 248a:0703:00a2:efc8
node4 : vendor_id: 0x02c9
node4 : vendor_part_id: 4115
node4 : hw_ver: 0x0
node4 : board_id: MT_2180110032
node4 : phys_port_cnt: 1
node4 : Device ports:
node4 : port: 1
node4 : state: PORT_ACTIVE (4)
node4 : max_mtu: 4096 (5)
node4 : active_mtu: 4096 (5)
node4 : sm_lid: 1
node4 : port_lid: 2
node4 : port_lmc: 0x00
node4 : link_layer: InfiniBand
node4 :
node5 : hca_id: mlx5_0
node5 : transport: InfiniBand (0)
node5 : fw_ver: 12.18.2000
node5 : node_guid: 248a:0703:00a2:f0e4
node5 : sys_image_guid: 248a:0703:00a2:f0e4
node5 : vendor_id: 0x02c9
node5 : vendor_part_id: 4115
node5 : hw_ver: 0x0
node5 : board_id: MT_2180110032
node5 : phys_port_cnt: 1
node5 : Device ports:
node5 : port: 1
node5 : state: PORT_ACTIVE (4)
node5 : max_mtu: 4096 (5)
node5 : active_mtu: 4096 (5)
node5 : sm_lid: 1
node5 : port_lid: 6
node5 : port_lmc: 0x00
node5 : link_layer: InfiniBand
Thanks!
Rick
On 05/24/2017 08:47 AM, Hari Subramoni wrote:
> Hi Rick,
>
> Sorry to hear that you are facing issues. Although we have not tested
> with GeForce cards internally, we believe that it will work.
>
> We're taking a look at the hang issue. Could you please let us know
> how you built mvapich2? The output of mpiname -a will help. Could you
> please send us the output of ibstat and ibv_devinfo from the nodes?
>
> Thx,
> Hari.
>
> On May 23, 2017 4:26 PM, "Rick Warner" <rick at microway.com> wrote:
>
> Hi all,
>
> I'm seeing some strange behavior with MVAPICH2 2.2 on a small
> Ubuntu 16.04 cluster. The cluster has a ConnectX-4 EDR IB HCA in
> every node, and the compute nodes have nine GeForce 1080s each.
> The nodes are named master and node2 through node5.
>
> I installed MOFED 4.0 on the cluster to begin with; the OpenMPI
> that comes with it works fine. CUDA 8 is also installed.
>
> I first installed mvapich2-gdr, but when I tried running an
> example job (the basic cpi test) it hung. I then did some reading
> suggesting mvapich2-gdr is only for Tesla/Quadro and not for
> GeForce, so I removed mvapich2-gdr and built regular mvapich2 from
> source instead. Is that true? Should I be using the gdr build
> with GeForce cards?
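> 
> (For reference, the cpi test here is just the standard pi example that
> ships in the MPICH/MVAPICH2 source tree, examples/cpi.c if I recall the
> path correctly, built with something like:
>     mpicc -o cpi-mvapich2 cpi.c
> The cpi-mvapich2 name is my own.)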
>
> With the copy I built from source, I reproduced the same hang
> running a basic 2-process job across 2 of the compute nodes. However,
> I found that if I use the master as 1 of the 2 systems, the job
> works fine (I hadn't tried this with gdr before removing it; it might
> have been the same there). It only fails if I use 2 (or more)
> different compute nodes together. It also works if I send both
> processes to the same node.
>
>
> microway at master:~$ mpirun -np 2 --host master,node2 -env MV2_USE_CUDA 0 ./cpi-mvapich2
> NVIDIA: no NVIDIA devices found
> Process 0 of 2 on master
> Process 1 of 2 on node2
> pi is approximately 3.1415926544231318, Error is 0.0000000008333387
> wall clock time = 1.092004
> *******WORKED*******
>
> microway at master:~$ mpirun -np 2 --host master,node3 -env MV2_USE_CUDA 0 ./cpi-mvapich2
> NVIDIA: no NVIDIA devices found
> Process 0 of 2 on master
> Process 1 of 2 on node3
> pi is approximately 3.1415926544231318, Error is 0.0000000008333387
> wall clock time = 0.820147
> *******WORKED*******
>
> microway at master:~$ mpirun -np 2 --host node2,node2 -env MV2_USE_CUDA 0 ./cpi-mvapich2
> Process 0 of 2 on node2
> Process 1 of 2 on node2
> pi is approximately 3.1415926544231318, Error is 0.0000000008333387
> wall clock time = 0.005124
> *******WORKED*******
>
> microway at master:~$ mpirun -np 2 --host node2,node3 -env MV2_USE_CUDA 0 ./cpi-mvapich2
> *******HANGS HERE - NEVER RETURNS UNTIL CTRL-C*******
>
>
> I'm setting the MV2_USE_CUDA environment variable to 0 because the
> master does not have any CUDA devices.
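> 
> (For what it's worth, the equivalent global form with hydra should be
> something like
>     mpirun -np 2 --host node2,node3 -genv MV2_USE_CUDA 0 ./cpi-mvapich2
> since with a single executable, -env and -genv ought to behave the same.)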
>
> However, mpirun_rsh works:
> microway at master:~$ mpirun_rsh -np 2 node2 node3 MV2_USE_CUDA=0 ./cpi-mvapich2
> Process 0 of 2 on node2
> Process 1 of 2 on node3
> pi is approximately 3.1415926544231318, Error is 0.0000000008333387
> wall clock time = 0.128403
>
>
>
> This isn't making sense to me. The debugging I've done so far
> with strace and gdb shows that rank 0 is waiting around line
> 1630 of src/mpid/ch3/channels/mrail/src/rdma/ch3_smp_progress.c, in
> the function MPIDI_CH3I_CM_SHMEM_Sync. Here is a backtrace I
> created by sending a SIGSEGV to the process (a rough sketch of the
> commands I used appears after the output below):
> microway at master:~$ mpirun -np 2 --host node2,node3 -env MV2_USE_CUDA 0 ./cpi-mvapich2
> [node2:9777 :0] Caught signal 11 (Segmentation fault)
> ==== backtrace ====
> 0 /opt/mellanox/mxm/lib/libmxm.so.2(+0x3c69c) [0x7fab0802f69c]
> 1 /lib/x86_64-linux-gnu/libc.so.6(+0x354b0) [0x7fab0ad944b0]
> 2 /usr/local/mpi/gcc/mvapich2-2.2/lib64/libmpi.so.12(MPIDI_CH3I_CM_SHMEM_Sync+0x86) [0x7fab0b5c6e7b]
> 3 /usr/local/mpi/gcc/mvapich2-2.2/lib64/libmpi.so.12(MPIDI_CH3I_CM_Create_region+0x280) [0x7fab0b5c73ff]
> 4 /usr/local/mpi/gcc/mvapich2-2.2/lib64/libmpi.so.12(MPIDI_CH3I_MRAIL_CM_Alloc+0x2c) [0x7fab0b5e3883]
> 5 /usr/local/mpi/gcc/mvapich2-2.2/lib64/libmpi.so.12(MPIDI_CH3_Init+0x638) [0x7fab0b5b2c3d]
> 6 /usr/local/mpi/gcc/mvapich2-2.2/lib64/libmpi.so.12(MPID_Init+0x323) [0x7fab0b59abf0]
> 7 /usr/local/mpi/gcc/mvapich2-2.2/lib64/libmpi.so.12(MPIR_Init_thread+0x411) [0x7fab0b48fb01]
> 8 /usr/local/mpi/gcc/mvapich2-2.2/lib64/libmpi.so.12(MPI_Init+0x19a) [0x7fab0b48ea49]
> 9 ./cpi-mvapich2() [0x400aed]
> 10 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7fab0ad7f830]
> 11 ./cpi-mvapich2() [0x400989]
> ===================
>
> ===================================================================================
> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> = PID 9777 RUNNING AT node2
> = EXIT CODE: 139
> = CLEANING UP REMAINING PROCESSES
> = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
> ===================================================================================
> [proxy:0:1 at node3] HYD_pmcd_pmip_control_cmd_cb (pm/pmiserv/pmip_cb.c:909): assert (!closed) failed
> [proxy:0:1 at node3] HYDT_dmxu_poll_wait_for_event (tools/demux/demux_poll.c:76): callback returned error status
> [proxy:0:1 at node3] main (pm/pmiserv/pmip.c:206): demux engine error waiting for event
> YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
> This typically refers to a problem with your application.
> Please see the FAQ page for debugging suggestions
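> 
> (For reference, a rough sketch of how I poked at the hung rank, using the
> PID 9777 from the run above; attaching gdb gives a clean backtrace, and
> sending SIGSEGV makes what looks like MXM's signal handler print the
> backtrace shown above:
>     ssh node2 "gdb -p 9777 -batch -ex 'thread apply all bt'"
>     ssh node2 kill -SEGV 9777
> The exact invocations are approximate.)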
>
> Here is mpiexec -info:
> microway at master:~$ mpiexec -info
> HYDRA build details:
> Version: 3.1.4
> Release Date: Wed Sep 7 14:33:43 EDT 2016
> CC: gcc
> CXX: g++
> F77: gfortran
> F90: gfortran
> Configure options: '--disable-option-checking'
> '--prefix=/usr/local/mpi/gcc/mvapich2-2.2' '--localstatedir=/var'
> '--disable-static' '--enable-shared'
> '--with-mxm=/opt/mellanox/mxm' '--with-hcoll=/opt/mellanox/hcoll'
> '--with-knem=/opt/knem-1.1.2.90mlnx1' '--without-slurm'
> '--disable-mcast' '--without-cma' '--without-hydra-ckpointlib'
> '--enable-g=dbg' '--enable-cuda' '--with-cuda=/usr/local/cuda'
> '--enable-fast=ndebug' '--cache-file=/dev/null' '--srcdir=.'
> 'CC=gcc' 'CFLAGS= -DNDEBUG -DNVALGRIND -g'
> 'LDFLAGS=-L/usr/local/cuda/lib64 -L/usr/local/cuda/lib -L/lib
> -L/lib -L/opt/mellanox/hcoll/lib64 -L/opt/mellanox/hcoll/lib
> -L/lib -Wl,-rpath,/lib -L/lib -Wl,-rpath,/lib -L/lib -L/lib'
> 'LIBS=-lcudart -lcuda -lrdmacm -libumad -libverbs -ldl -lrt -lm
> -lpthread ' 'CPPFLAGS=-I/usr/local/cuda/include
> -I/opt/mellanox/hcoll/include
> -I/mcms/build/mvapich/source/mvapich2-2.2/src/mpl/include
> -I/mcms/build/mvapich/source/mvapich2-2.2/src/mpl/include
> -I/mcms/build/mvapich/source/mvapich2-2.2/src/openpa/src
> -I/mcms/build/mvapich/source/mvapich2-2.2/src/openpa/src
> -D_REENTRANT
> -I/mcms/build/mvapich/source/mvapich2-2.2/src/mpi/romio/include
> -I/include -I/include -I/include -I/include'
> Process Manager: pmi
> Launchers available: ssh rsh fork slurm ll lsf sge manual persist
> Topology libraries available: hwloc
> Resource management kernels available: user slurm ll lsf sge pbs cobalt
> Checkpointing libraries available:
> Demux engines available: poll select
>
>
>
> If there is any other needed info please let me know.
>
> Thanks,
> Rick
>
>
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mailman.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>
>