[mvapich-discuss] recommended settings for heterogeneous multi-rail support?

Rick Warner rick at microway.com
Thu Mar 16 17:31:20 EDT 2017


Hi All,

I'm working with a cluster that has one ConnectX-3 HCA in 9 out of 10 
compute nodes; the 10th node has 2 HCAs installed (plus GPUs).

What is the recommended way of making use of both HCAs on that one node?  
If I run an MPI job without specifying anything regarding the HCAs, it 
fails like this:

[microway at athena-int ~]$ mpirun -np 10 --machinefile /etc/nodes 
./cpi-mvapich
Process 7 of 10 on athena-7
Process 1 of 10 on athena-1
Process 0 of 10 on athena-int
Process 4 of 10 on athena-4
Process 8 of 10 on athena-8
Process 6 of 10 on athena-6
Process 5 of 10 on athena-5
Process 3 of 10 on athena-3
Process 2 of 10 on athena-2
Process 9 of 10 on athena-gpu-1
[athena-gpu-1:mpi_rank_9][error_sighandler] Caught error: Segmentation 
fault (signal 11)

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   PID 7621 RUNNING AT athena-gpu-1
=   EXIT CODE: 139
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
[proxy:0:0 at athena-int] HYD_pmcd_pmip_control_cmd_cb 
(pm/pmiserv/pmip_cb.c:909): assert (!closed) failed
[proxy:0:0 at athena-int] HYDT_dmxu_poll_wait_for_event 
(tools/demux/demux_poll.c:76): callback returned error status
[proxy:0:0 at athena-int] main (pm/pmiserv/pmip.c:206): demux engine error 
waiting for event
[proxy:0:5 at athena-5] HYD_pmcd_pmip_control_cmd_cb 
(pm/pmiserv/pmip_cb.c:909): assert (!closed) failed
[proxy:0:5 at athena-5] HYDT_dmxu_poll_wait_for_event 
(tools/demux/demux_poll.c:76): callback returned error status
[proxy:0:5 at athena-5] main (pm/pmiserv/pmip.c:206): demux engine error 
waiting for event
[proxy:0:1 at athena-1] HYD_pmcd_pmip_control_cmd_cb 
(pm/pmiserv/pmip_cb.c:909): assert (!closed) failed
[proxy:0:1 at athena-1] HYDT_dmxu_poll_wait_for_event 
(tools/demux/demux_poll.c:76): callback returned error status
[proxy:0:1 at athena-1] main (pm/pmiserv/pmip.c:206): demux engine error 
waiting for event
[proxy:0:2 at athena-2] HYD_pmcd_pmip_control_cmd_cb 
(pm/pmiserv/pmip_cb.c:909): assert (!closed) failed
[proxy:0:2 at athena-2] HYDT_dmxu_poll_wait_for_event 
(tools/demux/demux_poll.c:76): callback returned error status
[proxy:0:2 at athena-2] main (pm/pmiserv/pmip.c:206): demux engine error 
waiting for event
[proxy:0:3 at athena-3] HYD_pmcd_pmip_control_cmd_cb 
(pm/pmiserv/pmip_cb.c:909): assert (!closed) failed
[proxy:0:3 at athena-3] HYDT_dmxu_poll_wait_for_event 
(tools/demux/demux_poll.c:76): callback returned error status
[proxy:0:3 at athena-3] main (pm/pmiserv/pmip.c:206): demux engine error 
waiting for event
[proxy:0:4 at athena-4] HYD_pmcd_pmip_control_cmd_cb 
(pm/pmiserv/pmip_cb.c:909): assert (!closed) failed
[proxy:0:4 at athena-4] HYDT_dmxu_poll_wait_for_event 
(tools/demux/demux_poll.c:76): callback returned error status
[proxy:0:4 at athena-4] main (pm/pmiserv/pmip.c:206): demux engine error 
waiting for event
[proxy:0:6 at athena-6] HYD_pmcd_pmip_control_cmd_cb 
(pm/pmiserv/pmip_cb.c:909): assert (!closed) failed
[proxy:0:6 at athena-6] HYDT_dmxu_poll_wait_for_event 
(tools/demux/demux_poll.c:76): callback returned error status
[proxy:0:6 at athena-6] main (pm/pmiserv/pmip.c:206): demux engine error 
waiting for event
[proxy:0:7 at athena-7] HYD_pmcd_pmip_control_cmd_cb 
(pm/pmiserv/pmip_cb.c:909): assert (!closed) failed
[proxy:0:7 at athena-7] HYDT_dmxu_poll_wait_for_event 
(tools/demux/demux_poll.c:76): callback returned error status
[proxy:0:7 at athena-7] main (pm/pmiserv/pmip.c:206): demux engine error 
waiting for event
[proxy:0:8 at athena-8] HYD_pmcd_pmip_control_cmd_cb 
(pm/pmiserv/pmip_cb.c:909): assert (!closed) failed
[proxy:0:8 at athena-8] HYDT_dmxu_poll_wait_for_event 
(tools/demux/demux_poll.c:76): callback returned error status
[proxy:0:8 at athena-8] main (pm/pmiserv/pmip.c:206): demux engine error 
waiting for event
[mpiexec at athena-int] HYDT_bscu_wait_for_completion 
(tools/bootstrap/utils/bscu_wait.c:76): one of the processes terminated 
badly; aborting
[mpiexec at athena-int] HYDT_bsci_wait_for_completion 
(tools/bootstrap/src/bsci_wait.c:23): launcher returned error waiting 
for completion
[mpiexec at athena-int] HYD_pmci_wait_for_completion 
(pm/pmiserv/pmiserv_pmci.c:218): launcher returned error waiting for 
completion
[mpiexec at athena-int] main (ui/mpich/mpiexec.c:344): process manager 
error waiting for completion


athena-int is the master node.  athena-1 through athena-8 are regular compute 
nodes, and athena-gpu-1 is the system with two IB cards since it has GPUs (one IB 
card per CPU for direct IB->GPU transfer support; we're planning on adding 
more GPU systems later to use GPU Direct).
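For reference, I'm assuming the second card on athena-gpu-1 shows up as 
mlx4_1 alongside mlx4_0 (I haven't pasted the actual output here); running 
something like the following on each node is how I'd confirm the verbs 
device names before passing them to MV2_IBA_HCA:

# list the verbs device names libibverbs sees on this node
ibv_devices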

If I force MVAPICH to use only the first HCA, it works fine:

[microway at athena-int ~]$ mpirun -genv MV2_IBA_HCA mlx4_0 -np 10 
--machinefile /etc/nodes ./cpi-mvapich
Process 6 of 10 on athena-6
Process 7 of 10 on athena-7
Process 1 of 10 on athena-1
Process 2 of 10 on athena-2
Process 4 of 10 on athena-4
Process 3 of 10 on athena-3
Process 5 of 10 on athena-5
Process 8 of 10 on athena-8
Process 0 of 10 on athena-int
Process 9 of 10 on athena-gpu-1
pi is approximately 3.1415926544231256, Error is 0.0000000008333325
wall clock time = 0.022811


I've played around with various MV2_ multi-rail settings (roughly the 
variations sketched below) but have not had any luck. What is the 
recommended way to configure and use a setup like this?
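
To give an idea of what I mean, the runs I've attempted have been variations 
along these lines (parameter names taken from the MVAPICH2 user guide's 
multi-rail section; the exact values here are just examples, not a 
known-good configuration):

# tell MVAPICH2 about both HCAs and let it share traffic across the rails
mpirun -np 10 --machinefile /etc/nodes \
    -genv MV2_IBA_HCA mlx4_0:mlx4_1 \
    -genv MV2_NUM_HCAS 2 \
    -genv MV2_RAIL_SHARING_POLICY ROUND_ROBIN \
    ./cpi-mvapich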

Thanks,
Rick

