[mvapich-discuss] Unable to run multiple coprocessors with MVAPICH2-MIC

Bryant Lam blam at hcs.ufl.edu
Fri May 22 00:16:50 EDT 2015


I'm experimenting with MVAPICH2-MIC on a server with multiple Intel Xeon 
Phi coprocessors in it. I can successfully execute single-device MPI 
runs (e.g., only on the localhost, mic0, mic1, etc.), but when I pair 
any two of them together, I run into startup issues:

 > ${MV2MIC_PATH}/intel64/bin/mpirun_rsh -config config -hostfile hosts  # From the host.

Max MV2_DEFAULT_MAX_SG_LIST is 0, set to 1
Max MV2_SRQ_SIZE is 0, set to 4096
[cli_0]: aborting job:
Fatal error in MPI_Init:
Other MPI error, error stack:
MPIR_Init_thread(483)....:
MPID_Init(363)...........: channel initialization failed
MPIDI_CH3_Init(438)......:
MPIDI_CH3I_RDMA_init(325):
rdma_iba_hca_init(879)...: Attributes failed sanity check

[servername:mpispawn_0][readline] Unexpected End-Of-File on file descriptor 5. MPI process died?
[servername:mpispawn_0][mtpmi_processops] Error while reading PMI socket. MPI process died?
[servername:mpispawn_0][child_handler] MPI process (rank: 0, pid: 15209) exited with status 1

Max MV2_DEFAULT_MAX_SG_LIST is 0, set to 1
Max MV2_SRQ_SIZE is 0, set to 4096
[cli_1]: aborting job:
Fatal error in MPI_Init:
Other MPI error, error stack:
MPIR_Init_thread(483)....:
MPID_Init(363)...........: channel initialization failed
MPIDI_CH3_Init(438)......:
MPIDI_CH3I_RDMA_init(325):
rdma_iba_hca_init(879)...: Attributes failed sanity check

[servername-mic0:mpispawn_1][readline] Unexpected End-Of-File on file descriptor 5. MPI process died?
[servername-mic0:mpispawn_1][mtpmi_processops] Error while reading PMI socket. MPI process died?
[servername-mic0:mpispawn_1][child_handler] MPI process (rank: 1, pid: 33510) exited with status 1

 > cat config
-n 1 : $PWD/exe.host
-n 1 : $PWD/exe

 > cat hosts
localhost:1
mic0:1

The README file included with MVAPICH2-MIC states that 
MV2_IBA_HCA=mlx4_0 needs to be set in the environment (i.e., export 
MV2_IBA_HCA=mlx4_0), but this server does not have an InfiniBand card. I 
intend to only connect via Intel SCIF.

1.  Does MVAPICH2-MIC work without an InfiniBand card if I intend to 
communicate only within a node (e.g., with export MV2_IBA_HCA=scif0)?

2.  Is my startup error "Attributes failed sanity check" related to #1?
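
For reference, here is a minimal sketch of how the device names relevant to 
MV2_IBA_HCA could be listed on the host. It rests on an assumption the README 
does not state: that the names MV2_IBA_HCA accepts are the ones the verbs 
stack exposes under /sys/class/infiniband. The helper name list_hca_devices 
is hypothetical and is not part of MVAPICH2-MIC.

```shell
# Hypothetical helper: print the name of every RDMA device the kernel
# exposes under a sysfs directory (default: /sys/class/infiniband).
# Assumption: these are the names that MV2_IBA_HCA expects.
list_hca_devices() {
    dir="${1:-/sys/class/infiniband}"
    # No verbs devices registered at all: print nothing.
    [ -d "$dir" ] || return 0
    for dev in "$dir"/*; do
        # Skip the unexpanded glob when the directory is empty.
        [ -e "$dev" ] || continue
        basename "$dev"
    done
}

list_hca_devices
```

If this prints nothing on the host, no verbs device is registered there at 
all, which may be relevant to question 2.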

Thanks,

Bryant

