[mvapich-discuss] Unable to run multiple coprocessors with MVAPICH2-MIC
Bryant Lam
blam at hcs.ufl.edu
Sun May 24 23:28:13 EDT 2015
Thanks for the heads up. I appreciate the response.
Bryant
On 05/24/2015 09:30 PM, khaled hamidouche wrote:
> Hi Bryant,
>
> MVAPICH2-MIC requires an InfiniBand (IB) HCA to be available, even for
> intranode jobs.
>
> Thanks
>
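The requirement above can be checked before launching. What follows is a minimal sketch, not part of MVAPICH2-MIC: it assumes a Linux system on which RDMA devices registered with the kernel appear as entries under /sys/class/infiniband, and the function name hca_present and its optional directory argument are hypothetical names chosen for illustration. It only reports whether a device with the given name is registered; it does not tell whether MVAPICH2-MIC will accept that device.

```shell
#!/bin/sh
# Sketch: succeed only if a device with the given name is registered
# under the sysfs InfiniBand class directory.
# Assumption: registered devices (e.g. mlx4_0) show up as entries in
# /sys/class/infiniband. hca_present is an illustrative helper name.
hca_present() {
    name="$1"
    ib_dir="${2:-/sys/class/infiniband}"

    # An empty device name can never match.
    [ -n "$name" ] || return 1

    # If the class directory is missing, no device is registered at all.
    [ -d "$ib_dir" ] || return 1

    # The device is present if an entry with that exact name exists.
    [ -e "$ib_dir/$name" ]
}
```

A possible use before calling mpirun_rsh, with mlx4_0 taken from the README setting quoted below:

    hca_present "${MV2_IBA_HCA:-mlx4_0}" || echo "HCA not found" >&2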
> On Fri, May 22, 2015 at 12:16 AM, Bryant Lam <blam at hcs.ufl.edu> wrote:
>
> I'm experimenting with MVAPICH2-MIC on a server with multiple Intel Xeon
> Phi coprocessors. I can successfully execute single-device MPI runs
> (e.g., only on localhost, mic0, or mic1), but when I pair any two of
> them together, I run into startup issues:
>
> > ${MV2MIC_PATH}/intel64/bin/mpirun_rsh -config config -hostfile hosts
> # From the host.
>
> Max MV2_DEFAULT_MAX_SG_LIST is 0, set to 1
> Max MV2_SRQ_SIZE is 0, set to 4096
> [cli_0]: aborting job:
> Fatal error in MPI_Init:
> Other MPI error, error stack:
> MPIR_Init_thread(483)....:
> MPID_Init(363)...........: channel initialization failed
> MPIDI_CH3_Init(438)......:
> MPIDI_CH3I_RDMA_init(325):
> rdma_iba_hca_init(879)...: Attributes failed sanity check
>
> [servername:mpispawn_0][readline] Unexpected End-Of-File on file descriptor 5. MPI process died?
> [servername:mpispawn_0][mtpmi_processops] Error while reading PMI socket. MPI process died?
> [servername:mpispawn_0][child_handler] MPI process (rank: 0, pid: 15209) exited with status 1
>
> Max MV2_DEFAULT_MAX_SG_LIST is 0, set to 1
> Max MV2_SRQ_SIZE is 0, set to 4096
> [cli_1]: aborting job:
> Fatal error in MPI_Init:
> Other MPI error, error stack:
> MPIR_Init_thread(483)....:
> MPID_Init(363)...........: channel initialization failed
> MPIDI_CH3_Init(438)......:
> MPIDI_CH3I_RDMA_init(325):
> rdma_iba_hca_init(879)...: Attributes failed sanity check
>
> [servername-mic0:mpispawn_1][readline] Unexpected End-Of-File on file descriptor 5. MPI process died?
> [servername-mic0:mpispawn_1][mtpmi_processops] Error while reading PMI socket. MPI process died?
> [servername-mic0:mpispawn_1][child_handler] MPI process (rank: 1, pid: 33510) exited with status 1
>
> > cat config
> -n 1 : $PWD/exe.host
> -n 1 : $PWD/exe
>
> > cat hosts
> localhost:1
> mic0:1
>
> The README file included with MVAPICH2-MIC states that
> MV2_IBA_HCA=mlx4_0 needs to be set in the environment (i.e., export
> MV2_IBA_HCA=mlx4_0), but this server does not have an InfiniBand card.
> I intend to connect only via Intel SCIF.
>
> 1. Does MVAPICH2-MIC work without an InfiniBand card if I intend to
> communicate only within a node (e.g., export MV2_IBA_HCA=scif0)?
>
> 2. Is my startup error "Attributes failed sanity check" related to #1?
>
> Thanks,
>
> Bryant