[mvapich-discuss] mvapich2 2.3a multiple HCAs issue

Hari Subramoni subramoni.1 at osu.edu
Thu Jun 29 07:51:50 EDT 2017


Hello,

With MVAPICH2 2.3a, this is the expected behavior. We're working on a patch
to improve this; it will be available with the upcoming release.
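
Until then, the single-HCA workaround from the report quoted below can be
reasoned about with a small sketch. This is an illustration only, not
MVAPICH2's actual selection code: the `Hca` class, the `port_mode` labels and
both helper functions are hypothetical names. Only the adapter list and the
MV2_NUM_HCAS / MV2_NUM_PORTS settings come from the report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hca:
    name: str        # device name as listed in the report
    ports: int       # number of ports on the adapter
    port_mode: str   # "ethernet", "disabled" or "ib" (labels assumed from the report)


# The three adapters described in the report.
REPORTED_HCAS = [
    Hca("mlx4_0", 2, "ethernet"),
    Hca("mlx5_0", 2, "disabled"),
    Hca("mlx5_1", 1, "ib"),
]


def usable_hcas(hcas):
    """Return only the adapters whose ports are set to IB."""
    return [h for h in hcas if h.port_mode == "ib"]


def workaround_env(hcas):
    """Build MV2_NUM_HCAS / MV2_NUM_PORTS values covering only usable adapters."""
    usable = usable_hcas(hcas)
    if not usable:
        raise RuntimeError("no adapter with IB ports found")
    return {
        "MV2_NUM_HCAS": str(len(usable)),
        # Use the smallest port count so every selected adapter can satisfy it.
        "MV2_NUM_PORTS": str(min(h.ports for h in usable)),
    }


if __name__ == "__main__":
    env = workaround_env(REPORTED_HCAS)
    # Print the settings in the "-env NAME VALUE" form used with mpiexec.
    print(" ".join(f"-env {name} {value}" for name, value in env.items()))
```

For the reported system only mlx5_1 passes the filter, so the sketch yields
MV2_NUM_HCAS 1 and MV2_NUM_PORTS 1, the same values used in the working
command in the report.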

Thx,
Hari.

On Jun 29, 2017 7:03 AM, "Jamil Appa" <jamil.appa at zenotech.com> wrote:

> Hi
>
>    I am using mvapich2 2.3a on a system with multiple HCAs, and it appears
> that the logic for choosing an HCA is failing.
>
> The system has the following HCAs
>
>   mlx4_0 - 2 ports set to ethernet
>   mlx5_0 - 2 ports disabled
>   mlx5_1 - 1 port set to IB
>
> mpiexec -env MV2_USE_THREAD_WARNING 0 -env MV2_SHOW_HCA_BINDING 0 -env
> MV2_CPU_BINDING_LEVEL numanode -env MV2_CPU_BINDING_POLICY scatter
>
>  This fails with the following message:
>
>   [cli_2]: aborting job:
> Fatal error in PMPI_Init_thread:
> Other MPI error, error stack:
> MPIR_Init_thread(490)............:
> MPID_Init(381)...................: channel initialization failed
> MPIDI_CH3_Init(320)..............: rdma_get_control_parameters
> rdma_get_control_parameters(1534):
> rdma_open_hca(701)...............: No active HCAs found on the system!!! 0
>
>  If I force the use of a single HCA and a single port, the job runs correctly:
>
>  mpiexec -env MV2_NUM_HCAS 1 -env MV2_NUM_PORTS 1 -env
> MV2_USE_THREAD_WARNING 0 -env MV2_SHOW_HCA_BINDING 0 -env
> MV2_CPU_BINDING_LEVEL numanode -env MV2_CPU_BINDING_POLICY scatter
>
>   Is this expected behaviour?
>
>  Thanks
>
>  Jamil
>
> *Jamil Appa* | Co-Founder and Director | Zenotech
> Tel: +44 (0)7747 606 788
> Email: jamil.appa at zenotech.com
> Web: www.zenotech.com <http://www.zenotech.com/>
> LinkedIn: <http://uk.linkedin.com/pub/jamil-appa/1/165/120>
> Twitter: <https://twitter.com/zenotech>
> Location: <https://www.google.co.uk/maps/place/Bristol+%26+Bath+Science+Park/@51.500921,-2.478567,17z/data=!3m1!4b1!4m2!3m1!1s0x48719ab86a5a9f7d:0xd17394f3400abb0a>
>
> Company Registration No : 07926926 | VAT No : 128198591
>
> Registered Office : 1 Larkfield Grove, Chepstow, Monmouthshire, NP16 5UF,
> UK
>
> Address : Bristol & Bath Science Park, Dirac Cres, Emersons Green, Bristol
> BS16 7FR
>
>