[mvapich-discuss] mvapich2 2.3a multiple HCAs issue

Jamil Appa jamil.appa at zenotech.com
Thu Jun 29 07:03:21 EDT 2017


Hi

I am using MVAPICH2 2.3a on a system with multiple HCAs, and it appears that
the logic for choosing an HCA is failing.

The system has the following HCAs:

  mlx4_0 - 2 ports set to Ethernet
  mlx5_0 - 2 ports disabled
  mlx5_1 - 1 port set to IB
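
For reference, an inventory like the one above can be cross-checked against what the verbs layer reports. The following is only a sketch: it assumes the `ibv_devinfo` tool from libibverbs is installed and prints its usual `hca_id:`, `state:` and `link_layer:` fields, and the function name `list_active_ib_hcas` is hypothetical. It says nothing about how MVAPICH2 itself selects a device.

```shell
# Sketch: print each HCA that has at least one port which is both
# PORT_ACTIVE and running the InfiniBand link layer.
# Assumes ibv_devinfo-style text on stdin; the function name is hypothetical.
list_active_ib_hcas() {
    awk '
        $1 == "hca_id:"     { hca = $2 }      # start of a new device block
        $1 == "state:"      { state = $2 }    # port state, e.g. PORT_ACTIVE
        $1 == "link_layer:" {                 # listed after state: for each port
            if (state == "PORT_ACTIVE" && $2 == "InfiniBand" && !seen[hca]++)
                print hca
        }
    '
}

# Typical use on a node with libibverbs utilities installed:
#   ibv_devinfo | list_active_ib_hcas
```

On the system described here, one would expect only mlx5_1 to be listed if the port configuration is as stated.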

mpiexec -env MV2_USE_THREAD_WARNING 0 -env MV2_SHOW_HCA_BINDING 0 -env
MV2_CPU_BINDING_LEVEL numanode -env MV2_CPU_BINDING_POLICY scatter

This fails with the following message:

  [cli_2]: aborting job:
Fatal error in PMPI_Init_thread:
Other MPI error, error stack:
MPIR_Init_thread(490)............:
MPID_Init(381)...................: channel initialization failed
MPIDI_CH3_Init(320)..............: rdma_get_control_parameters
rdma_get_control_parameters(1534):
rdma_open_hca(701)...............: No active HCAs found on the system!!! 0

If I force the use of a single HCA, the code works as expected:

 mpiexec -env MV2_NUM_HCAS 1 -env MV2_NUM_PORTS 1 -env
MV2_USE_THREAD_WARNING 0 -env MV2_SHOW_HCA_BINDING 0 -env
MV2_CPU_BINDING_LEVEL numanode -env MV2_CPU_BINDING_POLICY scatter

This works correctly.
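
A possible variation on this workaround, not tried in the report above, is to name the IB device explicitly through `MV2_IBA_HCA`, a runtime parameter described in the MVAPICH2 user guide. The sketch below only builds the argument string; the helper name `mv2_single_hca_args` and the `./app` placeholder are hypothetical, and whether this avoids the failure on 2.3a is an assumption.

```shell
# Sketch: build the -env arguments that restrict MVAPICH2 to one named HCA.
# mv2_single_hca_args is a hypothetical helper; it prints the arguments only
# and does not launch anything.
mv2_single_hca_args() {
    hca=$1
    printf '%s' "-env MV2_IBA_HCA $hca -env MV2_NUM_HCAS 1 -env MV2_NUM_PORTS 1"
}

# Example use (./app stands in for the real application):
#   mpiexec $(mv2_single_hca_args mlx5_1) -env MV2_CPU_BINDING_LEVEL numanode ./app
```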

  Is this expected behaviour?

 Thanks

 Jamil

*Jamil Appa* | Co-Founder and Director | Zenotech
Tel: +44 (0)7747 606 788
Email: jamil.appa at zenotech.com
Web: www.zenotech.com <http://www.zenotech.com/>
LinkedIn: <http://uk.linkedin.com/pub/jamil-appa/1/165/120>
Twitter: <https://twitter.com/zenotech>
Location: <https://www.google.co.uk/maps/place/Bristol+%26+Bath+Science+Park/@51.500921,-2.478567,17z/data=!3m1!4b1!4m2!3m1!1s0x48719ab86a5a9f7d:0xd17394f3400abb0a>

Company Registration No : 07926926 | VAT No : 128198591

Registered Office : 1 Larkfield Grove, Chepstow, Monmouthshire, NP16 5UF, UK

Address : Bristol & Bath Science Park, Dirac Cres, Emersons Green, Bristol
BS16 7FR