[mvapich-discuss] Options to let MPI use different HCAs or IB ports on different servers

Junjie Qian junjieqian at outlook.com
Sat Mar 4 21:00:03 EST 2017

Hi Hari,

Thank you for your explanations! It is very clear.

I have another question: could you also give me some suggestions on how to configure the MVAPICH2 options?

Consider the following scenario: there are two servers, each with a NUMA topology, and each NUMA node has one HCA. The MPI processes are scheduled on different NUMA nodes on the two servers.

The question is: is there a way to write the MPI launch command so that the MPI process on each server knows which HCA port to use? My current command is "mpirun_rsh <nps, hosts>  MV2_IBA_HCA=mlx5_0 MV2_IBA_HCA=mlx5_1 MV2_NUM_PORTS=1 MV2_DEFAULT_PORT=1 prog".

Is there a way to let MPI know which HCA on the NUMA node to use?

server 0:
            ________________            ________________
 mlx5_0    |  MPI Process   |          |                |    mlx5_1
    <------|                |          |                |------>
           |     NUMA 0     |          |     NUMA 1     |
           |________________|          |________________|

server 1:
            ________________            ________________
 mlx5_0    |                |          |  MPI Process   |    mlx5_1
    <------|                |          |                |------>
           |     NUMA 0     |          |     NUMA 1     |
           |________________|          |________________|
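
One possible way to handle the layout drawn above, offered as an untested sketch rather than a documented MVAPICH2 recipe, is to start each rank through a small wrapper script that picks the HCA per server. The script name select_hca.sh, the host names server0 and server1, and the host-to-HCA mapping are all placeholders. The sketch also assumes that MVAPICH2 reads MV2_IBA_HCA from each process's own environment at start-up, which should be checked against the user guide for the installed version.

```shell
#!/bin/sh
# select_hca.sh: hypothetical per-process wrapper, not part of MVAPICH2.
# Assumption: MVAPICH2 reads MV2_IBA_HCA from each process's own environment
# during start-up, so exporting it here before exec takes effect per server.

# Map a host name to the HCA next to the NUMA node where the rank runs.
# The host names and the mapping are placeholders for the two servers above.
hca_for_host() {
    case "$1" in
        server0) echo "mlx5_0" ;;   # rank runs on NUMA 0, next to mlx5_0
        server1) echo "mlx5_1" ;;   # rank runs on NUMA 1, next to mlx5_1
        *)       echo "" ;;         # unknown host: leave the choice to MVAPICH2
    esac
}

# Only act when a program to launch was given on the command line.
if [ "$#" -gt 0 ]; then
    hca=$(hca_for_host "$(hostname -s)")
    if [ -n "$hca" ]; then
        export MV2_IBA_HCA="$hca"
    fi
    exec "$@"
fi
```

It would be launched as something like "mpirun_rsh -np 2 server0 server1 ./select_hca.sh ./prog". Binding each rank to the intended NUMA node is a separate step that this wrapper does not perform.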

Thank you
Junjie Qian
From: hari.subramoni at gmail.com <hari.subramoni at gmail.com> on behalf of Hari Subramoni <subramoni.1 at osu.edu>
Sent: Saturday, March 4, 2017 6:56 AM
To: Junjie Qian
Cc: mvapich-discuss at cse.ohio-state.edu
Subject: Re: [mvapich-discuss] Options to let MPI use different HCAs or IB ports on different servers


I do not completely understand what you mean by "IB driver". If you mean that you have multiple HCAs and you want MVAPICH2 to utilize both of them, then you can use one of the following commands:

mpirun_rsh -np 2 n0 n1 MV2_IBA_HCA=mlx5_0:mlx5_1 prog

mpirun_rsh -np 2 n0 n1 MV2_NUM_HCAS=2 prog

The equivalent of the name "mlx5_0" on your system can be obtained from the "ibstat" command:

[subramon at haswell1 ~]$ ibstat
CA 'mlx5_0'
        CA type: MT4119
        Number of ports: 1
        Firmware version: 16.18.0160
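
As a complement to ibstat, the HCA names and the NUMA node each HCA is attached to can usually be read from sysfs on Linux. The sketch below assumes the common /sys/class/infiniband/<hca>/device/numa_node layout. SYSFS_ROOT is a hypothetical override that exists only so the functions can be tried against a fake directory tree. A value of -1 in numa_node generally means the kernel has no NUMA information for that device.

```shell
#!/bin/sh
# Sketch: list InfiniBand HCAs and the NUMA node each one is attached to.
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Print the NUMA node of one HCA (e.g. mlx5_0), or "unknown" if sysfs
# has no readable numa_node entry for it.
hca_numa_node() {
    node_file="$SYSFS_ROOT/class/infiniband/$1/device/numa_node"
    if [ -r "$node_file" ]; then
        cat "$node_file"
    else
        echo "unknown"
    fi
}

# Print one "<hca> <numa_node>" line per HCA found under sysfs.
list_hcas() {
    for dev in "$SYSFS_ROOT"/class/infiniband/*; do
        [ -e "$dev" ] || continue      # no HCAs: the glob stayed unexpanded
        name=$(basename "$dev")
        echo "$name $(hca_numa_node "$name")"
    done
}

list_hcas
```

On a host laid out like the servers in this thread, the output would be expected to pair mlx5_0 with node 0 and mlx5_1 with node 1, but that depends on the actual hardware.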

MVAPICH2 has several levels of support for hardware configurations that have more than one InfiniBand end point (referred to as 'multi-rail' configuration by MVAPICH2). For example, MVAPICH2 has support to use multiple HCAs, multiple ports per HCA and multiple InfiniBand queue pairs per port.
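
To make the three levels concrete, the sketch below only prints (it does not run) an mpirun_rsh command line that sets one parameter per level. MV2_NUM_HCAS and MV2_NUM_PORTS appear elsewhere in this thread. MV2_NUM_QP_PER_PORT is, to the best of my knowledge, the corresponding user-guide parameter for queue pairs per port, so please verify it for your MVAPICH2 version. The helper name multirail_cmd is made up for illustration, and the hosts n0 n1 and the binary prog are copied from the commands above.

```shell
#!/bin/sh
# Sketch: compose (but do not execute) an mpirun_rsh command line for a
# multi-rail run. multirail_cmd is a hypothetical helper, not an MVAPICH2 tool.
# Arguments: <number of HCAs> <ports per HCA> <queue pairs per port>
multirail_cmd() {
    hcas=$1
    ports=$2
    qps=$3
    # MV2_NUM_QP_PER_PORT is assumed to be the right parameter name for
    # queue pairs per port; check the user guide for your version.
    echo "mpirun_rsh -np 2 n0 n1 MV2_NUM_HCAS=$hcas MV2_NUM_PORTS=$ports MV2_NUM_QP_PER_PORT=$qps prog"
}

# Example: two HCAs, one port each, two queue pairs per port.
multirail_cmd 2 1 2
```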

Our userguide has a lot of information about the various multi-rail modes supported by MVAPICH2 and how to use them. I've put links to the relevant sections of the userguide below. Please browse through them for detailed information.


I hope I was able to address your concerns. Please feel free to let me know if you have any follow-up questions.


On Fri, Mar 3, 2017 at 11:42 PM, Junjie Qian <junjieqian at outlook.com> wrote:

Hi List,

I am new to MVAPICH2 and have some questions about its run-time options.

The issue is that there are two servers, each with two IB ports and a NUMA topology, and the IB driver version may differ between them. Are there options that let the MPI processes on each server know which IB driver version and port to use on that server?

My current choice is to specify both IB driver versions (MV2_IBA_HCA=driver_0 and MV2_IBA_HCA=driver_1) and to leave "MV2_DEFAULT_PORT" out of the MPI command.

Are there any suggestions on this?

Thank you


Junjie Qian

mvapich-discuss mailing list
mvapich-discuss at cse.ohio-state.edu
