[mvapich-discuss] Options to let MPI use different HCAs or IB ports on different servers

Hari Subramoni subramoni.1 at osu.edu
Sun Mar 5 15:37:04 EST 2017


Hello,

If you want to make different processes on a node use different HCAs,
please look at the MV2_PROCESS_TO_RAIL_MAPPING environment variable. Once
the local process decides which HCA to use, the remote process will
establish its connection through that HCA.
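
For example, something along the following lines (please double-check the
exact policy name and mapping format against the userguide section below;
the host list and rail indices here are only placeholders) would pin the
first local rank on each node to rail 0 and the second to rail 1:

mpirun_rsh -np 4 n0 n0 n1 n1 MV2_RAIL_SHARING_POLICY=FIXED_MAPPING MV2_PROCESS_TO_RAIL_MAPPING=0:1 prog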

The following section of the userguide has more information on this.

http://mvapich.cse.ohio-state.edu/static/media/mvapich/mvapich2-2.2-userguide.html#x1-670006.13

Please let me know if you have any further questions.

Thx,
Hari.

On Sat, Mar 4, 2017 at 9:00 PM, Junjie Qian <junjieqian at outlook.com> wrote:

> Hi Hari,
>
>
> Thank you for your explanations! It is very clear.
>
> But I have another question: could you also give me some suggestions on how
> to configure the MVAPICH2 options for the scenario below?
>
>
> Consider the following scenario: there are two servers (each server has a
> NUMA topology, and each NUMA node has one HCA). On each server, the MPI
> process is scheduled to run on a different NUMA node.
>
> The problem is: is there a way to issue a single MPI command so that the MPI
> process on each server knows which HCA port to use? My current option is
> "mpirun_rsh <nps, hosts>  MV2_IBA_HCA=mlx5_0 MV2_IBA_HCA=mlx5_1
> MV2_NUM_PORTS=1 MV2_DEFAULT_PORT=1 prog".
>
> Is there a way to let MPI know which HCA on the NUMA node to use?
>
> server 0:
>
>                   ________________                      ________________
>                  |                |                    |                |
>    mlx5_0 <----  |  *MPI Process* |                    |                |  ----> mlx5_1
>                  |     NUMA 0     |                    |     NUMA 1     |
>                  |________________|                    |________________|
>
> server 1:
>
>                   ________________                      ________________
>                  |                |                    |                |
>    mlx5_0 <----  |                |                    |  *MPI Process* |  ----> mlx5_1
>                  |     NUMA 0     |                    |     NUMA 1     |
>                  |________________|                    |________________|
>
>
> Thank you
> Best
> Junjie Qian
> ------------------------------
> *From:* hari.subramoni at gmail.com <hari.subramoni at gmail.com> on behalf of
> Hari Subramoni <subramoni.1 at osu.edu>
> *Sent:* Saturday, March 4, 2017 6:56 AM
> *To:* Junjie Qian
> *Cc:* mvapich-discuss at cse.ohio-state.edu
> *Subject:* Re: [mvapich-discuss] Options to let MPI use different HCAs or
> IB ports on different servers
>
> Hello,
>
> I do not completely understand what you mean by "IB driver". If you mean
> that you have multiple HCAs, and you want MVAPICH2 to utilize both of them,
> then you can use one of the following commands:
>
> mpirun_rsh -np 2 n0 n1 MV2_IBA_HCA=mlx5_0:mlx5_1 prog
>
> mpirun_rsh -np 2 n0 n1 MV2_NUM_HCAS=2 prog
>
> The equivalent of the name "mlx5_0" on your system can be obtained from the
> "ibstat" command:
>
> [subramon at haswell1 ~]$ ibstat
> CA 'mlx5_0'
>         CA type: MT4119
>         Number of ports: 1
>         Firmware version: 16.18.0160
> <...snip...>
>
> MVAPICH2 has several levels of support for hardware configurations that
> have more than one InfiniBand end point (referred to as 'multi-rail'
> configuration by MVAPICH2). For example, MVAPICH2 has support to use
> multiple HCAs, multiple ports per HCA and multiple InfiniBand queue pairs
> per port.
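>
> As a rough illustration (the values below are placeholders and the defaults
> may already be sufficient; the sections linked below list the exact
> variables), striping across both HCAs, both ports per HCA, and two queue
> pairs per port could be requested as:
>
> mpirun_rsh -np 2 n0 n1 MV2_IBA_HCA=mlx5_0:mlx5_1 MV2_NUM_PORTS=2 MV2_NUM_QP_PER_PORT=2 prog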
>
> Our userguide has a lot of information about the various multi-rail modes
> supported by MVAPICH2 and how to use them. I've put links to the relevant
> sections of the userguide below. Please browse through them for detailed
> information.
>
> http://mvapich.cse.ohio-state.edu/static/media/mvapich/mvapich2-2.2-userguide.html#x1-660006.12
>
> http://mvapich.cse.ohio-state.edu/static/media/mvapich/mvapich2-2.2-userguide.html#x1-670006.13
>
> I hope I was able to address your concerns. Please feel free to let me know
> if you have any follow-up questions.
>
> Regards,
> Hari.
>
> On Fri, Mar 3, 2017 at 11:42 PM, Junjie Qian <junjieqian at outlook.com>
> wrote:
>
>> Hi List,
>>
>>
>> I am new to MVAPICH2, and have some questions regarding using options to
>> run.
>>
>>
>> The issue is that there are two servers; each server has two IB ports (with
>> a NUMA topology), and the IB driver versions may be different. Are there
>> options so that the MPI processes running on different servers know which IB
>> device (driver) and port should be used on each server?
>>
>>
>> My current choice is to specify both IB devices
>> (MV2_IBA_HCA=driver_0 and MV2_IBA_HCA=driver_1) and to leave
>> "MV2_DEFAULT_PORT" out of the MPI command.
>>
>>
>> Are there any suggestions on this?
>>
>> Thank you
>>
>> Best
>>
>> Junjie Qian
>>
>> _______________________________________________
>> mvapich-discuss mailing list
>> mvapich-discuss at cse.ohio-state.edu
>> http://mailman.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>>
>>
>