[mvapich-discuss] mvapich2-2.3.3 over connectX-5 regression issue
Subramoni, Hari
subramoni.1 at osu.edu
Fri Apr 10 20:26:52 EDT 2020
Hi, Honggang.
Glad to know that it works for you. I am still trying to understand how changing the launcher changes the IB HCA selection behavior in MVAPICH2. To the best of my knowledge, the two do not interact.
If you don't mind, could you let us know the following:
1. output of ibstat on both nodes
2. what do you mean by IPoIB was configured on mlx5_0?
Thx,
Hari.
-----Original Message-----
From: Honggang LI <honli at redhat.com>
Sent: Friday, April 10, 2020 7:35 PM
To: Subramoni, Hari <subramoni.1 at osu.edu>
Cc: mvapich-discuss at cse.ohio-state.edu <mvapich-discuss at mailman.cse.ohio-state.edu>
Subject: Re: [mvapich-discuss] mvapich2-2.3.3 over connectX-5 regression issue
On Fri, Apr 10, 2020 at 01:52:20PM +0000, Subramoni, Hari wrote:
> Hi, Honggang.
>
> It looks like your systems have multiple network adapters that have been set up in different modes (IB and Ethernet). In such a scenario, I would recommend explicitly setting the network adapter you want MVAPICH2 to use.
>
> e.g. MV2_IBA_HCA=mlx5_0 or MV2_IBA_HCA=mlx5_1
MV2_IBA_HCA=mlx5_1 works for mpirun and mpirun_rsh. IPoIB had been configured on mlx5_0. It seems mpirun and mpirun_rsh blindly pick the first HCA port.
The workaround works, but this is still a regression, because 2.3.2 did not need the workaround.
Thanks
[root at rdma-virt-02 ~]$ rpm -qf /usr/lib64/mvapich2/bin/mpirun
mvapich2-2.3.3-1.el8.x86_64
[root at rdma-virt-02 ~]$ cat hfile_one_core
172.31.0.202
172.31.0.203
[root at rdma-virt-02 ~]$ ip addr show | grep -w 172.31.0.202
inet 172.31.0.202/24 brd 172.31.0.255 scope global dynamic noprefixroute mlx5_ib0
[root at rdma-virt-02 ~]$ ip addr show mlx5_ib0
8: mlx5_ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc mq state UP group default qlen 256
link/infiniband 00:00:0b:ae:fe:80:00:00:00:00:00:00:e4:1d:2d:03:00:e7:0f:f6 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
inet 172.31.0.202/24 brd 172.31.0.255 scope global dynamic noprefixroute mlx5_ib0
valid_lft 2039sec preferred_lft 2039sec
inet6 fe80::e61d:2d03:e7:ff6/64 scope link noprefixroute
valid_lft forever preferred_lft forever
[root at rdma-virt-02 ~]$ ibstat mlx5_0
CA 'mlx5_0'
CA type: MT4115
Number of ports: 1
Firmware version: 12.25.1020
Hardware version: 0
Node GUID: 0xe41d2d0300e70ff6
System image GUID: 0xe41d2d0300e70ff6
Port 1:
State: Active
Physical state: LinkUp
Rate: 56
Base lid: 19
LMC: 0
SM lid: 13
Capability mask: 0x2659e848
Port GUID: 0xe41d2d0300e70ff6 <===
Link layer: InfiniBand
According to the "link/infiniband" hardware address and the port GUID, the interface mlx5_ib0 belongs to HCA mlx5_0.
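For reference, the link-layer-to-HCA mapping can be pulled out of ibstat output mechanically instead of by eyeballing GUIDs. A minimal sketch, assuming the ibstat output format shown in this thread (select_ib_hca is a hypothetical helper, not part of any tool):

```shell
# select_ib_hca: read ibstat output on stdin and print the names of
# CAs whose port link layer is InfiniBand (skipping Ethernet/RoCE CAs).
select_ib_hca() {
    # CA name lines look like: CA 'mlx5_0'
    # Splitting on the single quote puts the name in field 2.
    awk -F"'" '$1 == "CA " { ca = $2 } /Link layer: InfiniBand/ { print ca }'
}
```

One could then set the variable from the first match, e.g. `export MV2_IBA_HCA=$(ibstat | select_ib_hca | head -n1)`.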
[root at rdma-virt-02 ~]$ /usr/lib64/mvapich2/bin/mpirun -genv MV2_IBA_HCA=mlx5_0 -np 2 -hostfile /root/hfile_one_core hostname
rdma-virt-02.lab.bos.redhat.com
rdma-virt-03.lab.bos.redhat.com
[root at rdma-virt-02 ~]$ /usr/lib64/mvapich2/bin/mpirun -genv MV2_IBA_HCA=mlx5_0 -np 2 -hostfile /root/hfile_one_core /usr/lib64/mvapich2/bin/mpitests-osu_latency
(hangs with no output, like mpirun_rsh)
[root at rdma-virt-02 ~]$ /usr/lib64/mvapich2/bin/mpirun -genv MV2_IBA_HCA=mlx5_1 -np 2 -hostfile /root/hfile_one_core /usr/lib64/mvapich2/bin/mpitests-osu_latency
# OSU MPI Latency Test v5.4.1
# Size Latency (us)
0 1.03
1 1.08
2 1.07
4 1.07
8 1.07
16 1.11
32 1.11
64 1.13
128 1.19
256 1.59
512 1.68
1024 1.84
2048 2.19
4096 2.95
8192 4.40
16384 5.57
32768 7.21
65536 9.89
131072 15.24
262144 26.00
524288 47.65
1048576 91.39
2097152 177.73
4194304 351.36
After adding 'export MV2_IBA_HCA=mlx5_1' to ~/.bashrc on both machines, mpirun_rsh works.
[root at rdma-virt-02 ~]$ /usr/lib64/mvapich2/bin/mpirun_rsh -np 2 -hostfile /root/hfile_one_core /usr/lib64/mvapich2/bin/mpitests-osu_latency
# OSU MPI Latency Test v5.4.1
# Size Latency (us)
0 1.04
1 1.07
2 1.07
4 1.07
8 1.06
16 1.12
32 1.12
64 1.14
128 1.19
256 1.58
512 1.66
1024 1.82
2048 2.17
4096 2.93
8192 4.39
16384 5.52
32768 7.18
65536 9.87
131072 15.25
262144 25.97
524288 47.62
1048576 91.80
2097152 177.88
4194304 350.79
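As an aside, mpirun_rsh also accepts VAR=VALUE assignments on the command line (per the MVAPICH2 user guide), which avoids editing ~/.bashrc on every node. A sketch of the same workaround; this is a command-line fragment, not a fix for the underlying regression:

```shell
# Pass MV2_IBA_HCA per invocation instead of exporting it in ~/.bashrc.
/usr/lib64/mvapich2/bin/mpirun_rsh -np 2 -hostfile /root/hfile_one_core \
    MV2_IBA_HCA=mlx5_1 /usr/lib64/mvapich2/bin/mpitests-osu_latency
```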
> Best,
> Hari.
>
> -----Original Message-----
> From: mvapich-discuss-bounces at cse.ohio-state.edu
> <mvapich-discuss-bounces at mailman.cse.ohio-state.edu> On Behalf Of
> Honggang LI
> Sent: Friday, April 10, 2020 3:56 AM
> To: mvapich-discuss at cse.ohio-state.edu
> <mvapich-discuss at mailman.cse.ohio-state.edu>
> Subject: [mvapich-discuss] mvapich2-2.3.3 over connectX-5 regression
> issue
>
> hi
>
> short summary:
> +----------+----------+-----------+
> |mvapich2 | mpirun | mpirun_rsh|
> |version | | |
> +----------+----------+-----------+
> |2.3.2 | works | hang |
> +----------+----------+-----------+
> |2.3.3 | failed | hang |
> +----------+----------+-----------+
>
> Is it possible to run something like 'git bisect' to narrow down the source of the regression? It seems no public git repo is available, and I don't know how to run 'git bisect' against the SVN repo.
>
> thanks
>
> [root at rdma-virt-02 ~]$ cat hfile_one_core
> 172.31.0.202
> 172.31.0.203
> [root at rdma-virt-02 ~]$ ip addr show | grep -w 172.31.0.202
> inet 172.31.0.202/24 brd 172.31.0.255 scope global dynamic
> noprefixroute mlx5_ib0
>
>
> [root at rdma-virt-02 ~]$ /usr/lib64/mvapich2/bin/mpirun -np 2 -hostfile
> /root/hfile_one_core /usr/lib64/mvapich2/bin/mpitests-osu_latency
> [rdma-virt-02.lab.bos.redhat.com:mpi_rank_0][handle_cqe] Send desc
> error in msg to 1, wc_opcode=0
> [rdma-virt-02.lab.bos.redhat.com:mpi_rank_0][handle_cqe] Msg from 1:
> wc.status=12, wc.wr_id=0x560c8bac9040, wc.opcode=0,
> vbuf->phead->type=0 = MPIDI_CH3_PKT_EAGER_SEND
> [rdma-virt-02.lab.bos.redhat.com:mpi_rank_0][handle_cqe]
> src/mpid/ch3/channels/mrail/src/gen2/ibv_channel_manager.c:548: [] Got
> completion with error 12, vendor code=0x81, dest rank=1
> : Protocol not supported (93)
> [rdma-virt-03.lab.bos.redhat.com:mpi_rank_1][handle_cqe] Send desc
> error in msg to 0, wc_opcode=0
> [rdma-virt-03.lab.bos.redhat.com:mpi_rank_1][handle_cqe] Msg from 0:
> wc.status=12, wc.wr_id=0x563896cf9040, wc.opcode=0,
> vbuf->phead->type=0 = MPIDI_CH3_PKT_EAGER_SEND
> [rdma-virt-03.lab.bos.redhat.com:mpi_rank_1][handle_cqe]
> src/mpid/ch3/channels/mrail/src/gen2/ibv_channel_manager.c:548: [] Got
> completion with error 12, vendor code=0x81, dest rank=0
> : Protocol not supported (93)
>
> [root at rdma-virt-02 ~]$ dnf downgrade mvapich2 Updating Subscription Management repositories.
> Unable to read consumer identity
> This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
> Last metadata expiration check: 2:24:17 ago on Fri 10 Apr 2020 01:18:01 AM EDT.
> Dependencies resolved.
> ========================================================================================================================================
> Package Architecture Version Repository Size
> ======================================================================
> ==================================================================
> Downgrading:
> mvapich2 x86_64 2.3.2-2.el8 beaker-AppStream 3.1 M
>
> Transaction Summary
> ======================================================================
> ==================================================================
> Downgrade 1 Package
>
> Total download size: 3.1 M
> Is this ok [y/N]: y
> Downloading Packages:
> mvapich2-2.3.2-2.el8.x86_64.rpm 39 MB/s | 3.1 MB 00:00
> ----------------------------------------------------------------------------------------------------------------------------------------
> Total 39 MB/s | 3.1 MB 00:00
> Running transaction check
> Transaction check succeeded.
> Running transaction test
> Transaction test succeeded.
> Running transaction
> Preparing : 1/1
> Downgrading : mvapich2-2.3.2-2.el8.x86_64 1/2
> Cleanup : mvapich2-2.3.3-1.el8.x86_64 2/2
> Running scriptlet: mvapich2-2.3.3-1.el8.x86_64 2/2
> Verifying : mvapich2-2.3.2-2.el8.x86_64 1/2
> Verifying : mvapich2-2.3.3-1.el8.x86_64 2/2
> Installed products updated.
>
> Downgraded:
> mvapich2-2.3.2-2.el8.x86_64
>
> Complete!
> [root at rdma-virt-02 ~]$ /usr/lib64/mvapich2/bin/mpirun -np 2 -hostfile
> /root/hfile_one_core /usr/lib64/mvapich2/bin/mpitests-osu_latency
> # OSU MPI Latency Test v5.4.1
> # Size Latency (us)
> 0 1.24
> 1 1.29
> 2 1.29
> 4 1.29
> 8 1.29
> 16 1.34
> 32 1.35
> 64 1.36
> 128 1.42
> 256 1.82
> 512 1.92
> 1024 2.11
> 2048 2.53
> 4096 3.48
> 8192 5.19
> 16384 7.37
> 32768 10.12
> 65536 15.00
> 131072 24.69
> 262144 44.15
> 524288 82.97
> 1048576 160.92
> 2097152 316.19
> 4194304 626.91
> [root at rdma-virt-02 ~]$ /usr/lib64/mvapich2/bin/mpirun_rsh -np 2
> -hostfile /root/hfile_one_core
> /usr/lib64/mvapich2/bin/mpitests-osu_latency
>
> (hangs, no output)
>
> [root at rdma-virt-03 ~]$ ibstat
> CA 'mlx5_bond_0'
> CA type: MT4117
> Number of ports: 1
> Firmware version: 14.25.1020
> Hardware version: 0
> Node GUID: 0xe41d2d0300fda736
> System image GUID: 0xe41d2d0300fda736
> Port 1:
> State: Active
> Physical state: LinkUp
> Rate: 25
> Base lid: 0
> LMC: 0
> SM lid: 0
> Capability mask: 0x00010000
> Port GUID: 0xe61d2dfffefda736
> Link layer: Ethernet
> CA 'mlx5_1'
> CA type: MT4115
> Number of ports: 1
> Firmware version: 12.25.1020
> Hardware version: 0
> Node GUID: 0xe41d2d0300e70e87
> System image GUID: 0xe41d2d0300e70e86
> Port 1:
> State: Active
> Physical state: LinkUp
> Rate: 100
> Base lid: 30
> LMC: 0
> SM lid: 1
> Capability mask: 0x2659e848
> Port GUID: 0xe41d2d0300e70e87
> Link layer: InfiniBand
> CA 'mlx5_0'
> CA type: MT4115
> Number of ports: 1
> Firmware version: 12.25.1020
> Hardware version: 0
> Node GUID: 0xe41d2d0300e70e86
> System image GUID: 0xe41d2d0300e70e86
> Port 1:
> State: Active
> Physical state: LinkUp
> Rate: 100
> Base lid: 20
> LMC: 0
> SM lid: 13
> Capability mask: 0x2659e848
> Port GUID: 0xe41d2d0300e70e86
> Link layer: InfiniBand
>
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mailman.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>