[Mvapich-discuss] MVAPICH 2.3.6 HCA Support for VMXNET3 or PVRDMA VMware Network Adapters

Subramoni, Hari subramoni.1 at osu.edu
Wed Dec 1 12:41:45 EST 2021


Hi, Nicholas.

According to the output, the port is down. That is probably why MVAPICH2 is complaining that it cannot find an appropriate HCA. Can you please bring it up and try again?

state:                  PORT_DOWN
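
The PVRDMA port state should follow the state of the Ethernet interface that backs the device, so bringing that interface up from inside the VM should do it. A rough sketch (the interface name ens192 is only a placeholder; use whatever "rdma link show" reports as the netdev for vmw_pvrdma0):

    rdma link show
    # e.g.: link vmw_pvrdma0/1 state DOWN physical_state LINK_UP netdev ens192
    sudo ip link set ens192 up
    ibv_devinfo -d vmw_pvrdma0 | grep -w state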

Thx,
Hari.

From: Clark, Nicholas - 1002 - MITLL <Nicholas.Clark at ll.mit.edu>
Sent: Wednesday, December 1, 2021 11:19 AM
To: Subramoni, Hari <subramoni.1 at osu.edu>
Cc: mvapich-discuss at lists.osu.edu
Subject: RE: [Mvapich-discuss] MVAPICH 2.3.6 HCA Support for VMXNET3 or PVRDMA VMware Network Adapters

Dear Hari,

These are the results of ibv_devinfo -v:
hca_id: vmw_pvrdma0
        transport:                      InfiniBand (0)
        fw_ver:                         3.0.000
        node_guid:                      0050:5600:009b:71e5
        sys_image_guid:                 0000:0000:0000:0000
        vendor_id:                      0x15ad
        vendor_part_id:                 2080
        hw_ver:                         0x1
        board_id:                       1
        phys_port_cnt:                  1
        max_mr_size:                    0xffffffff
        page_size_cap:                  0xc
        max_qp:                         32768
        max_qp_wr:                      1024
        device_cap_flags:               0x00201400
                                        PORT_ACTIVE_EVENT
                                        RC_RNR_NAK_GEN
                                        MEM_MGT_EXTENSIONS
        max_sge:                        16
        max_sge_rd:                     16
        max_cq:                         4096
        max_cqe:                        262144
        max_mr:                         262144
        max_pd:                         4096
        max_qp_rd_atom:                 16
        max_ee_rd_atom:                 0
        max_res_rd_atom:                0
        max_qp_init_rd_atom:            128
        max_ee_init_rd_atom:            0
        atomic_cap:                     ATOMIC_NONE (0)
        max_ee:                         0
        max_rdd:                        0
        max_mw:                         0
        max_raw_ipv6_qp:                0
        max_raw_ethy_qp:                0
        max_mcast_grp:                  0
        max_mcast_qp_attach:            0
        max_total_mcast_qp_attach:      0
        max_ah:                         1048576
        max_fmr:                        0
        max_srq:                        4096
        max_srq_wr:                     1024
        max_srq_sge:                    16
        max_pkeys:                      128
        local_ca_ack_delay:             5
        general_odp_caps:
        rc_odp_caps:
                                        NO SUPPORT
        uc_odp_caps:
                                        NO SUPPORT
        ud_odp_caps:
                                        NO SUPPORT
        xrc_odp_caps:
                                        NO SUPPORT
        completion_timestamp_mask not supported
        core clock not supported
        device_cap_flags_ex:            0x201400
        tso_caps:
                max_tso:                        0
        rss_caps:
                max_rwq_indirection_tables:                     0
                max_rwq_indirection_table_size:                 0
                rx_hash_function:                               0x0
                rx_hash_fields_mask:                            0x0
        max_wq_type_rq:                 0
        packet_pacing_caps:
                qp_rate_limit_min:      0kbps
                qp_rate_limit_max:      0kbps
        tag matching not supported
        num_comp_vectors:               1
                port:   1
                        state:                  PORT_DOWN (1)
                        max_mtu:                4096 (5)
                        active_mtu:             4096 (5)
                        sm_lid:                 0
                        port_lid:               0
                        port_lmc:               0x00
                        link_layer:             Ethernet
                        max_msg_sz:             0x7fffffff
                        port_cap_flags:         0x04010000
                        port_cap_flags2:        0x0000
                        max_vl_num:             1 (1)
                        bad_pkey_cntr:          0x0
                        qkey_viol_cntr:         0x0
                        sm_sl:                  0
                        pkey_tbl_len:           1
                        gid_tbl_len:            6
                        subnet_timeout:         0
                        init_type_reply:        0
                        active_width:           1X (1)
                        active_speed:           2.5 Gbps (1)
                        phys_state:             LINK_UP (5)
                        GID[  0]:               fe80::250:56ff:fe9b:71e5, RoCE v2


I am not able to grant remote access to the VM.
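
Once the port is up, I can also sanity-check the RDMA path outside of MPI with something like the pingpong test from libibverbs-utils (device name and GID index 0 taken from the output above; the hostname is a placeholder):

    # on the server VM
    ibv_rc_pingpong -d vmw_pvrdma0 -g 0
    # on the client VM
    ibv_rc_pingpong -d vmw_pvrdma0 -g 0 <server-hostname>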

Sincerely,
Nicholas Clark
MIT Lincoln Laboratory
ISR and Tactical Systems Division
Embedded and Open Systems Group
Systems Administration
244 Wood St., S3-487
Lexington, MA 02421-6426
(O): 781-981-9342
nicholas.clark at ll.mit.edu

From: Subramoni, Hari <subramoni.1 at osu.edu>
Sent: Wednesday, December 1, 2021 11:12 AM
To: Clark, Nicholas - 1002 - MITLL <Nicholas.Clark at ll.mit.edu>
Cc: mvapich-discuss at lists.osu.edu; Subramoni, Hari <subramoni.1 at osu.edu>
Subject: RE: [Mvapich-discuss] MVAPICH 2.3.6 HCA Support for VMXNET3 or PVRDMA VMware Network Adapters

Hi, Nicholas.

MVAPICH2 has support for running in VMs.

Could you please send us the output of ibv_devinfo -v on the VM? Would it also be possible for us to get temporary remote access so we can debug the problem ourselves?
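
In the meantime, since the PVRDMA adapter presents RoCE v1/v2, it may be worth explicitly telling MVAPICH2 to run in RoCE mode and which HCA to use. A sketch using the documented run-time parameters MV2_USE_RoCE and MV2_IBA_HCA (substitute the device name ibv_devinfo reports, plus your own hosts and binary):

    mpirun_rsh -np 2 node1 node2 MV2_USE_RoCE=1 MV2_IBA_HCA=<hca-name> ./osu_latency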

Best,
Hari.

From: Mvapich-discuss <mvapich-discuss-bounces at lists.osu.edu> On Behalf Of Clark, Nicholas - 1002 - MITLL via Mvapich-discuss
Sent: Tuesday, November 30, 2021 4:49 PM
To: mvapich-discuss at lists.osu.edu
Subject: [Mvapich-discuss] MVAPICH 2.3.6 HCA Support for VMXNET3 or PVRDMA VMware Network Adapters

Does MVAPICH2 have support for running in VMs with VMXNET3 or the PVRDMA network adapter that supports RoCE v1/v2?

Currently, with the default build parameters and the native rdma-core libraries on RHEL 8.5, I am seeing this message about an unknown HCA on both VMXNET3 and PVRDMA:

[rdma_open_hca] Unknown HCA type: this build of MVAPICH2 does not fully support the HCA found on the system (try with other build options)
[cli_1]: aborting job:
Fatal error in MPI_Init:
Other MPI error, error stack:
MPIR_Init_thread(493)............:
MPID_Init(419)...................: channel initialization failed
MPIDI_CH3_Init(470)..............: rdma_get_control_parameters
rdma_get_control_parameters(1925): rdma_open_hca
rdma_open_hca(1080)..............: Failed to open HCA: No such file or directory

Sincerely,
Nicholas Clark
MIT Lincoln Laboratory
ISR and Tactical Systems Division
Embedded and Open Systems Group
Systems Administration
244 Wood St., S3-487
Lexington, MA 02421-6426
(O): 781-981-9342
nicholas.clark at ll.mit.edu
