EXT :Re: [mvapich-discuss] QP failed: Cannot allocate memory
Devendar Bureddy
bureddy at cse.ohio-state.edu
Fri Mar 23 13:43:52 EDT 2012
Hi Doug,
Good to know that the firmware update helped. Let us know how it goes
across the cluster.
-Devendar
On Fri, Mar 23, 2012 at 1:29 PM, Riley, Douglas (AS)
<Douglas.Riley at ngc.com> wrote:
> Dear Devendar,
>
> Thank you for your reply. On my cluster, I have two other IB boards, which I updated yesterday to FW version 2.9.1. The VIADEV_USE_XRC=1 variable now works on those two nodes. I've been in contact with Mellanox to find a similar FW update for the board listed below, which appears to have a non-standard PSID. Once Mellanox provides a new FW (hopefully at the 2.9.1 version), I'll further advise if XRC=1 now works across the cluster and (hopefully) resolves the QP failure issue -- at least to some increased level. Thank you for your recommendations.
>
> Best,
> Doug
>
>
>
> -----Original Message-----
> From: Devendar Bureddy [mailto:bureddy at cse.ohio-state.edu]
> Sent: Thursday, March 22, 2012 10:38 PM
> To: Riley, Douglas (AS)
> Cc: mvapich-discuss at cse.ohio-state.edu
> Subject: EXT :Re: [mvapich-discuss] QP failed: Cannot allocate memory
>
> Hi Doug
>
> I could not find any details about this card (MT_0D30110008) to see
> whether it supports XRC. Maybe you can check with Mellanox about
> this.
> As you indicated, you are running at very high oversubscription; I am
> not sure whether this is causing a network resource limitation at the
> firmware or hardware level.
>
> There could also be a communication progress issue if some of the MPI
> processes do not get CPU time. Have you disabled process affinity
> while running in oversubscribed mode? If not, can you try disabling
> it (VIADEV_USE_AFFINITY=0)?
>
> I would also recommend trying the latest MVAPICH2 revision
> (mvapich2-1.8rc1) with XRC to see if the QP creation issue goes
> away. In MV2, XRC can be enabled using the runtime option MV2_USE_XRC=1.
>
> You can also try the scalable MVAPICH2 UD-Hybrid transport. You can
> find more details about this in the user guide section:
> http://mvapich.cse.ohio-state.edu/support/user_guide_mvapich2-1.8rc1.html#x1-590006.7
> You can use MV2_HYBRID_ENABLE_THRESHOLD=1 and MV2_HYBRID_MAX_RC_CONN=0
> to run with only the UD transport. This will avoid the creation of RC
> queue pairs (QPs) and will decrease the required number of QPs from
> N^2 to N.
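
To put rough numbers on the quadratic-versus-linear QP scaling described
above, here is a minimal Python sketch. It assumes (none of this is
confirmed in the thread) that with RC every process opens one QP to every
other process in the job, that processes are spread evenly over the 6
nodes, and that with UD only each process needs a single QP. The function
names are made up for illustration; the max_qp value and node count come
from the original post.

```python
import math

MAX_QP = 261056   # max_qp reported by the adapter in the original post
NODES = 6         # cluster size given in the original post


def rc_qps_on_busiest_node(nprocs, nnodes=NODES):
    """RC estimate: every process holds one QP per peer (assumption)."""
    procs_on_node = math.ceil(nprocs / nnodes)
    return procs_on_node * (nprocs - 1)


def ud_qps_on_busiest_node(nprocs, nnodes=NODES):
    """UD-only estimate: one QP per process (assumption)."""
    return math.ceil(nprocs / nnodes)


for n in (1200, 1250):
    rc = rc_qps_on_busiest_node(n)
    ud = ud_qps_on_busiest_node(n)
    print(f"-n {n}: RC ~{rc} QPs ({rc / MAX_QP:.0%} of max_qp), UD ~{ud} QPs")
```

Under these assumptions the RC estimate at -n 1250 comes within a few
dozen QPs of max_qp, while -n 1200 stays clearly below it. That would be
consistent with where the failure shows up, but the thread does not
establish that this is the actual cause.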
>
> -Devendar
>
>
>
> On Thu, Mar 22, 2012 at 12:07 PM, Riley, Douglas (AS)
> <Douglas.Riley at ngc.com> wrote:
>> MVAPICH Team:
>>
>> I'm currently using:
>> MVAPICH 1.2-SingleRail
>> Build-ID: 3635
>>
>> My cluster has 6 nodes, each node with 48 AMD Opteron cores and each node
>> with 192 GB RAM. I'm running RHEL 5.5, Linux version 2.6.35.
>>
>> My applications often use MVAPICH to significantly oversubscribe the
>> available cores (288). Up to about -n 1200, all works fine under
>> mpirun_rsh; however, at about -n 1250, I receive the terminal error:
>>
>> QP failed: Cannot allocate memory
>>
>> As described in the User Manual, I've increased the memlock limit to the
>> maximum memory on each node; however, the problem persists. If I set the
>> environment variable VIADEV_USE_XRC=1, the error at startup doesn't
>> appear; however, the application code then hangs indefinitely (which occurs
>> for either small or large MPI applications). XRC apparently may solve the
>> issue; however, either my MVAPICH version was not built to support it, or
>> perhaps my hardware doesn't support it. The following is output from the
>> IB adapter:
>>
>> hca_id: mlx4_0
>> transport: InfiniBand (0)
>> fw_ver: 2.7.000
>> node_guid: 0002:c903:000b:9b1c
>> sys_image_guid: 0002:c903:000b:9b1f
>> vendor_id: 0x02c9
>> vendor_part_id: 26428
>> hw_ver: 0xB0
>> board_id: MT_0D30110008
>> phys_port_cnt: 1
>> max_mr_size: 0xffffffffffffffff
>> page_size_cap: 0xfffffe00
>> max_qp: 261056
>> max_qp_wr: 16351
>> device_cap_flags: 0x007c9c76
>> max_sge: 32
>> max_sge_rd: 0
>> max_cq: 65408
>> max_cqe: 4194303
>> max_mr: 524272
>> max_pd: 32764
>> max_qp_rd_atom: 16
>> max_ee_rd_atom: 0
>> max_res_rd_atom: 4176896
>> max_qp_init_rd_atom: 128
>> max_ee_init_rd_atom: 0
>> atomic_cap: ATOMIC_HCA (1)
>> max_ee: 0
>> max_rdd: 0
>> max_mw: 0
>> max_raw_ipv6_qp: 0
>> max_raw_ethy_qp: 1
>> max_mcast_grp: 8192
>> max_mcast_qp_attach: 56
>> max_total_mcast_qp_attach: 458752
>> max_ah: 0
>> max_fmr: 0
>> max_srq: 65472
>> max_srq_wr: 16383
>> max_srq_sge: 31
>> max_pkeys: 128
>> local_ca_ack_delay: 15
>> port: 1
>> state: PORT_ACTIVE (4)
>> max_mtu: 2048 (4)
>> active_mtu: 2048 (4)
>> sm_lid: 1
>> port_lid: 1
>> port_lmc: 0x00
>> link_layer: IB
>> max_msg_sz: 0x40000000
>> port_cap_flags: 0x0251086a
>> max_vl_num: 8 (4)
>> bad_pkey_cntr: 0x0
>> qkey_viol_cntr: 0x0
>> sm_sl: 0
>> pkey_tbl_len: 128
>> gid_tbl_len: 128
>> subnet_timeout: 18
>> init_type_reply: 0
>> active_width: 4X (2)
>> active_speed: 5.0 Gbps (2)
>> phys_state: LINK_UP (5)
>> GID[ 0]: fe80:0000:0000:0000:0002:c903:000b:9b1d
>>
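
Since the firmware version and the QP limit both matter in this thread, a
small Python sketch for pulling fields such as fw_ver and max_qp out of
adapter output like the listing above may be handy when comparing nodes.
It assumes simple "key: value" lines as shown there; the function name
and the shortened sample are illustrative only.

```python
def parse_hca_attrs(text):
    """Parse 'key: value' lines into a dict; later duplicates overwrite earlier ones."""
    attrs = {}
    for line in text.splitlines():
        # Drop mail quote markers and surrounding whitespace.
        line = line.lstrip("> ").strip()
        # Split on the first colon only, so GUIDs keep their inner colons.
        key, sep, value = line.partition(":")
        if sep and value.strip():
            attrs[key.strip()] = value.strip()
    return attrs


# Shortened sample copied from the adapter output in the original post.
sample = """\
hca_id: mlx4_0
fw_ver: 2.7.000
board_id: MT_0D30110008
max_qp: 261056
"""

attrs = parse_hca_attrs(sample)
print(attrs["fw_ver"], attrs["board_id"], int(attrs["max_qp"]))
```

Running the same function over the output collected from each node would
make it easy to spot a board that is still on older firmware.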
>>
>> Any recommendations to enable a larger number of MPI processes on my hardware
>> would be most appreciated.
>>
>> Many Thanks,
>> Doug
>>
>> ------------------------
>> Douglas J Riley, PhD
>>
>>
>>
>>
>> _______________________________________________
>> mvapich-discuss mailing list
>> mvapich-discuss at cse.ohio-state.edu
>> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>>
>
>
>
> --
> Devendar
--
Devendar