[Mvapich-discuss] Azure HBv4 mpi failure
Paniraja Guptha, Akshay
panirajaguptha.1 at osu.edu
Thu Jul 11 15:35:39 EDT 2024
Hi Nicky,
Can you please try the attached patch?
-Akshay
From: Mvapich-discuss <mvapich-discuss-bounces at lists.osu.edu> On Behalf Of Paniraja Guptha, Akshay via Mvapich-discuss
Sent: Monday, July 1, 2024 11:38 AM
To: Sandhu, Prabhjot(Nicky)@DWR <Prabhjot.Sandhu at water.ca.gov>; Announcement about MVAPICH (MPI over InfiniBand, RoCE, Omni-Path, Slingshot, iWARP and EFA) Libraries developed at NBCL/OSU <mvapich-discuss at lists.osu.edu>
Subject: Re: [Mvapich-discuss] Azure HBv4 mpi failure
Hi Nicky,
Thanks for bringing this to our attention. We will take a look at the issue and get back to you.
-Akshay Paniraja Guptha
From: Mvapich-discuss <mvapich-discuss-bounces+panirajaguptha.1=osu.edu at lists.osu.edu> On Behalf Of Sandhu, Prabhjot(Nicky)@DWR via Mvapich-discuss
Sent: Monday, July 1, 2024 11:09 AM
To: mvapich-discuss at lists.osu.edu
Subject: [Mvapich-discuss] Azure HBv4 mpi failure
I compiled my code against the latest AlmaLinux 8.7 and mvapich2-2.3.7-1 on Azure. The code performs very well on HBv2-series<https://learn.microsoft.com/en-us/azure/virtual-machines/sizes/high-performance-compute/hb-family#hbv2-series> or HBv3-series<https://learn.microsoft.com/en-us/azure/virtual-machines/sizes/high-performance-compute/hb-family#hbv3-series>; however, it fails on HBv4-series<https://learn.microsoft.com/en-us/azure/virtual-machines/sizes/high-performance-compute/hb-family#hbv4-series>, printing the following warning at the start of mpirun, after which the application code also fails.
[get_link_speed] Invalid link speed 128
Has anyone seen this message? Are there any env vars or config vars to be set?
Nicky
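For readers trying to interpret the number in that warning: a rough sketch, assuming the value is the `active_speed` code that libibverbs reports from `ibv_query_port()` (the thread does not confirm this). Under that assumption, 128 would be the code for NDR, which would be consistent with the attached patch being named ndr.patch. The table and the helper `describe_link_speed` below are illustrative only; they are not MVAPICH code, and the per-lane rates are the nominal figures as I understand them from the `ibv_query_port` man page.

```python
# Illustrative only: nominal per-lane rates (Gb/s) keyed by the active_speed
# code, as assumed from the libibverbs ibv_query_port() documentation.
ACTIVE_SPEED = {
    1: ("SDR", 2.5),
    2: ("DDR", 5.0),
    4: ("QDR", 10.0),
    8: ("FDR10", 10.0),
    16: ("FDR", 14.0),
    32: ("EDR", 25.0),
    64: ("HDR", 50.0),
    128: ("NDR", 100.0),
}


def describe_link_speed(code, table=ACTIVE_SPEED):
    """Return a readable name for a speed code, or raise if the table lacks it.

    Hypothetical helper; it only mimics the kind of lookup that could emit
    an 'Invalid link speed' message when a newer code is missing.
    """
    if code not in table:
        raise ValueError(f"Invalid link speed {code}")
    name, gbps = table[code]
    return f"{name} ({gbps:g} Gb/s per lane)"


if __name__ == "__main__":
    # With the full table, 128 resolves to NDR.
    print(describe_link_speed(128))

    # A table that predates NDR rejects the same code.
    pre_ndr = {k: v for k, v in ACTIVE_SPEED.items() if k < 128}
    try:
        describe_link_speed(128, pre_ndr)
    except ValueError as err:
        print(err)
```

If that reading is right, a lookup that stops at HDR would reject the code seen on HBv4 while still accepting the codes seen on HBv2 and HBv3, which matches the behaviour described above.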
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ndr.patch
Type: application/octet-stream
Size: 527 bytes
Desc: ndr.patch
URL: <http://lists.osu.edu/pipermail/mvapich-discuss/attachments/20240711/72407b3e/attachment-0002.obj>