[Mvapich-discuss] Possible bug in OSU Micro-Benchmarks 5.7.1 with cuda

Subramoni, Hari subramoni.1 at osu.edu
Wed Jun 30 10:56:42 EDT 2021


Hi, Adam.

Thanks for getting back to us. Glad to hear that it works as expected now. We have extended the fix to the other point-to-point benchmarks and attached the updated patch here.

This will be available with the next release of OMB.

Best,
Hari.

-----Original Message-----
From: Goldman, Adam <adam.goldman at intel.com> 
Sent: Tuesday, June 29, 2021 9:47 AM
To: Subramoni, Hari <subramoni.1 at osu.edu>
Cc: Rimmer, Todd <todd.rimmer at intel.com>; mvapich-discuss at lists.osu.edu
Subject: RE: [Mvapich-discuss] Possible bug in OSU Micro-Benchmarks 5.7.1 with cuda

Thank you for the quick response. The latest patch appears to work. I tested with all combinations of 'H', 'D', 'MH', and 'MD'.

-----Original Message-----
From: Subramoni, Hari <subramoni.1 at osu.edu> 
Sent: Monday, June 28, 2021 5:00 PM
To: Goldman, Adam <adam.goldman at intel.com>
Cc: Rimmer, Todd <todd.rimmer at intel.com>; mvapich-discuss at lists.osu.edu; Subramoni, Hari <subramoni.1 at osu.edu>
Subject: RE: [Mvapich-discuss] Possible bug in OSU Micro-Benchmarks 5.7.1 with cuda

Please use this version; I sent the wrong one with the last e-mail. I will check it for correctness and commit an appropriate patch. It will be available with our next release, with an acknowledgement to you.

Thx,
Hari.

-----Original Message-----
From: Subramoni, Hari <subramoni.1 at osu.edu> 
Sent: Monday, June 28, 2021 4:49 PM
To: Goldman, Adam <adam.goldman at intel.com>
Cc: Rimmer, Todd <todd.rimmer at intel.com>; mvapich-discuss at lists.osu.edu; Subramoni, Hari <subramoni.1 at osu.edu>
Subject: RE: [Mvapich-discuss] Possible bug in OSU Micro-Benchmarks 5.7.1 with cuda

Hi, Adam.

Thanks a lot for identifying this and providing the patch. I've created a slightly modified version of the patch. Could you please let me know if this works as expected for you?

I will make similar changes for other benchmarks as well.

Thx,
Hari. 

-----Original Message-----
From: Mvapich-discuss <mvapich-discuss-bounces at lists.osu.edu> On Behalf Of Goldman, Adam via Mvapich-discuss
Sent: Monday, June 28, 2021 2:57 PM
To: mvapich-discuss at lists.osu.edu
Cc: Rimmer, Todd <todd.rimmer at intel.com>
Subject: [Mvapich-discuss] Possible bug in OSU Micro-Benchmarks 5.7.1 with cuda

Hello,

While running the latest osu_latency benchmark (5.7.1) with CUDA support enabled, we encountered a possible bug in the OSU benchmark code.
It appears that when running osu_latency with one side using "MH" and the other "H", the side without CUDA managed memory attempts to make CUDA calls for message sizes above 131072.

I am using Open MPI v4.1.1 compiled with CUDA support on RHEL 8.1.

# mpirun -np 2 --host host1,host2 ./mpi/pt2pt/osu_latency -m 131072: MH H ...
131072                512.90
[../../util/osu_util_mpi.c:1691] CUDA call 'cudaMemPrefetchAsync(buf, length, devid, um_stream)' failed with 1: invalid argument

From some debugging, it appears that a pointer to memory allocated without any CUDA calls, on the node that is not using CUDA, is being passed into cudaMemPrefetchAsync, presumably because both ranks parse the same command line and so evaluate the same options.src/options.dst flags even though only one of them allocated managed buffers.
This issue appears to be new to v5.7.1.
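
Incidentally, a defensive guard on the prefetch itself would also catch this class of issue. Below is only a rough sketch against the plain CUDA runtime API, not OMB code, and safe_prefetch is an illustrative name:
================================
#include <cuda_runtime.h>

/* Prefetch buf to device devid only if the CUDA runtime actually knows it
 * as managed memory; a plain malloc()'d host buffer is skipped instead of
 * failing with "invalid argument". */
static int safe_prefetch(void *buf, size_t length, int devid,
                         cudaStream_t um_stream)
{
    struct cudaPointerAttributes attr;
    cudaError_t err = cudaPointerGetAttributes(&attr, buf);

    /* Older runtimes return an error for pointers they do not track;
     * newer ones report cudaMemoryTypeUnregistered. Treat both cases as
     * "not managed" and clear any sticky error state. */
    if (err != cudaSuccess || attr.type != cudaMemoryTypeManaged) {
        cudaGetLastError();
        return 0;
    }
    return cudaMemPrefetchAsync(buf, length, devid, um_stream) == cudaSuccess;
}
================================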

I am not sure this is the right fix, but the following change seemed to resolve the issue in osu_latency.c:
================================
@@ -134,9 +134,9 @@

         for(i = 0; i < options.iterations + options.skip; i++) {
 #ifdef _ENABLE_CUDA_
-            if (options.src == 'M') {
+            if (myid == 0) {
                 touch_managed_src(s_buf, size);
-            } else if (options.dst == 'M') {
+            } else {
                 touch_managed_dst(s_buf, size);
             }
 #endif
@@ -149,8 +149,8 @@
                 MPI_CHECK(MPI_Send(s_buf, size, MPI_CHAR, 1, 1, MPI_COMM_WORLD));
                 MPI_CHECK(MPI_Recv(r_buf, size, MPI_CHAR, 1, 1, MPI_COMM_WORLD, &reqstat));
 #ifdef _ENABLE_CUDA_
-                if (options.src == 'M') {
-                    touch_managed_src(r_buf, size);
-                }
+                touch_managed_src(r_buf, size);
 #endif

@@ -161,9 +161,7 @@
             } else if (myid == 1) {
                 MPI_CHECK(MPI_Recv(r_buf, size, MPI_CHAR, 0, 1, MPI_COMM_WORLD, &reqstat));
 #ifdef _ENABLE_CUDA_
-                if (options.dst == 'M') {
-                    touch_managed_dst(r_buf, size);
-                }
+                touch_managed_dst(r_buf, size);
 #endif

                 MPI_CHECK(MPI_Send(s_buf, size, MPI_CHAR, 0, 1, MPI_COMM_WORLD));
================================
Only the first change is required; the last two are just cleanups to avoid evaluating the same if expression twice, once outside the functions and once inside them.
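
For reference, the reason the unconditional calls above are safe is that the helpers already repeat the managed check internally. Roughly like this, with the bodies paraphrased rather than copied from osu_util_mpi.c:
================================
#ifdef _ENABLE_CUDA_
/* Paraphrased sketch of the helpers: each one re-tests the relevant
 * option flag, so on a rank whose buffer is plain host memory the call
 * is a no-op and the caller-side "if" is redundant. */
void touch_managed_src(char *buf, size_t size)
{
    if (options.src == 'M') {
        touch_managed(buf, size);   /* prefetch and touch the managed buffer */
    }
}

void touch_managed_dst(char *buf, size_t size)
{
    if (options.dst == 'M') {
        touch_managed(buf, size);
    }
}
#endif
================================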

Thank you,

Adam Goldman
HPC Fabric Software Engineer
Intel Corporation
adam.goldman at intel.com

_______________________________________________
Mvapich-discuss mailing list
Mvapich-discuss at lists.osu.edu
https://lists.osu.edu/mailman/listinfo/mvapich-discuss
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: omb_managed_mem_osu_latency_patch_v2.txt
URL: <http://lists.osu.edu/pipermail/mvapich-discuss/attachments/20210630/a64c708a/attachment-0022.txt>

