[Mvapich-discuss] Possible bug in OSU Micro-Benchmarks 5.7.1 with cuda

Subramoni, Hari subramoni.1 at osu.edu
Mon Jun 28 16:49:06 EDT 2021


Hi, Adam.

Thanks a lot for identifying this and providing the patch. I've created a slightly modified version of the patch. Could you please let me know if this works as expected for you?

I will make similar changes for other benchmarks as well.

Thx,
Hari. 

-----Original Message-----
From: Mvapich-discuss <mvapich-discuss-bounces at lists.osu.edu> On Behalf Of Goldman, Adam via Mvapich-discuss
Sent: Monday, June 28, 2021 2:57 PM
To: mvapich-discuss at lists.osu.edu
Cc: Rimmer, Todd <todd.rimmer at intel.com>
Subject: [Mvapich-discuss] Possible bug in OSU Micro-Benchmarks 5.7.1 with cuda

Hello,

While running the latest osu latency (5.7.1) benchmark with cuda support enabled, we encountered a possible bug in the OSU benchmark code.
It appears that when running osu_latency with one side using "MH" and the other using "H", the side without CUDA managed memory attempts CUDA calls for message sizes above 131072.

I am using OpenMPI v4.1.1 compiled with cuda support on RHEL 8.1

# mpirun -np 2 --host host1,host2 ./mpi/pt2pt/osu_latency -m 131072: MH H ...
131072                512.90
[../../util/osu_util_mpi.c:1691] CUDA call 'cudaMemPrefetchAsync(buf, length, devid, um_stream)' failed with 1: invalid argument

From some debugging, it appears that a pointer to memory allocated without any CUDA calls is being passed in on the node that is not using CUDA.
This issue appears to be new in v5.7.1.
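Independent of the fix below, it might also be worth hardening the utility routine in util/osu_util_mpi.c so it refuses to prefetch a buffer that is not actually managed. This is only a sketch of that idea, assuming CUDA 10 or newer (where cudaPointerAttributes exposes the 'type' field); is_managed_buffer() is a hypothetical helper, not something in the OMB source:

#include <cuda_runtime.h>

/* Hypothetical helper (not in OMB): report whether buf was allocated
 * with cudaMallocManaged(). Plain malloc()'d host memory comes back as
 * cudaMemoryTypeUnregistered (or as an error on older CUDA versions),
 * so the caller can skip cudaMemPrefetchAsync() for it. */
static int is_managed_buffer(const void *buf)
{
    struct cudaPointerAttributes attr;

    if (cudaPointerGetAttributes(&attr, buf) != cudaSuccess) {
        (void)cudaGetLastError();  /* clear the sticky error from the failed query */
        return 0;
    }
    return attr.type == cudaMemoryTypeManaged;
}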

I'm not sure if this is the right fix, but the following change to osu_latency.c seemed to resolve the issue:
================================
@@ -134,9 +134,9 @@

         for(i = 0; i < options.iterations + options.skip; i++) {
 #ifdef _ENABLE_CUDA_
-            if (options.src == 'M') {
+            if (myid == 0) {
                 touch_managed_src(s_buf, size);
-            } else if (options.dst == 'M') {
+            } else {
                 touch_managed_dst(s_buf, size);
             }
 #endif
@@ -149,8 +149,8 @@
                 MPI_CHECK(MPI_Send(s_buf, size, MPI_CHAR, 1, 1, MPI_COMM_WORLD));
                 MPI_CHECK(MPI_Recv(r_buf, size, MPI_CHAR, 1, 1, MPI_COMM_WORLD, &reqstat));
 #ifdef _ENABLE_CUDA_
-                if (options.src == 'M') {
-                    touch_managed_src(r_buf, size);
-                }
+                touch_managed_src(r_buf, size);
 #endif

@@ -161,9 +161,7 @@
             } else if (myid == 1) {
                 MPI_CHECK(MPI_Recv(r_buf, size, MPI_CHAR, 0, 1, MPI_COMM_WORLD, &reqstat));
 #ifdef _ENABLE_CUDA_
-                if (options.dst == 'M') {
-                    touch_managed_dst(r_buf, size);
-                }
+                touch_managed_dst(r_buf, size);
 #endif

                 MPI_CHECK(MPI_Send(s_buf, size, MPI_CHAR, 0, 1, MPI_COMM_WORLD));
================================

Only the first change is required; the last two are just cleanups to avoid evaluating the same if expression twice, once outside and once inside the functions.
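For what it's worth, my read of why the first change is needed: both ranks parse the same command line, so with "MH H" the options.src == 'M' test is also true on the host-only rank, which then ends up in touch_managed_src() with a plain host buffer and hits the cudaMemPrefetchAsync() failure above. Guarding on myid instead keeps each rank on the helper that matches its own buffer. The last two changes then rely on the helpers repeating the managed check internally, roughly along these lines (only a sketch of the assumed shape in util/osu_util_mpi.c, not the literal source; touch_managed() stands in for whatever routine ends up calling cudaMemPrefetchAsync()):

/* Assumed shape of the existing helpers: the managed-memory check is
 * already done here, which is why the extra caller-side check in
 * osu_latency.c is redundant ("the same if expression twice"). */
void touch_managed_src(char *buf, size_t length)
{
    if (options.src == 'M') {        /* rank 0's buffer type from the command line */
        touch_managed(buf, length);  /* placeholder for the prefetch/touch routine */
    }
}

void touch_managed_dst(char *buf, size_t length)
{
    if (options.dst == 'M') {        /* rank 1's buffer type from the command line */
        touch_managed(buf, length);
    }
}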

Thank you,

Adam Goldman
HPC Fabric Software Engineer
Intel Corporation
adam.goldman at intel.com

_______________________________________________
Mvapich-discuss mailing list
Mvapich-discuss at lists.osu.edu
https://lists.osu.edu/mailman/listinfo/mvapich-discuss
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: omb_managed_mem_osu_latency_patch.txt
URL: <http://lists.osu.edu/pipermail/mvapich-discuss/attachments/20210628/8201b92c/attachment-0022.txt>

