<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
Adam, <br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
Sorry for the delay here. After some internal experimentation, we discovered that OMB was not handling the
<code>--accelerator</code> option correctly. This option should not be used with the pt2pt tests, since the two buffers are set with the H/D/MH/MD arguments. However, instead of being ignored as it should have been, it was causing the benchmark to enter an alternate
code path and fail. <br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
Please continue running your experiments without the <code>--accelerator=cuda</code> option, and you should get the desired results. Our next release includes a fix that ignores this option for pt2pt tests, and we have added an acknowledgement to
you in the Changelog. <br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
As a side note, the <code>M</code> buffer option is deprecated. Please use either
<code>MH</code> or <code>MD</code> (managed host and managed device, respectively) instead.
<br>
</div>
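<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
As a rough illustration only, assuming the same hosts, launcher options, and build path as in your original report, and assuming managed device buffers are what you want on both ranks, the adjusted command might look like the following (substitute <code>MH</code> for managed host buffers): <br>
<pre>mpirun --mca mtl ofi -np 2 -H gpu01,gpu02 ./osu-micro-benchmarks-5.8/mpi/pt2pt/osu_bw MD MD</pre>
</div>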
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
Please let me know if you have any questions. <br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
Thanks, <br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
Nat<br>
</div>
<div id="appendonsend"></div>
<hr style="display:inline-block;width:98%" tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> Mvapich-discuss &lt;mvapich-discuss-bounces+shineman.5=osu.edu@lists.osu.edu&gt; on behalf of Goldman, Adam via Mvapich-discuss &lt;mvapich-discuss@lists.osu.edu&gt;<br>
<b>Sent:</b> Friday, November 12, 2021 09:10<br>
<b>To:</b> mvapich-discuss@lists.osu.edu &lt;mvapich-discuss@lists.osu.edu&gt;<br>
<b>Cc:</b> DAmbrosio, Cody J &lt;cody.j.dambrosio@intel.com&gt;; Rimmer, Todd &lt;todd.rimmer@intel.com&gt;; Bodner, Anton &lt;anton.bodner@intel.com&gt;<br>
<b>Subject:</b> Re: [Mvapich-discuss] osu_bw segfault when running with CUDA accelerator and managed buffers</font>
<div> </div>
</div>
<style>
<!--
@font-face
{font-family:"Cambria Math"}
@font-face
{font-family:Calibri}
@font-face
{font-family:"Lucida Calligraphy"}
@font-face
{font-family:"Arial Narrow"}
p.x_MsoNormal, li.x_MsoNormal, div.x_MsoNormal
{margin:0in;
font-size:11.0pt;
font-family:"Calibri",sans-serif}
a:link, span.x_MsoHyperlink
{color:#0563C1;
text-decoration:underline}
pre
{margin:0in;
margin-bottom:.0001pt;
font-size:10.0pt;
font-family:"Courier New"}
span.x_HTMLPreformattedChar
{font-family:"Courier New"}
span.x_EmailStyle21
{font-family:"Calibri",sans-serif;
color:#1F497D;
font-weight:normal;
font-style:normal;
text-decoration:none none}
.x_MsoChpDefault
{font-size:10.0pt}
@page WordSection1
{margin:1.0in 1.0in 1.0in 1.0in}
div.x_WordSection1
{}
-->
</style>
<div lang="EN-US" link="#0563C1" vlink="#954F72" style="word-wrap:break-word">
<div class="x_WordSection1">
<p class="x_MsoNormal"><span style="color:#1F497D">Hi, sending this again.</span></p>
<p class="x_MsoNormal"><span style="color:#1F497D"> </span></p>
<p class="x_MsoNormal"><span style="color:#1F497D">We are still seeing issues with the latest osu_bw benchmarks.</span></p>
<p class="x_MsoNormal"><span style="color:#1F497D"> </span></p>
<p class="x_MsoNormal"><span style="color:#1F497D">-Adam</span></p>
<p class="x_MsoNormal"><span style="color:#1F497D"> </span></p>
<div style="border:none; border-left:solid blue 1.5pt; padding:0in 0in 0in 4.0pt">
<div>
<div style="border:none; border-top:solid #E1E1E1 1.0pt; padding:3.0pt 0in 0in 0in">
<p class="x_MsoNormal"><b>From:</b> Mvapich-discuss &lt;mvapich-discuss-bounces+adam.goldman=intel.com@lists.osu.edu&gt;
<b>On Behalf Of </b>Goldman, Adam via Mvapich-discuss<br>
<b>Sent:</b> Tuesday, November 2, 2021 11:04 AM<br>
<b>To:</b> mvapich-discuss@lists.osu.edu<br>
<b>Cc:</b> DAmbrosio, Cody J &lt;cody.j.dambrosio@intel.com&gt;; Bodner, Anton &lt;anton.bodner@intel.com&gt;; Rimmer, Todd &lt;todd.rimmer@intel.com&gt;<br>
<b>Subject:</b> [Mvapich-discuss] osu_bw segfault when running with CUDA accelerator and managed buffers</p>
</div>
</div>
<p class="x_MsoNormal"> </p>
<p class="x_MsoNormal">Hello,</p>
<p class="x_MsoNormal"> </p>
<p class="x_MsoNormal">Hopefully you can help. We may have uncovered an issue in the latest osu_bw test (v5.8). It crashes when given the arguments below, while v5.7 with the exact same arguments and communications stack works fine.</p>
<p class="x_MsoNormal"> </p>
<p class="x_MsoNormal">Command: </p>
<pre style="background:whitesmoke"><span style="color:black">mpirun --mca mtl ofi -np 2 -H gpu01,gpu02 ./osu-micro-benchmarks-5.8/mpi/pt2pt/osu_bw </span><span style="font-size:9.0pt; color:#333333">--accelerator cuda M M</span></pre>
<p class="x_MsoNormal"> </p>
<p class="x_MsoNormal">If we remove the “<span style="font-size:9.0pt; color:#333333">--accelerator cuda</span>” argument, that seems to work.</p>
<p class="x_MsoNormal">Also, osu_latency and others appear to work without issue.
</p>
<p class="x_MsoNormal"> </p>
<p class="x_MsoNormal">BackTrace:</p>
<pre style="background:whitesmoke"><span style="font-size:9.0pt; color:#333333">(gdb) bt</span></pre>
<pre style="background:whitesmoke"><span style="font-size:9.0pt; color:#333333">#0 0x000014bae79c9c7a in __memmove_sse2_unaligned_erms () from /lib64/libc.so.6</span></pre>
<pre style="background:whitesmoke"><span style="font-size:9.0pt; color:#333333">#1 0x000014bae8d1a557 in ?? () from /lib64/libcuda.so.1</span></pre>
<pre style="background:whitesmoke"><span style="font-size:9.0pt; color:#333333">#2 0x000014bae8d1a5bc in ?? () from /lib64/libcuda.so.1</span></pre>
<pre style="background:whitesmoke"><span style="font-size:9.0pt; color:#333333">#3 0x000014bae8efd2e2 in ?? () from /lib64/libcuda.so.1</span></pre>
<pre style="background:whitesmoke"><span style="font-size:9.0pt; color:#333333">#4 0x000014bae8d1e851 in ?? () from /lib64/libcuda.so.1</span></pre>
<pre style="background:whitesmoke"><span style="font-size:9.0pt; color:#333333">#5 0x000014bae8d716cc in ?? () from /lib64/libcuda.so.1</span></pre>
<pre style="background:whitesmoke"><span style="font-size:9.0pt; color:#333333">#6 0x000014bae8f0fd47 in ?? () from /lib64/libcuda.so.1</span></pre>
<pre style="background:whitesmoke"><span style="font-size:9.0pt; color:#333333">#7 0x000014bae8d3280e in ?? () from /lib64/libcuda.so.1</span></pre>
<pre style="background:whitesmoke"><span style="font-size:9.0pt; color:#333333">#8 0x000014bae8d33514 in ?? () from /lib64/libcuda.so.1</span></pre>
<pre style="background:whitesmoke"><span style="font-size:9.0pt; color:#333333">#9 0x000014bae8f45c0f in ?? () from /lib64/libcuda.so.1</span></pre>
<pre style="background:whitesmoke"><span style="font-size:9.0pt; color:#333333">#10 0x000014bae8d83cd7 in cuMemsetD8_v2 () from /lib64/libcuda.so.1</span></pre>
<pre style="background:whitesmoke"><span style="font-size:9.0pt; color:#333333">#11 0x000014baea27f460 in ?? () from /usr/local/cuda/lib64/libcudart.so.11.0</span></pre>
<pre style="background:whitesmoke"><span style="font-size:9.0pt; color:#333333">#12 0x000014baea25b132 in ?? () from /usr/local/cuda/lib64/libcudart.so.11.0</span></pre>
<pre style="background:whitesmoke"><span style="font-size:9.0pt; color:#333333">#13 0x000014baea29c88e in cudaMemset () from /usr/local/cuda/lib64/libcudart.so.11.0</span></pre>
<pre style="background:whitesmoke"><span style="font-size:9.0pt; color:#333333">#14 0x00000000004068a3 in set_buffer_pt2pt (buffer=&lt;optimized out&gt;, rank=&lt;optimized out&gt;, type=&lt;optimized out&gt;, data=&lt;optimized out&gt;, size=&lt;optimized out&gt;)</span></pre>
<pre style="background:whitesmoke"><span style="font-size:9.0pt; color:#333333"> at ../../util/osu_util_mpi.c:829</span></pre>
<pre style="background:whitesmoke"><span style="font-size:9.0pt; color:#333333">#15 0x00000000004028a5 in main (argc=&lt;optimized out&gt;, argv=&lt;optimized out&gt;) at osu_bw.c:136</span></pre>
<p class="x_MsoNormal"> </p>
<p class="x_MsoNormal">We have reproduced this repeatably on several systems with different CUDA versions and GPU hardware.</p>
<p class="x_MsoNormal"> </p>
<p class="x_MsoNormal">Regards,</p>
<p class="x_MsoNormal"> </p>
<p class="x_MsoNormal"><span style="font-size:14.0pt; font-family:'Lucida Calligraphy'; color:#0070C0">Adam Goldman</span></p>
<p class="x_MsoNormal"><span style="font-family:'Arial Narrow',sans-serif; color:#767171">HPC Fabric Software Engineer</span></p>
<p class="x_MsoNormal"><span style="font-family:'Arial',sans-serif; color:#767171">Intel Corporation</span></p>
<p class="x_MsoNormal"><span style="color:#2F5496"><a href="mailto:adam.goldman@intel.com">adam.goldman@intel.com</a></span></p>
<p class="x_MsoNormal"> </p>
</div>
</div>
</div>
</body>
</html>