<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:"Lucida Calligraphy";
panose-1:3 1 1 1 1 1 1 1 1 1;}
@font-face
{font-family:"Arial Narrow";
panose-1:2 11 6 6 2 2 2 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:#0563C1;
text-decoration:underline;}
pre
{mso-style-priority:99;
mso-style-link:"HTML Preformatted Char";
margin:0in;
margin-bottom:.0001pt;
font-size:10.0pt;
font-family:"Courier New";}
span.HTMLPreformattedChar
{mso-style-name:"HTML Preformatted Char";
mso-style-priority:99;
mso-style-link:"HTML Preformatted";
font-family:"Courier New";}
span.EmailStyle21
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:#1F497D;
font-weight:normal;
font-style:normal;
text-decoration:none none;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="#0563C1" vlink="#954F72" style="word-wrap:break-word">
<div class="WordSection1">
<p class="MsoNormal"><span style="color:#1F497D">Hi, sending this again.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">We are still seeing issues with the latest osu_bw benchmark.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">-Adam<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<div style="border:none;border-left:solid blue 1.5pt;padding:0in 0in 0in 4.0pt">
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b>From:</b> Mvapich-discuss &lt;mvapich-discuss-bounces+adam.goldman=intel.com@lists.osu.edu&gt;
<b>On Behalf Of </b>Goldman, Adam via Mvapich-discuss<br>
<b>Sent:</b> Tuesday, November 2, 2021 11:04 AM<br>
<b>To:</b> mvapich-discuss@lists.osu.edu<br>
<b>Cc:</b> DAmbrosio, Cody J &lt;cody.j.dambrosio@intel.com&gt;; Bodner, Anton &lt;anton.bodner@intel.com&gt;; Rimmer, Todd &lt;todd.rimmer@intel.com&gt;<br>
<b>Subject:</b> [Mvapich-discuss] osu_bw segfault when running with CUDA accelerator and managed buffers<o:p></o:p></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Hello,<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Hopefully you can help. We may have uncovered an issue in the latest osu_bw test (v5.8): it crashes when given the arguments below, while v5.7 works fine with the exact same arguments and communications stack.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Command: <o:p></o:p></p>
<pre style="background:whitesmoke"><span style="color:black">mpirun --mca mtl ofi -np 2 -H gpu01,gpu02 ./osu-micro-benchmarks-5.8/mpi/pt2pt/osu_bw </span><span style="font-size:9.0pt;color:#333333">--accelerator cuda M M<o:p></o:p></span></pre>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">If we remove the “<span style="font-size:9.0pt;color:#333333">--accelerator cuda</span>” argument, the benchmark seems to run without crashing.<o:p></o:p></p>
<p class="MsoNormal">Also, osu_latency and others appear to work without issue. <o:p>
</o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Backtrace:<o:p></o:p></p>
<pre style="background:whitesmoke"><span style="font-size:9.0pt;color:#333333">(gdb) bt<o:p></o:p></span></pre>
<pre style="background:whitesmoke"><span style="font-size:9.0pt;color:#333333">#0 0x000014bae79c9c7a in __memmove_sse2_unaligned_erms () from /lib64/libc.so.6<o:p></o:p></span></pre>
<pre style="background:whitesmoke"><span style="font-size:9.0pt;color:#333333">#1 0x000014bae8d1a557 in ?? () from /lib64/libcuda.so.1<o:p></o:p></span></pre>
<pre style="background:whitesmoke"><span style="font-size:9.0pt;color:#333333">#2 0x000014bae8d1a5bc in ?? () from /lib64/libcuda.so.1<o:p></o:p></span></pre>
<pre style="background:whitesmoke"><span style="font-size:9.0pt;color:#333333">#3 0x000014bae8efd2e2 in ?? () from /lib64/libcuda.so.1<o:p></o:p></span></pre>
<pre style="background:whitesmoke"><span style="font-size:9.0pt;color:#333333">#4 0x000014bae8d1e851 in ?? () from /lib64/libcuda.so.1<o:p></o:p></span></pre>
<pre style="background:whitesmoke"><span style="font-size:9.0pt;color:#333333">#5 0x000014bae8d716cc in ?? () from /lib64/libcuda.so.1<o:p></o:p></span></pre>
<pre style="background:whitesmoke"><span style="font-size:9.0pt;color:#333333">#6 0x000014bae8f0fd47 in ?? () from /lib64/libcuda.so.1<o:p></o:p></span></pre>
<pre style="background:whitesmoke"><span style="font-size:9.0pt;color:#333333">#7 0x000014bae8d3280e in ?? () from /lib64/libcuda.so.1<o:p></o:p></span></pre>
<pre style="background:whitesmoke"><span style="font-size:9.0pt;color:#333333">#8 0x000014bae8d33514 in ?? () from /lib64/libcuda.so.1<o:p></o:p></span></pre>
<pre style="background:whitesmoke"><span style="font-size:9.0pt;color:#333333">#9 0x000014bae8f45c0f in ?? () from /lib64/libcuda.so.1<o:p></o:p></span></pre>
<pre style="background:whitesmoke"><span style="font-size:9.0pt;color:#333333">#10 0x000014bae8d83cd7 in cuMemsetD8_v2 () from /lib64/libcuda.so.1<o:p></o:p></span></pre>
<pre style="background:whitesmoke"><span style="font-size:9.0pt;color:#333333">#11 0x000014baea27f460 in ?? () from /usr/local/cuda/lib64/libcudart.so.11.0<o:p></o:p></span></pre>
<pre style="background:whitesmoke"><span style="font-size:9.0pt;color:#333333">#12 0x000014baea25b132 in ?? () from /usr/local/cuda/lib64/libcudart.so.11.0<o:p></o:p></span></pre>
<pre style="background:whitesmoke"><span style="font-size:9.0pt;color:#333333">#13 0x000014baea29c88e in cudaMemset () from /usr/local/cuda/lib64/libcudart.so.11.0<o:p></o:p></span></pre>
<pre style="background:whitesmoke"><span style="font-size:9.0pt;color:#333333">#14 0x00000000004068a3 in set_buffer_pt2pt (buffer=&lt;optimized out&gt;, rank=&lt;optimized out&gt;, type=&lt;optimized out&gt;, data=&lt;optimized out&gt;, size=&lt;optimized out&gt;)<o:p></o:p></span></pre>
<pre style="background:whitesmoke"><span style="font-size:9.0pt;color:#333333"> at ../../util/osu_util_mpi.c:829<o:p></o:p></span></pre>
<pre style="background:whitesmoke"><span style="font-size:9.0pt;color:#333333">#15 0x00000000004028a5 in main (argc=&lt;optimized out&gt;, argv=&lt;optimized out&gt;) at osu_bw.c:136<o:p></o:p></span></pre>
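<p class="MsoNormal">As a reading aid for the trace above, here is a minimal Python sketch that keeps only the frames gdb could name. The helper name <span style="font-size:9.0pt;color:#333333">resolved_frames</span> and the abbreviated sample text are illustrative assumptions, not part of the OSU benchmarks or of gdb; in the sample, the optimized-out argument values are shortened to "..." and most unresolved frames are omitted.</p>
<pre style="background:whitesmoke">
```python
import re

# Matches gdb "bt" lines such as:
#   #10 0x000014bae8d83cd7 in cuMemsetD8_v2 () from /lib64/libcuda.so.1
# Group 1: frame number, group 2: symbol (or "??" when unresolved).
FRAME_RE = re.compile(r"^#(\d+)\s+0x[0-9a-fA-F]+\s+in\s+(\S+)")


def resolved_frames(backtrace):
    """Return (frame_number, symbol) pairs for frames gdb could name."""
    frames = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match and match.group(2) != "??":
            frames.append((int(match.group(1)), match.group(2)))
    return frames


# Abbreviated copy of the trace above (hypothetical sample input:
# argument values shortened to "...", most unresolved frames omitted).
SAMPLE = """\
#0 0x000014bae79c9c7a in __memmove_sse2_unaligned_erms () from /lib64/libc.so.6
#1 0x000014bae8d1a557 in ?? () from /lib64/libcuda.so.1
#10 0x000014bae8d83cd7 in cuMemsetD8_v2 () from /lib64/libcuda.so.1
#13 0x000014baea29c88e in cudaMemset () from /usr/local/cuda/lib64/libcudart.so.11.0
#14 0x00000000004068a3 in set_buffer_pt2pt (buffer=...) at ../../util/osu_util_mpi.c:829
#15 0x00000000004028a5 in main (argc=..., argv=...) at osu_bw.c:136
"""

if __name__ == "__main__":
    for number, symbol in resolved_frames(SAMPLE):
        print(f"#{number} {symbol}")
```
</pre>
<p class="MsoNormal">On the trace above this leaves frames 0, 10, 13, 14 and 15: main calls set_buffer_pt2pt, which calls cudaMemset, which reaches cuMemsetD8_v2 in libcuda and then the memmove in libc where the fault is reported.</p>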
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">We have reproduced this consistently on several systems with different CUDA versions and GPU hardware.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Regards,<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><span style="font-size:14.0pt;font-family:'Lucida Calligraphy';color:#0070C0">Adam Goldman<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:'Arial Narrow',sans-serif;color:#767171">HPC Fabric Software Engineer<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:'Arial',sans-serif;color:#767171">Intel Corporation<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#2F5496"><a href="mailto:adam.goldman@intel.com">adam.goldman@intel.com</a><o:p></o:p></span></p>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
</div>
</body>
</html>