<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
ZQ,</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
This warning appears whenever you launch a fully subscribed job with the UCX netmod enabled. It is based on the observed behaviour of UCX, which we have seen allocate an extra progress thread for each process that calls
<code>ucx_init</code>. I cannot guarantee this will happen in every case, especially if you are using your own UCX version, but we have observed it in most, if not all, of our testing with UCX. If you are not seeing any performance impact and can observe no
oversubscription, then I would encourage you to disregard the warning and trust what you are observing.</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Regarding your second question, the warning is emitted by MVAPICH itself, which is why it does not appear with OpenMPI. I cannot speak to how OpenMPI handles these additional UCX progress threads or what, if anything, it does to notify the user.</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Thanks,<br>
Nat</div>
<div id="appendonsend"></div>
<hr style="display:inline-block;width:98%" tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> You, Zhi-Qiang &lt;zyou@osc.edu&gt;<br>
<b>Sent:</b> Thursday, October 31, 2024 12:11<br>
<b>To:</b> Shineman, Nat &lt;shineman.5@osu.edu&gt;; mvapich-discuss@lists.osu.edu &lt;mvapich-discuss@lists.osu.edu&gt;<br>
<b>Subject:</b> Re: Handling MVAPICH 3.0 full subscription warning</font>
<div> </div>
</div>
<style>
<!--
@font-face
{font-family:PMingLiU}
@font-face
{font-family:"Cambria Math"}
@font-face
{font-family:Calibri}
@font-face
{font-family:Aptos}
@font-face
{}
p.x_MsoNormal, li.x_MsoNormal, div.x_MsoNormal
{margin:0in;
font-size:12.0pt;
font-family:"Aptos",sans-serif}
p.x_MsoListParagraph, li.x_MsoListParagraph, div.x_MsoListParagraph
{margin-top:0in;
margin-right:0in;
margin-bottom:0in;
margin-left:.5in;
font-size:12.0pt;
font-family:"Aptos",sans-serif}
p.x_xmsonormal, li.x_xmsonormal, div.x_xmsonormal
{margin:0in;
font-size:12.0pt;
font-family:"Aptos",sans-serif}
span.x_EmailStyle21
{font-family:"Aptos",sans-serif;
color:windowtext}
.x_MsoChpDefault
{font-size:10.0pt}
@page WordSection1
{margin:1.0in 1.0in 1.0in 1.0in}
div.x_WordSection1
{}
ol
{margin-bottom:0in}
ul
{margin-bottom:0in}
-->
</style>
<div lang="EN-US" link="#467886" vlink="#96607D" style="word-wrap:break-word">
<div class="x_WordSection1">
<p class="x_MsoNormal"><span style="font-size:11.0pt">Hi Nat,</span></p>
<p class="x_MsoNormal"><span style="font-size:11.0pt"> </span></p>
<p class="x_MsoNormal"><span style="font-size:11.0pt">Thank you for the suggestion. I have a few questions:</span></p>
<p class="x_MsoNormal"><span style="font-size:11.0pt"> </span></p>
<ol start="1" type="1" style="margin-top:0in">
<li class="x_MsoListParagraph" style="margin-left:0in"><span style="font-size:11.0pt">Does this message indicate that oversubscription is occurring, or is it simply a warning that appears every time a full-node job is run? In one user’s case, I did not observe
any oversubscription, although the warning was present.</span></li><li class="x_MsoListParagraph" style="margin-left:0in"><span style="font-size:11.0pt">UCX is also the default for OpenMPI, but I did not see a similar warning when running a full-node job with OpenMPI. Why does this only happen with MVAPICH?</span></li></ol>
<p class="x_MsoNormal"><span style="font-size:11.0pt"> </span></p>
<p class="x_MsoNormal"><span style="font-size:11.0pt">Thank you,</span></p>
<p class="x_MsoNormal"><span style="font-size:11.0pt">ZQ</span></p>
<p class="x_MsoNormal"><span style="font-size:11.0pt"> </span></p>
<div id="x_mail-editor-reference-message-container">
<div>
<div>
<div style="border:none; border-top:solid #B5C4DF 1.0pt; padding:3.0pt 0in 0in 0in">
<p class="x_MsoNormal" style="margin-bottom:12.0pt"><b><span style="color:black">From:
</span></b><span style="color:black">Shineman, Nat &lt;shineman.5@osu.edu&gt;<br>
<b>Date: </b>Wednesday, October 16, 2024 at 2:16</span><span style="font-family:'Arial',sans-serif; color:black"> </span><span style="color:black">PM<br>
<b>To: </b>mvapich-discuss@lists.osu.edu &lt;mvapich-discuss@lists.osu.edu&gt;, You, Zhi-Qiang &lt;zyou@osc.edu&gt;<br>
<b>Subject: </b>Re: Handling MVAPICH 3.0 full subscription warning</span></p>
</div>
<div>
<p class="x_MsoNormal"><span style="color:black">Hi ZQ,</span></p>
</div>
<div>
<p class="x_MsoNormal"><span style="color:black"> </span></p>
</div>
<div>
<p class="x_MsoNormal"><span style="color:black">You are probably seeing degraded performance because you are still running the application at full subscription while requesting that MVAPICH reserve 2 cores per process. The warning would be more accurate
if it said that you should cap your runs at half subscription and set the listed environment variable. This would prevent you from oversubscribing cores. </span></p>
</div>
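<div>
<p class="x_MsoNormal"><span style="color:black">As a rough illustration of the half-subscription arithmetic, the sketch below works out how many ranks fit on a node when each process reserves 2 cores. The 48-core node size and the helper name <code>max_ranks_per_node</code> are hypothetical, chosen only for this example; <code>MVP_THREADS_PER_PROCESS</code> is the variable named in the warning.</span></p>
<pre>
```python
def max_ranks_per_node(cores_per_node, threads_per_process=2):
    """Return the most ranks a node can host with `threads_per_process` cores per rank."""
    if not (cores_per_node > 0 and threads_per_process > 0):
        raise ValueError("core and thread counts must be positive")
    # Integer division: leftover cores that cannot hold a full rank stay idle.
    return cores_per_node // threads_per_process


# Hypothetical node size, for illustration only.
cores_per_node = 48
# Value suggested by the MVAPICH warning: one core for the rank,
# one for the extra UCX progress thread.
threads_per_process = 2

ranks = max_ranks_per_node(cores_per_node, threads_per_process)
# Environment setting to pass to the job alongside the reduced rank count.
env = {"MVP_THREADS_PER_PROCESS": str(threads_per_process)}
print("ranks per node:", ranks)
print("environment:", env)
```
</pre>
<p class="x_MsoNormal"><span style="color:black">On the assumed 48-core node this gives 24 ranks, that is, half subscription. How the rank count and the variable are passed to the job depends on your launcher and scheduler.</span></p>
</div>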
<div>
<p class="x_MsoNormal"><span style="color:black"> </span></p>
</div>
<div>
<p class="x_MsoNormal"><span style="color:black">However, if you are seeing satisfactory performance with oversubscribed cores in full subscription, please feel free to ignore the warning.</span></p>
</div>
<div>
<p class="x_MsoNormal"><span style="color:black"> </span></p>
</div>
<div>
<p class="x_MsoNormal"><span style="color:black">Thanks,</span></p>
</div>
<div>
<p class="x_MsoNormal"><span style="color:black">Nat</span></p>
</div>
<div class="x_MsoNormal" align="center" style="text-align:center">
<hr size="2" width="98%" align="center">
</div>
<div id="x_divRplyFwdMsg">
<p class="x_MsoNormal"><b><span style="font-size:11.0pt; font-family:'Calibri',sans-serif; color:black">From:</span></b><span style="font-size:11.0pt; font-family:'Calibri',sans-serif; color:black"> Mvapich-discuss &lt;mvapich-discuss-bounces@lists.osu.edu&gt; on
behalf of You, Zhi-Qiang via Mvapich-discuss &lt;mvapich-discuss@lists.osu.edu&gt;<br>
<b>Sent:</b> Wednesday, October 16, 2024 11:50<br>
<b>To:</b> mvapich-discuss@lists.osu.edu &lt;mvapich-discuss@lists.osu.edu&gt;<br>
<b>Subject:</b> [Mvapich-discuss] Handling MVAPICH 3.0 full subscription warning</span>
</p>
<div>
<p class="x_MsoNormal"> </p>
</div>
</div>
<div>
<div>
<p class="x_xmsonormal"><span style="font-size:11.0pt">Hi,</span></p>
<p class="x_xmsonormal"><span style="font-size:11.0pt"> </span></p>
<p class="x_xmsonormal"><span style="font-size:11.0pt">We encountered the following warning message while running a full-node MPI job with MVAPICH 3.0:</span></p>
<p class="x_xmsonormal"><span style="font-size:11.0pt"><br>
[][mvp_generate_implicit_cpu_mapping] WARNING: You appear to be running at full subscription for this job. UCX spawns an additional thread for each process which may result in oversubscribed cores and poor performance. Please consider reserving at least 2 cores
per node for the additional threads, enabling SMT, or setting MVP_THREADS_PER_PROCESS=2 to ensure that sufficient resources are available.<br>
<br>
The suggestion to set MVP_THREADS_PER_PROCESS=2 not only fails to improve performance but actually degrades it. Can this warning message be safely ignored, or is there any action I need to take to address it?</span></p>
<p class="x_xmsonormal"><span style="font-size:11.0pt"> </span></p>
<p class="x_xmsonormal"><span style="font-size:11.0pt">Best,</span></p>
<p class="x_xmsonormal"><span style="font-size:11.0pt">ZQ</span></p>
<p class="x_xmsonormal"><span style="font-size:11.0pt"> </span></p>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</body>
</html>