<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Hi Aditya,</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Can you send me the output of <code>ldd</code> on your <code>libmpi.so</code>? My guess is that it is not linked against the Cray libfabric library, in which case the cxi provider will not be available. We typically find the correct libfabric
installation somewhere like <code>/opt/cray/libfabric</code>, but the location varies from system to system. You can select it at configure time with
<code>--with-libfabric=&lt;path/to/libfabric&gt;</code>. Alternatively, you can use <code>LD_PRELOAD</code> to switch the libfabric installation at runtime. This is a critical requirement on all Cray systems because the only way to access the Slingshot 11 network is through their proprietary OFI installation.
<br>
</div>
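<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
A minimal sketch of that check, assuming the library sits at a hypothetical <code>$MPI_HOME/lib/libmpi.so</code>. The helper name <code>libfabric_from_ldd</code> is made up for illustration. It reads <code>ldd</code> output on stdin and prints the path that libfabric resolves to, so you can see whether it points under <code>/opt/cray/libfabric</code> or somewhere else.
</div>
<pre>
```shell
# Print the resolved path of libfabric from `ldd` output read on stdin.
# Prints nothing if libfabric is absent or reported as "not found".
libfabric_from_ldd() {
    awk '$1 ~ /^libfabric\.so/ { if ($3 ~ /^\//) print $3 }'
}

# Usage (the libmpi.so path is an assumption; adjust to your install):
#   ldd "$MPI_HOME/lib/libmpi.so" | libfabric_from_ldd
```
</pre>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
An empty result would suggest that libfabric is either not linked at all or could not be resolved by the loader.
</div>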
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Thanks,</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Nat<br>
</div>
<div id="appendonsend"></div>
<hr style="display:inline-block;width:98%" tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> Kashi, Aditya &lt;kashia@ornl.gov&gt;<br>
<b>Sent:</b> Friday, January 19, 2024 12:31<br>
<b>To:</b> Shineman, Nat &lt;shineman.5@osu.edu&gt;; mvapich-discuss@lists.osu.edu &lt;mvapich-discuss@lists.osu.edu&gt;<br>
<b>Cc:</b> Matheson, Michael &lt;mathesonma@ornl.gov&gt;<br>
<b>Subject:</b> Re: [Mvapich-discuss] libpmi2 could not be found while building for Slurm</font>
<div> </div>
</div>
<div>
<style type="text/css" style="display:none">
<!--
p
{margin-top:0;
margin-bottom:0}
-->
</style>
<div class="x_elementToProof" style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
Hi Nat,</div>
<div class="x_elementToProof" style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<br>
</div>
<div class="x_elementToProof" style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
Thank you for the quick reply. Indeed, with those flags set, the app now fails with:</div>
<div class="x_elementToProof" style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<br>
</div>
<div class="x_elementToProof"><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">Abort(2665871) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Init: Other MPI error, error
stack:</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">MPIR_Init_thread(175).......:</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">MPID_Init(597)..............:</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">MPIDI_MVP_mpi_init_hook(289):</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">MPIDI_OFI_mpi_init_hook(637):</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">open_fabric(1338)...........:</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">find_provider(1431).........: OFI fi_getinfo() failed (ofi_init.c:1431:find_provider:No data available)</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)"><br>
</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">I guess that means it can't detect the CXI provider. Do you have any guess about where the issue might lie?</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)"><br>
</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">It is a GPU app. I'll take a look at MVAPICH-plus.</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)"><br>
</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">Best,</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">Aditya</span></div>
<div id="x_appendonsend"></div>
<hr tabindex="-1" style="display:inline-block; width:98%">
<div id="x_divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" color="#000000" style="font-size:11pt"><b>From:</b> Shineman, Nat &lt;shineman.5@osu.edu&gt;<br>
<b>Sent:</b> Friday, January 19, 2024 12:20 PM<br>
<b>To:</b> Kashi, Aditya &lt;kashia@ornl.gov&gt;; mvapich-discuss@lists.osu.edu &lt;mvapich-discuss@lists.osu.edu&gt;<br>
<b>Cc:</b> Matheson, Michael &lt;mathesonma@ornl.gov&gt;<br>
<b>Subject:</b> [EXTERNAL] Re: [Mvapich-discuss] libpmi2 could not be found while building for Slurm</font>
<div> </div>
</div>
<style type="text/css" style="display:none">
<!--
p
{margin-top:0;
margin-bottom:0}
-->
</style>
<div dir="ltr">
<div class="x_x_elementToProof" style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
Hi Aditya, <br>
</div>
<div class="x_x_elementToProof" style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<br>
</div>
<div class="x_x_elementToProof" style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
To get performance on par with Cray MPICH, you will need to run with <code>MPIR_CVAR_OFI_USE_PROVIDER=cxi</code> so that MVAPICH correctly detects the cxi provider. For good measure, you can also set
<code>FI_PROVIDER=cxi</code> to force OFI to allow only this provider. This will cause the application to fail if MVAPICH does not correctly identify the Slingshot provider. With these variables set, you should see the performance you expect on CPU
applications. For GPU applications, you will need to use our MVAPICH-Plus library, available on our downloads page.
<br>
</div>
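<div class="x_x_elementToProof" style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<br>
As a rough sketch, the two settings above could be applied in a job script like this. The function name <code>use_cxi_provider</code> and the binary <code>./my_app</code> are placeholders, not part of MVAPICH.
</div>
<pre>
```shell
# Export the two variables described above so that both MVAPICH and
# libfabric are restricted to the cxi provider.
use_cxi_provider() {
    export MPIR_CVAR_OFI_USE_PROVIDER=cxi   # provider selection in MVAPICH
    export FI_PROVIDER=cxi                  # provider filter in libfabric
}

use_cxi_provider

# Optional check, if libfabric's fi_info utility is on your PATH:
#   fi_info -p cxi
# Then launch as usual (placeholder command; adjust ranks and binary):
#   srun -n 16 ./my_app
```
</pre>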
<div class="x_x_elementToProof" style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<br>
</div>
<div class="x_x_elementToProof" style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
Regarding the subcommunicator issue, I will take a look at the reproducer and get back to you.
<br>
</div>
<div class="x_x_elementToProof" style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<br>
</div>
<div class="x_x_elementToProof" style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
Thanks,</div>
<div class="x_x_elementToProof" style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
Nat<br>
</div>
<div id="x_x_appendonsend"></div>
<hr tabindex="-1" style="display:inline-block; width:98%">
<div id="x_x_divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" color="#000000" style="font-size:11pt"><b>From:</b> Kashi, Aditya &lt;kashia@ornl.gov&gt;<br>
<b>Sent:</b> Friday, January 19, 2024 12:16<br>
<b>To:</b> Shineman, Nat &lt;shineman.5@osu.edu&gt;; mvapich-discuss@lists.osu.edu &lt;mvapich-discuss@lists.osu.edu&gt;<br>
<b>Cc:</b> Matheson, Michael &lt;mathesonma@ornl.gov&gt;<br>
<b>Subject:</b> Re: [Mvapich-discuss] libpmi2 could not be found while building for Slurm</font>
<div> </div>
</div>
<div>
<style type="text/css" style="display:none">
<!--
p
{margin-top:0;
margin-bottom:0}
-->
</style>
<div class="x_x_x_elementToProof" style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
Hi Nat,</div>
<div class="x_x_x_elementToProof" style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<br>
</div>
<div class="x_x_x_elementToProof" style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
Thank you for the suggestion! I managed to get the following build:</div>
<div class="x_x_x_elementToProof" style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<br>
</div>
<div><span style="font-family:'Aptos Mono',Aptos_EmbeddedFont,Aptos_MSFontService,monospace; font-size:12pt; color:rgb(0,0,0)">FFLAGS=-fallow-argument-mismatch ./configure --prefix=../Programs/mvapich-3.0rc-rocm543-slurm --with-device=ch4:ofi --enable-rocm
--with-rocm=$ROCM_PATH --with-ch4-shmmods=gpudirect --with-pm=slurm --with-pmi=cray --with-hwloc-prefix=$HWLOC_DIR</span></div>
<div class="x_x_x_elementToProof"><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)"><br>
</span></div>
<div class="x_x_x_elementToProof"><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">However, the application runs much more slowly than with Cray MPICH and, more importantly,
reduce and allreduce fail on subcommunicators created by <code>MPI_Comm_split</code> when the subcommunicator spans more than one node. The code I wrote to test this is here:
<a href="https://bitbucket.org/Slaedr/mpi-hip-test-suite/src/main/" id="LPlnk206765" class="x_x_x_OWAAutoLink">
https://bitbucket.org/Slaedr/mpi-hip-test-suite/src/main/</a>. It's a simple CMake build with
<code>MPI_HOME</code> pointing to the MPI install directory. I ran the <code>comm_reduce</code> test using </span></div>
<div class="x_x_x_elementToProof"><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)"><code>srun -n 16 -c 7 build/test/comm_reduce gpu</code> </span></div>
<div class="x_x_x_elementToProof"><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">on two nodes. The code essentially puts ranks 3 through 15 into a separate communicator
and calls <code>MPI_Allreduce</code> on that communicator, which then segfaults inside
<code>MPI_Allreduce</code>. With <code>MPI_COMM_WORLD</code> as the communicator, the same call works fine.</span></div>
<div class="x_x_x_elementToProof"><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)"><br>
</span></div>
<div class="x_x_x_elementToProof"><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">Should I try some other build setting for MVAPICH? Please let me know if I should provide
any more details.</span></div>
<div class="x_x_x_elementToProof"><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">Best,</span></div>
<div class="x_x_x_elementToProof"><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">Aditya</span></div>
<div id="x_x_x_appendonsend"></div>
<hr tabindex="-1" style="display:inline-block; width:98%">
<div id="x_x_x_divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" color="#000000" style="font-size:11pt"><b>From:</b> Shineman, Nat &lt;shineman.5@osu.edu&gt;<br>
<b>Sent:</b> Wednesday, January 17, 2024 1:02 PM<br>
<b>To:</b> mvapich-discuss@lists.osu.edu &lt;mvapich-discuss@lists.osu.edu&gt;; Kashi, Aditya &lt;kashia@ornl.gov&gt;<br>
<b>Cc:</b> Matheson, Michael &lt;mathesonma@ornl.gov&gt;<br>
<b>Subject:</b> [EXTERNAL] Re: [Mvapich-discuss] libpmi2 could not be found while building for Slurm</font>
<div> </div>
</div>
<style type="text/css" style="display:none">
<!--
p
{margin-top:0;
margin-bottom:0}
-->
</style>
<div dir="ltr">
<div class="x_x_x_x_elementToProof" style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
Hi Aditya,</div>
<div class="x_x_x_x_elementToProof" style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<br>
</div>
<div class="x_x_x_x_elementToProof" style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
We have found that Cray systems typically use a different version of PMI/PMI2 than other Slurm installations. Can you please try building with
<code>--with-pmi=cray</code> instead of giving the pmi2 path? This has been more successful in most of our tests. If you still have issues, you may also need to add
<code>--with-craypmi=&lt;path/to/craypmi/dir&gt;</code> to ensure the right version is picked up.</div>
<div class="x_x_x_x_elementToProof" style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<br>
</div>
<div class="x_x_x_x_elementToProof" style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
Thanks,</div>
<div class="x_x_x_x_elementToProof" style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
Nat</div>
<div id="x_x_x_x_appendonsend"></div>
<hr style="display:inline-block; width:98%">
<div id="x_x_x_x_divRplyFwdMsg" dir="ltr"><span style="font-family:Calibri,sans-serif; font-size:11pt; color:rgb(0,0,0)"><b>From:</b> Mvapich-discuss &lt;mvapich-discuss-bounces+shineman.5=osu.edu@lists.osu.edu&gt; on behalf of Kashi, Aditya via Mvapich-discuss &lt;mvapich-discuss@lists.osu.edu&gt;<br>
<b>Sent:</b> Tuesday, January 9, 2024 18:18<br>
<b>To:</b> mvapich-discuss@lists.osu.edu &lt;mvapich-discuss@lists.osu.edu&gt;<br>
<b>Cc:</b> Matheson, Michael &lt;mathesonma@ornl.gov&gt;<br>
<b>Subject:</b> [Mvapich-discuss] libpmi2 could not be found while building for Slurm</span>
<div> </div>
</div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">Hi everyone,</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)"><br>
</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">I'm trying to build MVAPICH 3.0rc on a Cray Shasta system with Slurm. This is my current build line:</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)"><br>
</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">FFLAGS=-fallow-argument-mismatch ./configure --with-pmi=pmi2 --enable-slurm --with-pm=slurm --enable-rocm --with-rocm=$ROCM_PATH
--with-libfabric=/opt/cray/libfabric/1.15.2.0 --with-pmi2-libdir=/usr/lib64/slurmpmi</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)"><br>
</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">It's able to find pmi2.h, but not libpmi2.so:</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)"><br>
</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">...</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">configure: RUNNING CONFIGURE FOR src/pm/slurm</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">checking for srun... /usr/bin/srun</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">checking slurm/pmi2.h usability... yes</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">checking slurm/pmi2.h presence... yes</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">checking for slurm/pmi2.h... yes</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">checking for /usr/include/slurm/pmi2.h... yes</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">./configure: line 60908: found: command not found</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">checking for PMI2_Init in -lpmi2... no</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">configure: error: could not find the slurm libpmi2. Configure aborted</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)"><br>
</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">However, I can see the file /usr/lib64/slurmpmi/libpmi2.so, which is symlinked to /usr/lib64/slurmpmi/libpmi2.so.0.0.0.
I've tried variations of the last flag like "--with-pmi-libdir", "--with-pmi2-lib=.../libpmi2.so" etc.</span></div>
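<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)"><br>
One way to narrow this down, sketched under the assumption that GNU <code>nm</code> is available, is to check whether the library actually exports <code>PMI2_Init</code>, since that is the symbol the configure check above looks for. The helper name <code>has_pmi2_init</code> is made up for illustration.</span></div>
<pre>
```shell
# Succeed (exit 0) if the symbol listing on stdin contains PMI2_Init.
# Intended input is the output of: nm -D --defined-only path/to/libpmi2.so
has_pmi2_init() {
    awk '$NF == "PMI2_Init" { found = 1 } END { exit !found }'
}

# Usage, with the library path from the message above:
#   nm -D --defined-only /usr/lib64/slurmpmi/libpmi2.so | has_pmi2_init
#   echo "exit status: $?"
```
</pre>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">If the symbol is present, the failure more likely comes from the linker search path or a missing dependency than from the library itself.</span></div>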
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)"><br>
</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">Is there a known way to build MVAPICH with a scalable backend on this kind of system? Getting the best possible performance
at scale is absolutely necessary for what I'm trying to do.</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)"><br>
</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">Thanks,</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">Aditya Kashi</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">Analytics and AI Methods at Scale</span></div>
<div><span style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">Oak Ridge National Laboratory</span></div>
</div>
</div>
</div>
</div>
</body>
</html>