[Mvapich-discuss] Issue while Installation of Mvapich2-x-advanced for OSU-INAM

Lieber, Matt lieber.31 at osu.edu
Wed Jan 24 11:15:32 EST 2024


Hi John,
First, you will need Slurm. Second, the MVAPICH2-X RPMs that have "slurm" in their name use srun as their job launcher, not mpirun_rsh. Also, section 5 of our user guide describes additional settings that are required for the MPI information to show up: https://mvapich.cse.ohio-state.edu/userguide/osu-inam/#_running_example .
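
As a rough illustration of the launcher point (a sketch only, not taken from the MVAPICH2-X or OSU INAM documentation), the following checks which launcher is usable for a given install. `pick_launcher` is a hypothetical helper name; the bin directory is the one shown later in this thread. It only reports availability and does not set any of the INAM-related options from the user guide.

```shell
#!/bin/sh
# pick_launcher BIN_DIR (hypothetical helper): print the launcher that
# appears usable for the MVAPICH2-X install whose binaries are in BIN_DIR.
#   mpirun_rsh  if BIN_DIR contains an executable mpirun_rsh
#   srun        otherwise, if Slurm's srun is found on PATH
#   none        if neither is available
pick_launcher() {
    bin_dir=$1
    if [ -x "$bin_dir/mpirun_rsh" ]; then
        echo "mpirun_rsh"
    elif command -v srun >/dev/null 2>&1; then
        echo "srun"
    else
        echo "none"
    fi
}

# Example: the Slurm build of MVAPICH2-X advanced from this thread.
pick_launcher /opt/mvapich2-x/gnu4.8.5/mofed5.0/advanced/slurm/bin
```

If this prints `none`, the Slurm builds have no launcher to run with on that machine, which matches the situation described in the quoted message below.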

-Matt
________________________________
From: evancervj <evancervj at cdac.in>
Sent: Wednesday, January 24, 2024 1:55 AM
To: mvapich-discuss at lists.osu.edu <mvapich-discuss at lists.osu.edu>; Lieber, Matt <lieber.31 at osu.edu>
Subject: Re: [Mvapich-discuss] Issue while Installation of Mvapich2-x-advanced for OSU-INAM

Hi Matt,

Thanks for the suggestions. I tried MVAPICH2-X Basic, and also MVAPICH2-X Advanced with the --nodeps flag. Both installations succeeded. I was able to see fabric-related information, such as the network topology and the run-time packet counters, on the INAM web interface.

However, process-level and MPI-related information did not appear on the INAM web interface. To export MPI-related information to OSU-INAM, I tried to run an application with MVAPICH2-X, but mpirun_rsh was missing from the MVAPICH2-X installation directory.
Neither MVAPICH2-X package (basic or advanced) provides launchers such as mpirun_rsh or mpirun for running applications.


rpm -Uvh --nodeps mvapich2-x-advanced-mofed5.0-gnu4.8.5-slurm-2.3-1.el7.x86_64.rpm
Preparing...                          ################################# [100%]
Updating / installing...
   1:mvapich2-x-advanced-mofed5.0-gnu4################################# [100%]

ll /opt/mvapich2-x/gnu4.8.5/mofed5.0/advanced/slurm/bin/
total 132
lrwxrwxrwx 1 root root     6 Jan 23 10:27 mpic++ -> mpicxx
-rwxr-xr-x 1 root root 10970 Jan 23 10:27 mpicc
-rwxr-xr-x 1 root root 12856 Jun  2  2021 mpichversion
-rwxr-xr-x 1 root root 10503 Jan 23 10:27 mpicxx
-rwxr-xr-x 1 root root 14218 Jan 23 10:27 mpif77
-rwxr-xr-x 1 root root 14218 Jan 23 10:27 mpif90
-rwxr-xr-x 1 root root 14278 Jun  2  2021 mpifort
-rwxr-xr-x 1 root root 12928 Jun  2  2021 mpiname
-rwxr-xr-x 1 root root 23256 Jun  2  2021 mpivars
-rwxr-xr-x 1 root root  3430 Jun  2  2021 parkill



rpm -Uvh mvapich2-x-basic-mofed5.0-gnu4.8.5-slurm-2.3-1.el7.x86_64.rpm
Preparing...                          ################################# [100%]
Updating / installing...
   1:mvapich2-x-basic-mofed5.0-gnu4.8.################################# [100%]

ll /opt/mvapich2-x/gnu4.8.5/mofed5.0/basic/slurm/bin/
total 132
lrwxrwxrwx 1 root root     6 Jan 24 11:54 mpic++ -> mpicxx
-rwxr-xr-x 1 root root 10756 Jan 24 11:54 mpicc
-rwxr-xr-x 1 root root 12856 May 20  2021 mpichversion
-rwxr-xr-x 1 root root 10324 Jan 24 11:54 mpicxx
-rwxr-xr-x 1 root root 14036 Jan 24 11:54 mpif77
-rwxr-xr-x 1 root root 14036 Jan 24 11:54 mpif90
-rwxr-xr-x 1 root root 14096 May 20  2021 mpifort
-rwxr-xr-x 1 root root 12928 May 20  2021 mpiname
-rwxr-xr-x 1 root root 23256 May 20  2021 mpivars
-rwxr-xr-x 1 root root  3430 May 20  2021 parkill


I was able to access the fabric information on the OSU-INAM interface. I also wanted to use OSU-INAM to observe process-level and MPI-level information, but could not because of the issue described above. Any pointers on this would be helpful.

Also, my current environment does not have Slurm or any other job scheduler installed. Is Slurm necessary for OSU-INAM to export MPI-level information?

Thanks
John


On January 20, 2024 at 6:36 AM "Lieber, Matt" <lieber.31 at osu.edu> wrote:
Hi John,
Sorry for the delay in getting back to you.  There are multiple options that could fix this.  If you do not plan to use the SHARP functionality, adding the flag --nodeps should fix the issue you are seeing. MVAPICH2-X basic should also work with INAM.  If you do wish to use SHARP, we will have to send a new rpm.  Please let us know how these options work for you.

Thanks,
Matt

________________________________
From: Mvapich-discuss <mvapich-discuss-bounces at lists.osu.edu> on behalf of V John via Mvapich-discuss <mvapich-discuss at lists.osu.edu>
Sent: Tuesday, January 16, 2024 7:41 AM
To: mvapich-discuss at lists.osu.edu <mvapich-discuss at lists.osu.edu>
Subject: [Mvapich-discuss] Issue while Installation of Mvapich2-x-advanced for OSU-INAM

Hi everyone,

I'm trying to use OSU-INAM on our cluster, which has a Mellanox InfiniBand ConnectX-6 interconnect and machines running CentOS 7.6. I downloaded the OSU-INAM package matching the environment (osu-inam-mysql-1.0-1.el7.x86_64.rpm), which also requires MOFED 5+, so I installed the MLNX_OFED_LINUX-5.0-1.0.0.0 package on the systems.


The OSU-INAM user guide states that MVAPICH2-X advanced is required for MPI usage information, so I tried to install it. The only binary on the MVAPICH website that matches the environment is mvapich2-x-advanced-mofed5.0-gnu4.8.5-slurm-2.3-1.el7.x86_64.rpm. However, the installation fails because of a dependency on libsharp_coll.so.4.

rpm -Uvh  mvapich2-x-advanced-mofed5.0-gnu4.8.5-slurm-2.3-1.el7.x86_64.rpm
error: Failed dependencies:
libsharp_coll.so.4()(64bit) is needed by mvapich2-x-advanced-mofed5.0-gnu4.8.5-slurm-2.3-1.el7.x86_64


The libsharp_coll library versions in the /opt/mellanox/sharp/lib directory, created by the installation of MLNX_OFED_LINUX-5.0-1.0.0.0, do not include the required version, libsharp_coll.so.4.

ls /opt/mellanox/sharp/lib | grep libsharp_coll.so
libsharp_coll.so
libsharp_coll.so.5
libsharp_coll.so.5.0.1
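
The mismatch above can be checked before attempting the install. A minimal sketch, assuming the SHARP libraries live in /opt/mellanox/sharp/lib as listed; `has_soname` is a hypothetical helper, not part of rpm or MOFED.

```shell
#!/bin/sh
# has_soname LIB_DIR SONAME (hypothetical helper): succeed if LIB_DIR has an
# entry named exactly SONAME that resolves to an existing file.
has_soname() {
    [ -e "$1/$2" ]
}

# The requirements of a package that is not yet installed can be listed with:
#   rpm -qp --requires mvapich2-x-advanced-mofed5.0-gnu4.8.5-slurm-2.3-1.el7.x86_64.rpm
lib_dir=/opt/mellanox/sharp/lib
for soname in libsharp_coll.so.4 libsharp_coll.so.5; do
    if has_soname "$lib_dir" "$soname"; then
        echo "$soname: present in $lib_dir"
    else
        echo "$soname: missing from $lib_dir"
    fi
done
```

This only compares file names in one directory; rpm itself resolves dependencies against its package database, so a soname that is present on disk but not provided by an installed package can still be reported as a failed dependency.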


Is there a fix for this issue?

Thanks
John
HPC Technologies Group
C-DAC pune

------------------------------------------------------------------------------------------------------------
[ C-DAC is on Social-Media too. Kindly follow us at:
Facebook: https://www.facebook.com/CDACINDIA & Twitter: @cdacindia ]

This e-mail is for the sole use of the intended recipient(s) and may
contain confidential and privileged information. If you are not the
intended recipient, please contact the sender by reply e-mail and destroy
all copies and the original message. Any unauthorized review, use,
disclosure, dissemination, forwarding, printing or copying of this email
is strictly prohibited and appropriate legal action will be taken.
------------------------------------------------------------------------------------------------------------


