[Mvapich-discuss] GENTLE_REMINDER: Issue while Installation of Mvapich2-x-advanced for OSU-INAM

evancervj evancervj at cdac.in
Fri Apr 12 00:57:47 EDT 2024


Hi Pouya Kousha,

Thanks for the reply. I will try the suggestion given.

Thanks
John


On April 5, 2024 at 10:03 PM "Kousha, Pouya" <kousha.2 at buckeyemail.osu.edu>
wrote:

> 
>  Hi John,
> 
> 
> 
>  Sorry for the delay in getting back to you.
> 
> 
> 
>  I tested our RPMs using this link
> <http://mvapich.cse.ohio-state.edu/download/mvapich/mv2x/2.3/mofed5.0/mvapich2-x-basic-mofed5.0-gnu4.8.5-slurm-2.3-1.el7.x86_64.rpm>
> as well as building from source.
> 
>  With srun, instead of exporting common, we need to export each variable
> individually. So replace the line below with one export per variable.
> 
>    export common="MV2_TWO_LEVEL_COMM_THRESHOLD=1,MV2_USE_RDMA_CM=0,\
>                   MV2_TOOL_REPORT_PVARS=1,MV2_DEBUG_TOOL_VERBOSE=1,\
>                   MV2_USE_SHMEM_COLL=1,MV2_ENABLE_PVAR_TIMER=1,\
>                   MV2_ENABLE_PVAR_COUNTER=1,MV2_ENABLE_PVAR_TIMER_BUCKETS=1,\
>                   MV2_ENABLE_PVAR_COUNTER_BUCKETS=1,MV2_TOOL_REPORT_SESSIONS=1,\
>                   MV2_TOOL_SESSIONS_DEFAULT_ALL_HANDLES=1,\
>                   MV2_TOOL_REPORT_LUSTRE_STATS=0,MV2_ON_DEMAND_THRESHOLD=1,\
>                   MV2_TOOL_INFO_FILE_PATH=/etc/osu-inam/osu-inam.conf"
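
A minimal sketch of the "export each variable individually" advice, assuming a comma-separated list like the one above. The helper name export_each is hypothetical (it is not an MVAPICH2 or Slurm tool), and the list is shortened for illustration. The helper strips whitespace from each pair because, inside double quotes, a backslash-newline is removed but the indentation of the following line stays in the string.

```shell
#!/bin/sh
# Sketch: turn a comma-separated NAME=VALUE list into individual exports.
# The list below is a shortened copy of the one quoted in this thread.
common="MV2_TWO_LEVEL_COMM_THRESHOLD=1,MV2_USE_RDMA_CM=0,\
        MV2_TOOL_REPORT_PVARS=1,MV2_ON_DEMAND_THRESHOLD=1"

# export_each is a hypothetical helper name used only for this example.
export_each() {
    old_ifs=$IFS
    IFS=','
    for pair in $1; do
        # Drop the indentation that line continuations leave inside the string.
        pair=$(printf '%s' "$pair" | tr -d '[:space:]')
        if [ -n "$pair" ]; then
            export "$pair"
        fi
    done
    IFS=$old_ifs
}

export_each "$common"

# Show what was exported. Values must not contain commas or spaces.
env | grep '^MV2_' | sort
```

After this runs, each MV2_* variable is an ordinary exported variable in the launching shell, which is what srun --export=ALL forwards to the tasks.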
> 
> 
> 
>  I tested against INAM with both MySQL and ClickHouse Databases and I was able
> to see the information on the INAM web page. The RPMs are available here
> <http://mvapich.cse.ohio-state.edu/downloads/>
> 
> 
> 
>  Kindly let me know if this would resolve your issue. Thanks!
> 
> 
> 
>  I am listing the content of the test file I used to run MVAPICH2-x and
> sending information to OSU INAM below:
> 
> 
> 
>  export PPN=8
> 
>  export N=$SLURM_NNODES
> 
>  export n=$(($N*$PPN))
> 
> 
> 
>  export LD_LIBRARY_PATH=/home/kousha.2/test-rpm/opt/mvapich2-x/gnu4.8.5/mofed5.0/basic/slurm/lib64:$LD_LIBRARY_PATH
> 
>  export PATH=/home/kousha.2/test-rpm/opt/mvapich2-x/gnu4.8.5/mofed5.0/basic/slurm/bin:$PATH
> 
>  #export LD_LIBRARY_PATH=/home/kousha.2/projects/master-x/mvapich2/install-dir/lib:$LD_LIBRARY_PATH
> 
>  #export PATH=/home/kousha.2/projects/master-x/mvapich2/install-dir/bin:$PATH
> 
>  export MPI="/home/kousha.2/projects/master-x/mvapich2/install-dir"
> 
>  export OMB="/home/kousha.2/projects/master-x/mvapich2/install-dir"
> 
> 
> 
>  export MV2_TOOL_INFO_FILE_PATH="/home/kousha.2/inam.conf"
> 
>  export HOSTFILE=$PWD/hostfile.$SLURM_JOBID
> 
> 
> 
>  echo "mpi=$MPI"
> 
>  echo "($(which mpicc))"
> 
> 
> 
>  echo "path=$PATH"
> 
>  echo "LD=$LD_LIBRARY_PATH"
> 
>  echo 'START'
> 
> 
> 
>  export MV2_TWO_LEVEL_COMM_THRESHOLD=1
> 
>  export MV2_USE_RDMA_CM=0
> 
>  export MV2_TOOL_REPORT_PVARS=1
> 
>  # Disable DEBUG when running in production
> 
>  export MV2_DEBUG_TOOL_VERBOSE=2
> 
>  export MV2_USE_SHMEM_COLL=1
> 
>  export MV2_ENABLE_PVAR_TIMER=1
> 
>  export MV2_ENABLE_PVAR_COUNTER=1
> 
>  export MV2_ENABLE_PVAR_TIMER_BUCKETS=1
> 
>  export MV2_ENABLE_PVAR_COUNTER_BUCKETS=1
> 
>  export MV2_TOOL_REPORT_SESSIONS=1
> 
>  export MV2_TOOL_SESSIONS_DEFAULT_ALL_HANDLES=1
> 
>  export MV2_TOOL_REPORT_LUSTRE_STATS=0
> 
>  export MV2_ON_DEMAND_THRESHOLD=1
> 
> 
> 
> 
> 
>  for((i=0; i<100; i++))
> 
>  do
> 
>      echo " srun --mpi=pmi2 --export=ALL -N $N -n $n $OMB/libexec/osu-micro-benchmarks/mpi/collective/osu_allreduce -i 5000   "
> 
>      srun --mpi=pmi2 --export=ALL -N $N -n $n \
>          $OMB/libexec/osu-micro-benchmarks/mpi/collective/osu_allreduce -i 5000
> 
>  done
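
Since --export=ALL only forwards variables that are already in the launching shell's environment, it can help to confirm the exports before calling srun. A minimal sketch, assuming a hypothetical helper named check_mv2_env and a shortened list of variable names taken from this thread:

```shell
#!/bin/sh
# Names taken from this thread; extend the list as needed.
required="MV2_TOOL_REPORT_PVARS MV2_ENABLE_PVAR_TIMER MV2_ENABLE_PVAR_COUNTER MV2_TOOL_INFO_FILE_PATH"

# check_mv2_env is a hypothetical helper: it prints each missing name to
# stderr and returns non-zero if any variable is absent from the environment.
check_mv2_env() {
    missing=0
    for name in $required; do
        if ! env | grep -q "^${name}="; then
            echo "not exported: $name" >&2
            missing=1
        fi
    done
    return $missing
}

# Example: export three of the four names, then run the check.
export MV2_TOOL_REPORT_PVARS=1 MV2_ENABLE_PVAR_TIMER=1 MV2_ENABLE_PVAR_COUNTER=1
unset MV2_TOOL_INFO_FILE_PATH
if check_mv2_env; then
    echo "all MV2 variables are exported"
else
    echo "fix the exports before calling srun"
fi
```

The check looks at the output of env rather than at shell variables, so a variable that was assigned but never exported is reported as missing.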
> 
> 
> 
> 
> 
>  Best,
> 
>  Pouya Kousha
> 
> 
> 
>  From: evancervj <evancervj at cdac.in>
>  Date: Tuesday, March 26, 2024 at 6:16 AM
>  To: Kousha, Pouya <kousha.2 at buckeyemail.osu.edu>, Announcement about MVAPICH2
> (MPI over InfiniBand, RoCE, Omni-Path, iWARP and EFA) Libraries developed at
> NBCL/OSU <mvapich-discuss at lists.osu.edu>, Lieber, Matt <lieber.31 at osu.edu>
>  Subject: GENTLE_REMINDER: [Mvapich-discuss] Issue while Installation of
> Mvapich2-x-advanced for OSU-INAM
> 
> 
> 
> 
> 
> 
> 
>  On February 21, 2024 at 2:44 PM evancervj <evancervj at cdac.in> wrote:
> 
>   > > 
> >   Hi Pouya Kousha,
> > 
> > 
> > 
> >   Sorry for delay in replying with the log files gathered during the trials.
> > 
> > 
> > 
> >   The following changes were made in inamd.conf. Once the changes were made,
> > the osu-inamd and osu-inamweb services were restarted.
> > 
> >     ##### INAM debug flags #####
> > 
> >     INAM_DEBUG_INIT_VERBOSE=1
> >     INAM_DEBUG_TIME_VERBOSE=1
> >     INAM_DEBUG_MAIN_VERBOSE=2
> >     INAM_DEBUG_DB_VERBOSE=2
> >     INAM_DEBUG_NW_VERBOSE=1
> >     INAM_DEBUG_FB_VERBOSE=1
> > 
> >   The trial is being done on a single node where the needed packages and
> > dependencies are installed. This test node(rt03d) is connected to a Mellanox
> > switch and this switch is connected to three other systems.
> > 
> >   The application used in the trial is IMB on the test node. The following
> > commands were fired to run the application using srun with the MV2 flags
> > that were suggested:
> > 
> > 
> > 
> >     export common="MV2_TWO_LEVEL_COMM_THRESHOLD=1,MV2_USE_RDMA_CM=0,\
> >                    MV2_TOOL_REPORT_PVARS=1,MV2_DEBUG_TOOL_VERBOSE=1,\
> >                    MV2_USE_SHMEM_COLL=1,MV2_ENABLE_PVAR_TIMER=1,\
> >                    MV2_ENABLE_PVAR_COUNTER=1,MV2_ENABLE_PVAR_TIMER_BUCKETS=1,\
> >                    MV2_ENABLE_PVAR_COUNTER_BUCKETS=1,MV2_TOOL_REPORT_SESSIONS=1,\
> >                    MV2_TOOL_SESSIONS_DEFAULT_ALL_HANDLES=1,\
> >                    MV2_TOOL_REPORT_LUSTRE_STATS=0,MV2_ON_DEMAND_THRESHOLD=1,\
> >                    MV2_TOOL_INFO_FILE_PATH=/etc/osu-inam/osu-inam.conf"
> > 
> > 
> > 
> >     (time srun --mpi=pmi2 -n12 --export=$common -v \
> >       ./IMB-MPI1 -msglen 1Mfile_4K_inc -npmin 12 2>&1) 2>&1 | tee /tmp/log_IMB-MPI1_OSU_INAM
> > 
> > 
> > 
> >   The osu-inam.conf file created is as follows:
> > 
> > 
> > 
> >      MV2_TOOL_QPN=4719
> >      MV2_TOOL_LID=3
> >      MV2_TOOL_COUNTER_INTERVAL=3
> >      MV2_TOOL_REPORT_CPU_UTIL=1
> >      MV2_TOOL_REPORT_MEM_UTIL=1
> >      MV2_TOOL_REPORT_IO_UTIL=1
> >      MV2_TOOL_REPORT_COMM_GRID=1
> >      MV2_TOOL_REPORT_LUSTRE_STATS=0
> >      MV2_TOOL_REPORT_PVARS=1
> > 
> >   Please find attached the log files related to osu-inam.conf,
> > osu-inamd.conf, /var/log/messages and IMB(application) output.
> > 
> > 
> > 
> >   Thanks for the continued support in this issue.
> > 
> > 
> > 
> >   -John
> > 
> > 
> > 
> > 
> >   On February 14, 2024 at 9:43 PM "Kousha, Pouya"
> > <kousha.2 at buckeyemail.osu.edu> wrote:
> > 
> >    > > > 
> > >    Hi John,
> > > 
> > > 
> > > 
> > >    I hope this message finds you well. My name is Pouya Kousha, and I'm
> > > the lead developer for OSU INAM. Thank you for reaching out to us. To
> > > provide you with the most accurate support, we need to gather more
> > > detailed information about the issue you're encountering.
> > > 
> > > 
> > > 
> > >    Could you please assist us by enabling additional debugging features?
> > > This can be done by setting specific variables in your osu-inamd.conf
> > > file, which is located in your INAM installation directory. Please update
> > > the file with the following settings:
> > > 
> > >    INAM_DEBUG_INIT_VERBOSE=1
> > > 
> > >    INAM_DEBUG_TIME_VERBOSE=1
> > > 
> > >    INAM_DEBUG_MAIN_VERBOSE=2
> > > 
> > >    INAM_DEBUG_DB_VERBOSE=2
> > > 
> > >    INAM_DEBUG_NW_VERBOSE=1
> > > 
> > >    INAM_DEBUG_FB_VERBOSE=1
> > > 
> > > 
> > > 
> > >    After updating these settings, kindly restart the INAM daemon to apply
> > > the changes.
> > > 
> > >    Furthermore, we would appreciate it if you could share the log file
> > > segment from /var/log/messages that pertains to INAM, as well as the
> > > updated osu-inam.conf file. This information will greatly aid in our
> > > diagnostics.
> > > 
> > > 
> > > 
> > >    Please also add the following to the srun command. You can set
> > > MV2_DEBUG_TOOL_VERBOSE=1 to check whether MVAPICH2 is sending
> > > information to INAM. It would be helpful if you could send us the
> > > output as well (it will be lengthy); 5-10 minutes of application log
> > > should be enough.
> > > 
> > > 
> > > 
> > > 
> > > 
> > >    Additionally, to assess if there's an interaction issue with Mvapich2,
> > > please add the following environment variables to your srun command:
> > > 
> > > 
> > > 
> > >    export common="MV2_TWO_LEVEL_COMM_THRESHOLD=1 MV2_USE_RDMA_CM=0
> > > MV2_TOOL_REPORT_PVARS=1 MV2_DEBUG_TOOL_VERBOSE=1 MV2_USE_SHMEM_COLL=1
> > > MV2_ENABLE_PVAR_TIMER=1 MV2_ENABLE_PVAR_COUNTER=1
> > > MV2_ENABLE_PVAR_TIMER_BUCKETS=1 MV2_ENABLE_PVAR_COUNTER_BUCKETS=1
> > >  MV2_TOOL_REPORT_SESSIONS=1 MV2_TOOL_SESSIONS_DEFAULT_ALL_HANDLES=1
> > > MV2_TOOL_REPORT_LUSTRE_STATS=0"
> > > 
> > > 
> > > 
> > >    srun … $common \
> > > 
> > >           MV2_DEBUG_TOOL_VERBOSE=1 \
> > > 
> > >           MV2_TWO_LEVEL_COMM_THRESHOLD=1 \
> > > 
> > >           MV2_ON_DEMAND_THRESHOLD=1 \
> > > 
> > >           MV2_USE_RDMA_CM=0 \
> > > 
> > >           MV2_TOOL_INFO_FILE_PATH="<path to osu-inam.conf>" \
> > > 
> > >           application
> > > 
> > > 
> > > 
> > >    Please ensure you replace <path to osu-inam.conf> with the actual path
> > > to your configuration file. The output from this operation might be
> > > extensive, but it's needed for our investigation. You can set
> > > MV2_DEBUG_TOOL_VERBOSE=0 for future runs. This runtime variable is for
> > > debugging purposes only.
> > > 
> > > 
> > > 
> > >    Once you have gathered all the requested information, please send it to
> > > us and we will evaluate and get back to you.
> > > 
> > >    If you have any questions or encounter any difficulties along the way,
> > > do not hesitate to reach out.
> > > 
> > > 
> > > 
> > > 
> > > 
> > >    Best,
> > > 
> > >    Pouya Kousha
> > > 
> > > 
> > > 
> > >    From: Mvapich-discuss <mvapich-discuss-bounces at lists.osu.edu> on behalf
> > > of evancervj via Mvapich-discuss <mvapich-discuss at lists.osu.edu>
> > >    Date: Tuesday, February 13, 2024 at 1:41 PM
> > >    To: mvapich-discuss at lists.osu.edu <mvapich-discuss at lists.osu.edu>,
> > > Lieber, Matt <lieber.31 at osu.edu>
> > >    Subject: Re: [Mvapich-discuss] Issue while Installation of
> > > Mvapich2-x-advanced for OSU-INAM
> > > 
> > > 
> > > 
> > > 
> > >    Hi Matt,
> > > 
> > > 
> > > 
> > >    I have installed Slurm package (slurm-23.11.3-1.el7.x86_64) on the
> > > system. Next I used srun for running the application(IMB) using the
> > > following command providing the MV2 options as mentioned in the section
> > > 'Running Example' of the userguide:
> > > 
> > > 
> > > 
> > >             srun --mpi=pmi2 -n16 \
> > > 
> > >             --export=MV2_ON_DEMAND_THRESHOLD=1,\
> > > 
> > >             MV2_TOOL_INFO_FILE_PATH=/etc/osu-inam/osu-inam.conf,\
> > > 
> > >             MV2_TWO_LEVEL_COMM_THRESHOLD=1,MV2_USE_RDMA_CM=0,\
> > > 
> > >             MV2_TOOL_REPORT_PVARS=1,MV2_ENABLE_PVAR_TIMER=1,\
> > > 
> > >             MV2_TOOL_REPORT_PVARS=1,MV2_ENABLE_PVAR_TIMER=1,\
> > > 
> > >             MV2_ENABLE_PVAR_COUNTER=1,MV2_ENABLE_PVAR_TIMER_BUCKETS=1,\
> > > 
> > >             MV2_ENABLE_PVAR_COUNTER=1,MV2_ENABLE_PVAR_TIMER_BUCKETS=1,\
> > > 
> > > 
> > >            MV2_TOOL_SESSIONS_DEFAULT_ALL_HANDLES=1,MV2_TOOL_REPORT_LUSTRE_STATS=1 \
> > > 
> > >            ./IMB-MPI1
> > > 
> > > 
> > >    In addition to the packet counter information available earlier, this
> > > time job information such as the Job-ID is visible and is being updated
> > > on the OSU_INAM web interface.
> > > 
> > >    However, MPI information is not exported. Fields such as CPU and
> > > memory usage for the jobs show the message "no data from mpi process to
> > > this job". MPI-related graphs, such as the global or inter-node
> > > communication graph, show the message "no data to display".
> > > 
> > >    Is there something I'm missing for process-level and MPI-level
> > > information to be exported? How can I get MPI-level information to show
> > > up on the OSU-INAM interface?
> > > 
> > > 
> > > 
> > >    Thanks
> > > 
> > >    John
> > > 
> > > 
> > >    On January 24, 2024 at 9:45 PM "Lieber, Matt" <lieber.31 at osu.edu>
> > > wrote:
> > > 
> > >     > > > > 
> > > >     Hi John,
> > > > 
> > > >     First, you will need Slurm.  Second, the mvapich2x RPMs that have
> > > > slurm in their name use srun as their job launcher, not mpirun_rsh.
> > > > Also, section 5 of our user guide has other information that is required
> > > > for the data to show up:
> > > > https://mvapich.cse.ohio-state.edu/userguide/osu-inam/#_running_example
> > > > 
> > > > 
> > > > 
> > > >     -Matt
> > > > 
> > > > 
> > > > 
> > > > 
> > > >    ------------------------------------------------------------
> > > > 
> > > >     From: evancervj <evancervj at cdac.in>
> > > >     Sent: Wednesday, January 24, 2024 1:55 AM
> > > >     To: mvapich-discuss at lists.osu.edu <mvapich-discuss at lists.osu.edu>;
> > > > Lieber, Matt <lieber.31 at osu.edu>
> > > >     Subject: Re: [Mvapich-discuss] Issue while Installation of
> > > > Mvapich2-x-advanced for OSU-INAM
> > > > 
> > > > 
> > > > 
> > > > 
> > > > 
> > > > 
> > > >     HI Matt,
> > > > 
> > > > 
> > > > 
> > > >     Thanks for the suggestions. I had tried MVAPICH2X-Basic as well as
> > > > using the flag --nodeps for MVAPICH2-X-Advanced. The installation is
> > > > successful. I was able to see the fabric related information like
> > > > network topology and the run-time packet counter information on the INAM
> > > > web interface.
> > > > 
> > > > 
> > > > 
> > > >     However, process-level and MPI-related information was not shown on
> > > > the INAM web interface. To export MPI-related information to OSU_INAM, I
> > > > tried running an application with MVAPICH2X, but the mpirun_rsh file was
> > > > missing from the MVAPICH2X installation directory.
> > > > 
> > > >     Neither MVAPICH2X package (basic or advanced) provided launchers such
> > > > as mpirun_rsh or mpirun to run applications.
> > > > 
> > > > 
> > > > 
> > > > 
> > > > 
> > > >     rpm -Uvh --nodeps mvapich2-x-advanced-mofed5.0-gnu4.8.5-slurm-2.3-1.el7.x86_64.rpm
> > > >     Preparing...                          ################################# [100%]
> > > >     Updating / installing...
> > > >       1:mvapich2-x-advanced-mofed5.0-gnu4################################# [100%]
> > > > 
> > > > 
> > > > 
> > > >     ll /opt/mvapich2-x/gnu4.8.5/mofed5.0/advanced/slurm/bin/
> > > >     total 132
> > > >     lrwxrwxrwx 1 root root     6 Jan 23 10:27 mpic++ -> mpicxx
> > > >     -rwxr-xr-x 1 root root 10970 Jan 23 10:27 mpicc
> > > >     -rwxr-xr-x 1 root root 12856 Jun  2  2021 mpichversion
> > > >     -rwxr-xr-x 1 root root 10503 Jan 23 10:27 mpicxx
> > > >     -rwxr-xr-x 1 root root 14218 Jan 23 10:27 mpif77
> > > >     -rwxr-xr-x 1 root root 14218 Jan 23 10:27 mpif90
> > > >     -rwxr-xr-x 1 root root 14278 Jun  2  2021 mpifort
> > > >     -rwxr-xr-x 1 root root 12928 Jun  2  2021 mpiname
> > > >     -rwxr-xr-x 1 root root 23256 Jun  2  2021 mpivars
> > > >     -rwxr-xr-x 1 root root  3430 Jun  2  2021 parkill
> > > > 
> > > > 
> > > > 
> > > > 
> > > > 
> > > > 
> > > > 
> > > >     rpm -Uvh mvapich2-x-basic-mofed5.0-gnu4.8.5-slurm-2.3-1.el7.x86_64.rpm
> > > >     Preparing...                          ################################# [100%]
> > > >     Updating / installing...
> > > >       1:mvapich2-x-basic-mofed5.0-gnu4.8.################################# [100%]
> > > > 
> > > >     ll /opt/mvapich2-x/gnu4.8.5/mofed5.0/basic/slurm/bin/
> > > >     total 132
> > > >     lrwxrwxrwx 1 root root     6 Jan 24 11:54 mpic++ -> mpicxx
> > > >     -rwxr-xr-x 1 root root 10756 Jan 24 11:54 mpicc
> > > >     -rwxr-xr-x 1 root root 12856 May 20  2021 mpichversion
> > > >     -rwxr-xr-x 1 root root 10324 Jan 24 11:54 mpicxx
> > > >     -rwxr-xr-x 1 root root 14036 Jan 24 11:54 mpif77
> > > >     -rwxr-xr-x 1 root root 14036 Jan 24 11:54 mpif90
> > > >     -rwxr-xr-x 1 root root 14096 May 20  2021 mpifort
> > > >     -rwxr-xr-x 1 root root 12928 May 20  2021 mpiname
> > > >     -rwxr-xr-x 1 root root 23256 May 20  2021 mpivars
> > > >     -rwxr-xr-x 1 root root  3430 May 20  2021 parkill
> > > > 
> > > > 
> > > > 
> > > >     I was able to access the fabric information on the OSU-INAM
> > > > interface. I wanted to leverage the OSU-INAM features to observe the
> > > > process level and MPI level information as well, but failed because of
> > > > the above mentioned issue. Any pointers on this will be helpful.
> > > > 
> > > > 
> > > > 
> > > >     Also, my current environment does not have Slurm or any other job
> > > > scheduler installed. Is Slurm necessary for OSU-INAM to export MPI-level
> > > > information?
> > > > 
> > > > 
> > > > 
> > > >     Thanks
> > > > 
> > > >     John
> > > > 
> > > > 
> > > > 
> > > > 
> > > >     On January 20, 2024 at 6:36 AM "Lieber, Matt" <lieber.31 at osu.edu>
> > > > wrote:
> > > > 
> > > >      > > > > > 
> > > > >      Hi John,
> > > > > 
> > > > >      Sorry for the delay in getting back to you.  There are multiple
> > > > > options that could fix this.  If you do not plan to use the SHARP
> > > > > functionality, adding the flag --nodeps should fix the issue you are
> > > > > seeing. MVAPICH2-X basic should also work with INAM.  If you do
> > > > > wish to use SHARP, we will have to send a new RPM.  Please let us
> > > > > know how these options work for you.
> > > > > 
> > > > > 
> > > > > 
> > > > >      Thanks,
> > > > > 
> > > > >      Matt
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > >     ------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
----
> > > > > 
> > > > >      From: Mvapich-discuss <mvapich-discuss-bounces at lists.osu.edu> on
> > > > > behalf of V John via Mvapich-discuss <mvapich-discuss at lists.osu.edu>
> > > > >      Sent: Tuesday, January 16, 2024 7:41 AM
> > > > >      To: mvapich-discuss at lists.osu.edu <mvapich-discuss at lists.osu.edu>
> > > > >      Subject: [Mvapich-discuss] Issue while Installation of
> > > > > Mvapich2-x-advanced for OSU-INAM
> > > > > 
> > > > > 
> > > > > 
> > > > >      Hi everyone,
> > > > > 
> > > > > 
> > > > > 
> > > > >      I'm trying to use OSU-INAM on our cluster, which has a Mellanox
> > > > > ConnectX-6 InfiniBand interconnect and machines running CentOS 7.6. I
> > > > > downloaded the osu-inam package (osu-inam-mysql-1.0-1.el7.x86_64.rpm)
> > > > > matching the environment, which also requires MOFED 5+, so I installed
> > > > > the MLNX_OFED_LINUX-5.0-1.0.0.0 package on the systems.
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > >      The OSU-INAM user guide states that MVAPICH2-X-advanced is
> > > > > required to collect MPI usage information, so I tried to install it.
> > > > > The only binary on the MVAPICH website matching the environment is
> > > > > mvapich2-x-advanced-mofed5.0-gnu4.8.5-slurm-2.3-1.el7.x86_64.rpm.
> > > > > However, the installation fails because of a dependency on
> > > > > libsharp_coll.so.4.
> > > > > 
> > > > > 
> > > > > 
> > > > >      rpm -Uvh mvapich2-x-advanced-mofed5.0-gnu4.8.5-slurm-2.3-1.el7.x86_64.rpm
> > > > >      error: Failed dependencies:
> > > > >      libsharp_coll.so.4()(64bit) is needed by mvapich2-x-advanced-mofed5.0-gnu4.8.5-slurm-2.3-1.el7.x86_64
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > >      The /opt/mellanox/sharp/lib directory created by the
> > > > > MLNX_OFED_LINUX-5.0-1.0.0.0 installation does not contain the required
> > > > > version, libsharp_coll.so.4; it provides only libsharp_coll.so.5:
> > > > > 
> > > > > 
> > > > > 
> > > > >      ls /opt/mellanox/sharp/lib | grep libsharp_coll.so
> > > > >      libsharp_coll.so
> > > > >      libsharp_coll.so.5
> > > > >      libsharp_coll.so.5.0.1
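
The mismatch reported above can be confirmed before attempting the install. The following is only a diagnostic sketch, not a fix: `has_soname` is a hypothetical helper written for this illustration (it is not part of MVAPICH2-X, OSU INAM, or MOFED), and the directory and library names are the ones quoted in the report. It assumes a POSIX shell.

```shell
#!/bin/sh
# Sketch: report whether a given shared-library soname exists in a directory.
# has_soname is a hypothetical helper, not a tool shipped with any package.
has_soname() {
    dir=$1
    soname=$2
    if [ -e "$dir/$soname" ]; then
        echo "present"
    else
        echo "missing"
    fi
}

# Directory and sonames are taken from the report; override SHARP_LIB_DIR
# if SHARP is installed elsewhere on your system (an assumption).
SHARP_LIB_DIR=${SHARP_LIB_DIR:-/opt/mellanox/sharp/lib}
for so in libsharp_coll.so.4 libsharp_coll.so.5; do
    echo "$so: $(has_soname "$SHARP_LIB_DIR" "$so")"
done

# To see what the package itself requires without installing it, the
# standard rpm query options can be used:
#   rpm -qp --requires mvapich2-x-advanced-mofed5.0-gnu4.8.5-slurm-2.3-1.el7.x86_64.rpm | grep sharp
```

If the RPM asks for `libsharp_coll.so.4` and only `libsharp_coll.so.5` is reported as present, the package was built against a different SHARP release than the one this MOFED installation provides.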
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > >      Is there a fix for this issue?
> > > > > 
> > > > > 
> > > > > 
> > > > >      Thanks
> > > > > 
> > > > >      John
> > > > > 
> > > > >      HPC Technologies Group
> > > > > 
> > > > >      C-DAC Pune
> > > > > 
> > > > > 
> > > > > 
> > > > >     ------------------------------------------------------------------------------------------------------------
> > > > >      [ C-DAC is on Social-Media too. Kindly follow us at:
> > > > >      Facebook: https://www.facebook.com/CDACINDIA & Twitter: @cdacindia ]
> > > > > 
> > > > >      This e-mail is for the sole use of the intended recipient(s) and may
> > > > >      contain confidential and privileged information. If you are not the
> > > > >      intended recipient, please contact the sender by reply e-mail and destroy
> > > > >      all copies and the original message. Any unauthorized review, use,
> > > > >      disclosure, dissemination, forwarding, printing or copying of this email
> > > > >      is strictly prohibited and appropriate legal action will be taken.
> > > > > 
> > > > >     ------------------------------------------------------------------------------------------------------------
> > > > > 
