[Mvapich-discuss] mvapich 3.0b srun start failure

christof.koehler at bccms.uni-bremen.de
Sat May 20 09:50:04 EDT 2023


Hello Nat,

thank you very much for the prompt patch and for clearing up
some of my confusion regarding PMIx. We had been using Maui/Torque
before, so I had no direct contact with PMIx until now.

I can confirm that an MVAPICH 3.0b build configured with
FFLAGS=-fallow-argument-mismatch ./configure --with-pm=slurm
--with-pmi=pmi1 --with-device=ch4:ofi
now runs successfully when launched with srun --mpi=pmi2. I also checked
the process binding, which appears to be fine.
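
For reference, a simple way to verify the binding is Slurm's own reporting
(./mpi_hello is just a placeholder binary name here; --cpu-bind=verbose is a
generic Slurm option, not anything MVAPICH-specific):

    # have Slurm print the CPU mask each task is bound to
    srun --mpi=pmi2 --cpu-bind=verbose -n 10 ./mpi_hello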

Best Regards

Christof

> 
> On Fri, May 19, 2023 at 12:04:27PM +0000, Shineman, Nat wrote:
> > Hi Christof,
> > 
> > To provide a little clarity here, that should not affect us. There are 3 standards for PMI: PMI1, PMI2, and PMIx. PMI1 is well standardized and well established. PMI2 was pushed for largely by Slurm but was never fully implemented as a standard. Therefore, each flavor of PMI2 works a little differently. For this reason, we no longer recommend it since it often breaks and is not as stable as the PMI1 interface in most instances. PMIx is a newer standardized PMI interface that is written and maintained by the PMIx Forum in a similar fashion to the MPI forum. It has seen some adoption but requires an external library and is the one undergoing active development. This is the version 1.X they are referring to.
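> >
> > As a quick check, srun itself will report which PMI plugin types a given Slurm installation provides; the output below is only illustrative and depends on how Slurm was built:
> >
> >     $ srun --mpi=list
> >     MPI plugin types are...
> >             pmi2
> >             pmix
> >             none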
> > 
> > The issue here is that we made some adjustments to the fallback path to support launchers other than hydra and slurm, and there appears to be a bug in that path. I have attached a patch for it; in my testing it resolves the issue on the fallback path. Please let me know if you still have problems.
> > 
> > Thanks,
> > Nat
> > ________________________________
> > From: christof.koehler at bccms.uni-bremen.de <christof.koehler at bccms.uni-bremen.de>
> > Sent: Friday, May 19, 2023 02:31
> > To: Shineman, Nat <shineman.5 at osu.edu>
> > Cc: mvapich-discuss at lists.osu.edu <mvapich-discuss at lists.osu.edu>
> > Subject: Re: [Mvapich-discuss] mvapich 3.0b srun start failure
> > 
> > Hello Nat,
> > 
> > there might have been a mix-up on our side.
> > 
> > In the meantime we upgraded to Slurm 23.02.2 for unrelated reasons, without
> > thinking too much about it; the cluster is not yet in regular use.
> > However, the release notes contain:
> > "NOTE: PMIx v1.x is no longer supported."
> > 
> > I do not understand the version naming of PMI(x), so this might or might
> > not be related to the problem I observed. The slurm-libpmi RPM we built
> > and installed, however, still contains
> > /usr/lib64/libpmi.so
> > /usr/lib64/libpmi.so.0
> > /usr/lib64/libpmi.so.0.0.0
> > which I believe is the PMI1 library that MVAPICH needs.
> > 
> > 
> > Best Regards
> > 
> > Christof
> > 
> > On Thu, May 18, 2023 at 01:46:35PM +0000, Shineman, Nat wrote:
> > > Hi Christof,
> > >
> > > Thanks for reporting this. It looks like what is happening is that srun is unable to get your process mapping from the Slurm daemon and is falling back to an alternate method. We've overridden that fallback to support other launchers with PMI1 support, and it looks like we did not provide the correct safeties to ensure it still worked with slurm. I should be able to provide you with a patch shortly. In the meantime, yes, you can try building with hydra and/or mpirun_rsh by removing the slurm arguments; both of those launchers have some degree of integration with slurm.
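> > >
> > > For example, a rough (untested) sketch of that alternative; the hello-world binary and hostfile names are just placeholders:
> > >
> > >     # build with the default hydra process manager instead of the Slurm PMI hooks
> > >     FFLAGS=-fallow-argument-mismatch ./configure --with-device=ch4:ofi
> > >     make -j && make install
> > >
> > >     # inside a Slurm allocation, hydra's mpiexec can detect the allocated nodes
> > >     salloc -N 1 -n 10 mpiexec -n 10 ./mpi_hello
> > >
> > >     # or launch with mpirun_rsh and an explicit hostfile
> > >     mpirun_rsh -np 10 -hostfile ./hosts ./mpi_hello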
> > >
> > > Thanks,
> > > Nat
> > > ________________________________
> > > From: Mvapich-discuss <mvapich-discuss-bounces at lists.osu.edu> on behalf of christof.koehler--- via Mvapich-discuss <mvapich-discuss at lists.osu.edu>
> > > Sent: Thursday, May 18, 2023 07:47
> > > To: mvapich-discuss at lists.osu.edu <mvapich-discuss at lists.osu.edu>
> > > Subject: [Mvapich-discuss] mvapich 3.0b srun start failure
> > >
> > > Hello everybody,
> > >
> > > I have now started to test the MVAPICH 3.0b build. It was compiled on Rocky
> > > Linux 9.1 with Slurm 23.02.2 and gcc 11.3.1. See the end of the email for
> > > the mpichversion output.
> > >
> > > When I try to start a simple MPI hello world with srun --mpi=pmi2,
> > > I see error messages concerning PMI and a segfault; see the end of the
> > > email. The same MPI hello world source code with the same
> > > srun --mpi=pmi2 invocation (but obviously different binaries) works fine
> > > with mvapich2 2.3.7, mpich 4.1.1 and openmpi 4.1.5.
> > >
> > > Should I try another launcher, e.g. hydra, by not setting --with-pm and
> > > --with-pmi? Would the hydra launcher be able to communicate with slurm,
> > > though?
> > >
> > > Best Regards
> > >
> > > Christof
> > >
> > > $ mpichversion
> > > MVAPICH Version:        3.0b
> > > MVAPICH Release date:   04/10/2023
> > > MVAPICH Device:         ch4:ofi
> > > MVAPICH configure:      --with-pm=slurm --with-pmi=pmi1
> > > --with-device=ch4:ofi --prefix=/cluster/mpi/mvapich2/3.0a/gcc11.3.1
> > > MVAPICH CC:     gcc    -DNDEBUG -DNVALGRIND -O2
> > > MVAPICH CXX:    g++   -DNDEBUG -DNVALGRIND -O2
> > > MVAPICH F77:    gfortran -fallow-argument-mismatch  -O2
> > > MVAPICH FC:     gfortran   -O2
> > > MVAPICH Custom Information:     @MVAPICH_CUSTOM_STRING@
> > >
> > > Error Message:
> > >
> > > INTERNAL ERROR: invalid error code 6163 (Ring ids do not match) in
> > > MPIR_NODEMAP_build_nodemap_fallback:355
> > > Abort(2141455) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Init:
> > > Other MPI error, error stack:
> > > MPIR_Init_thread(175)...................:
> > > MPID_Init(509)..........................:
> > > MPIR_pmi_init(119)......................:
> > > build_nodemap(882)......................:
> > > MPIR_NODEMAP_build_nodemap_fallback(355):
> > > In: PMI_Abort(2141455, Fatal error in PMPI_Init: Other MPI error, error
> > > stack:
> > > MPIR_Init_thread(175)...................:
> > > MPID_Init(509)..........................:
> > > MPIR_pmi_init(119)......................:
> > > build_nodemap(882)......................:
> > > MPIR_NODEMAP_build_nodemap_fallback(355): )
> > > INTERNAL ERROR: invalid error code 6106 (Ring ids do not match) in
> > > MPIR_NODEMAP_build_nodemap_fallback:355
> > > Abort(2141455) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Init:
> > > Other MPI error, error stack:
> > > MPIR_Init_thread(175)...................:
> > > MPID_Init(509)..........................:
> > > MPIR_pmi_init(119)......................:
> > > build_nodemap(882)......................:
> > > MPIR_NODEMAP_build_nodemap_fallback(355):
> > > In: PMI_Abort(2141455, Fatal error in PMPI_Init: Other MPI error, error
> > > stack:
> > > MPIR_Init_thread(175)...................:
> > > MPID_Init(509)..........................:
> > > MPIR_pmi_init(119)......................:
> > > build_nodemap(882)......................:
> > > MPIR_NODEMAP_build_nodemap_fallback(355): )
> > > srun: error: gpu001: tasks 0-9: Segmentation fault (core dumped)
> > >
> > >
> > >
> > >
> > > --
> > > Dr. rer. nat. Christof Köhler       email: c.koehler at uni-bremen.de
> > > Universitaet Bremen/FB1/BCCMS       phone:  +49-(0)421-218-62334
> > > Am Fallturm 1/ TAB/ Raum 3.06       fax: +49-(0)421-218-62770
> > > 28359 Bremen
> > > _______________________________________________
> > > Mvapich-discuss mailing list
> > > Mvapich-discuss at lists.osu.edu
> > > https://lists.osu.edu/mailman/listinfo/mvapich-discuss
> > 
> > --
> > Dr. rer. nat. Christof Köhler       email: c.koehler at uni-bremen.de
> > Universitaet Bremen/FB1/BCCMS       phone:  +49-(0)421-218-62334
> > Am Fallturm 1/ TAB/ Raum 3.06       fax: +49-(0)421-218-62770
> > 28359 Bremen
> 
> > commit 5eca537b0db769ecd235adea7e6351928476b078
> > Author: Nat Shineman <shineman.5 at osu.edu>
> > Date:   Fri May 19 06:54:57 2023 -0500
> > 
> >     Fix slurm startup with fallback method
> >     
> >     Should resolve Christof's issues with slurm startup using the fallback
> >     method. Providing this to him as a patch.
> > 
> > diff --git a/src/util/mpir_nodemap.h b/src/util/mpir_nodemap.h
> > index eaf52612b4..a59f8c93d9 100644
> > --- a/src/util/mpir_nodemap.h
> > +++ b/src/util/mpir_nodemap.h
> > @@ -352,8 +352,11 @@ static inline int MPIR_NODEMAP_build_nodemap_fallback(int sz, int myrank, int *o
> >  #ifdef _OSU_MVAPICH_
> >      mpi_errno = MPIR_NODEMAP_MVP_build_nodemap(sz, out_nodemap, out_max_node_id,
> >                                                 myrank);
> > -    MPIR_ERR_CHECK(mpi_errno);
> > -    goto fn_exit;
> > +    if (mpi_errno != MPI_ERR_UNKNOWN) {
> > +        /* an unknown error means we couldn't do an mpirun_rsh setup */
> > +        MPIR_ERR_CHECK(mpi_errno);
> > +        goto fn_exit;
> > +    }
> >  #endif
> >  
> >      for (int i = 0; i < sz; ++i) {
> > diff --git a/src/util/mvp_nodemap.h b/src/util/mvp_nodemap.h
> > index 364d200b4b..39c29547ad 100644
> > --- a/src/util/mvp_nodemap.h
> > +++ b/src/util/mvp_nodemap.h
> > @@ -20,7 +20,7 @@ static inline int MPIR_NODEMAP_MVP_build_nodemap(int sz, int *out_nodemap,
> >                                                   int *out_max_node_id,
> >                                                   int myrank)
> >  {
> > -    int mpi_errno;
> > +    int mpi_errno = MPI_ERR_UNKNOWN;
> >      char *value, *mapping;
> >      int did_map;
> >  
> > @@ -42,6 +42,7 @@ static inline int MPIR_NODEMAP_MVP_build_nodemap(int sz, int *out_nodemap,
> >                  MPIR_ERR_POP(mpi_errno);
> >              }
> >          }
> > +        mpi_errno = MPI_SUCCESS;
> >      }
> >  
> >  fn_exit:
> 
> 

-- 
Dr. rer. nat. Christof Köhler       email: c.koehler at uni-bremen.de
Universitaet Bremen/FB1/BCCMS       phone:  +49-(0)421-218-62334
Am Fallturm 1/ TAB/ Raum 3.06       fax: +49-(0)421-218-62770
28359 Bremen  


