[mvapich-discuss] MVAPICH2 with SLURM PMI

Adam Moody moody20 at llnl.gov
Thu Mar 6 17:45:59 EST 2008


Hi John,
Thanks to you and everyone for their suggestions (Pavan and Joshua 
Bernstein, also).  It'd be worth updating the MVAPICH user guide once we 
figure this out.

Option #1 ==============================
I ended up having to add "-lslurm" to the mix to get this to work, i.e.:
    export LDFLAGS="-L/usr/lib64 -lpmi -lslurm"
Just using "-L/usr/lib64 -lpmi" by itself died during configure when it 
tried to build a simple C file, because a number of symbols were 
undefined, such as:
    slurm_free_kvs_comm_set
    slurm_get_kvs_comm_set
I thought at one point last summer that "-lpmi" alone was sufficient, 
but maybe the symbols moved in a later version of SLURM (we are using 
1.2.24).  Anyway, this now works.
======================================
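
To tell in advance whether "-lslurm" is needed alongside "-lpmi", one 
can list the slurm_* symbols that libpmi leaves undefined.  The 
following is only a minimal sketch: the function name 
undefined_slurm_syms is made up for illustration, and it assumes GNU 
nm output, where undefined symbols are printed with a "U" type column.

```shell
#!/bin/sh
# Sketch: list the slurm_* symbols that a library leaves undefined.
# Reads `nm -D --undefined-only LIB` output on stdin; lines look like
#                  U slurm_get_kvs_comm_set
# (undefined_slurm_syms is a hypothetical helper name.)
undefined_slurm_syms() {
    awk '$1 == "U" && $2 ~ /^slurm_/ {
             sub(/@.*/, "", $2)   # drop any symbol-version suffix
             print $2
         }' | sort -u
}
```

Usage would be something like
"nm -D --undefined-only /usr/lib64/libpmi.so.0 | undefined_slurm_syms". 
If that prints names such as slurm_get_kvs_comm_set, something on the 
link line has to supply them, which is what "-lslurm" does here.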

Option #2 ==============================
Leaving LDFLAGS alone and adding "-L/usr/lib64 -lpmi -lslurm" to my 
mpicc line also works.
======================================

However, with these solutions, I'm a bit fuzzy on how exactly the 
linking is done.  If you build with the standard make.mvapich2.ofa, 
which uses "--with-pm=mpd", this builds the "simple" PMI and adds its 
object file to the libmpich.so.  And then when you also link against the 
SLURM PMI via LDFLAGS (or on the mpicc line) you end up with *two* 
definitions for the PMI functions.  For example, you can see that my 
binary has access to two definitions for PMI_Get_size():

    >>: ldd ./mpiBench
            libpmi.so.0 => /usr/lib64/libpmi.so.0 (0x00002aaaaacc6000)
            libslurm.so.11 => /usr/lib64/libslurm.so.11 (0x00002aaaaaecb000)
            libmpich.so => /usr/global/tools/mpi/chaos_3_x86_64_ib/mvapich2-1.0-osu_r2085/gnu/lib/libmpich.so (0x00002aaaab13c000)

    >>: nm /usr/lib64/libpmi.so.0 | grep PMI_Get_size
    00000000000034a0 T PMI_Get_size

    >>: nm /usr/global/tools/mpi/chaos_3_x86_64_ib/mvapich2-1.0-osu_r2085/gnu/lib/libmpich.so | grep PMI_Get_size
    00000000000fbae0 T PMI_Get_size
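
The two nm checks above can be folded into one pass over every library 
the binary links against.  This is a rough sketch, not a polished tool: 
the helper name definers is invented here, and it assumes GNU nm, whose 
-A option prefixes each output line with "file:".

```shell
#!/bin/sh
# Sketch: report which libraries define a given symbol in their text section.
# Reads `nm -A -D --defined-only LIB...` output on stdin; lines look like
#   /usr/lib64/libpmi.so.0:00000000000034a0 T PMI_Get_size
# (definers is a hypothetical helper name.)
definers() {
    awk -v sym="$1" '
        $2 == "T" {
            name = $3
            sub(/@.*/, "", name)                 # drop any symbol-version suffix
            if (name == sym) {
                file = $1
                sub(/:[0-9a-fA-F]+$/, "", file)  # strip the ":address" tail
                print file
            }
        }' | sort -u
}
```

For example, 
"nm -A -D --defined-only /usr/lib64/libpmi.so.0 <path to libmpich.so> | definers PMI_Get_size" 
prints one line per library that defines the symbol.  More than one 
line of output means the duplicate-definition situation shown here.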

This mixing makes me a bit nervous, as I don't know how it determines 
which one to call.  It apparently does call the SLURM version, since 
srun can now launch the job.  Hopefully, it doesn't manage to mix calls 
between the two implementations.  As I was waiting for feedback, I 
started investigating how to disable the "simple" PMI from being 
included in the libmpich library and came up with the following hack:

Option #3 ==============================
I copied the "src/pm/mpd" directory into a "src/pm/slurm" and then 
modified the "mpich2prereq" file to look like the following:
    >>:  cat mpich2prereq
    #! /bin/sh
    # We'll be using SLURM to launch processes, no need for a process
    # manager or PMI implementation in MPI
    MPID_NO_PM=yes
    MPID_NO_PMI=yes
    # (Selecting multiple PM's may require incompatible PMI implementations,
    # e.g., MPD and SMPD.)
    if [ -z "$PM_REQUIRES_PMI" ] ; then
         echo "NO PMI REQUIRED FOR SLURM"
    fi
Then, I changed the "--with-pm=mpd" to "--with-pm=slurm" in the 
make.mvapich2.ofa file.  Finally, I added "-L/usr/lib64 -lpmi -lslurm" 
to LDFLAGS to pick up the SLURM PMI.  With this, I was able to build the 
MPI without linking in the simple PMI.  The binary has access to one and 
only one set of PMI definitions in this case.
======================================
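
Whichever option is used, it should be possible to confirm at run time 
which library a PMI call actually resolves to.  On glibc systems the 
loader searches the executable first and then its needed libraries in 
load order, and setting LD_DEBUG=bindings makes it log each symbol 
binding to stderr.  The sketch below filters that log; bound_targets is 
an invented name, and the "binding file A [0] to B [0]: normal symbol 
`NAME'" line format is assumed from glibc's output, so other loaders 
will differ.

```shell
#!/bin/sh
# Sketch: from LD_DEBUG=bindings output on stdin, print "caller -> provider"
# for every binding of the named symbol.  Assumed line format (glibc):
#   1234: binding file ./app [0] to /usr/lib64/libpmi.so.0 [0]: normal symbol `PMI_Get_size'
# (bound_targets is a hypothetical helper name; paths with spaces are not handled.)
bound_targets() {
    awk -v sym="$1" '
        index($0, "binding file") && index($0, "symbol `" sym "\047") {
            print $4 " -> " $7   # $4 = object making the call, $7 = object providing it
        }' | sort -u
}
```

Usage would be along the lines of 
"LD_DEBUG=bindings LD_BIND_NOW=1 ./mpiBench 2>&1 | bound_targets PMI_Get_size", 
where LD_BIND_NOW=1 forces all bindings to happen at startup rather 
than on first call.  If every line points at libpmi.so.0, the SLURM 
PMI is the only one in use.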

From Pavan's email, it sounds like maybe the latest MPICH2 does 
something along these lines?
-Adam


Jonathan Perkins wrote:

> Adam Moody wrote:
>
>> Hi,
>> I'm trying to build MVAPICH-2 to use SLURM's PMI, which can be linked 
>> in via something like "-L/usr/lib64 -lpmi".  Anyone know what I have 
>> to do during the configure / make to get this to happen?
>>
>> I'm using the make.mvapich.ofa script, but it keeps selecting its 
>> mpd / simple pmi implementation.  The various things I've tried to 
>> set either blow up during the configure or build with the simple pmi.
>> Thanks,
>> -Adam
>> _______________________________________________
>> mvapich-discuss mailing list
>> mvapich-discuss at cse.ohio-state.edu
>> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>
>
> Adam:
> Hi, it sounds like you want mpicc to automatically include the slurm 
> pmi library linking options as you specified.
>
> If this is **not** the case then you can simply build using 
> make.mvapich2.ofa in the normal way and then build your mpi 
> application using something like
>
> mpicc -L/usr/lib64 -lpmi mpiapp.c -o mpiapp
>
> If you would like mpicc to do this step for you, you should be able to 
> export the MPI_LDFLAGS in the make.mvapich2.ofa script before 
> configure is invoked.  Please let us know if this works for you.
>

