[mvapich-discuss] MVAPICH2 with SLURM PMI
Adam Moody
moody20 at llnl.gov
Thu Mar 6 17:45:59 EST 2008
Hi John,
Thanks to you and everyone for their suggestions (Pavan and Joshua
Bernstein, also). It'd be worth updating the MVAPICH user guide once we
figure this out.
Option #1 ==============================
I ended up having to add "-lslurm" to the mix to get this to work, i.e.:
export LDFLAGS="-L/usr/lib64 -lpmi -lslurm"
Using "-L/usr/lib64 -lpmi" by itself died during configure when it
tried to build a simple C file, because a number of symbols were
undefined, such as:
slurm_free_kvs_comm_set
slurm_get_kvs_comm_set
I thought at one point last summer, just using "-lpmi" was sufficient,
but maybe the symbols moved in a later version of SLURM. We are using
1.2.24. Anyway, this now works.
======================================
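One way to see why "-lpmi" alone fails the configure link test is to list the slurm_* symbols that libpmi leaves undefined. A rough sketch, assuming binutils nm is installed and that the library sits at the path shown in the ldd output further down; the helper name undefined_slurm_syms is made up for illustration:

```shell
# Hypothetical helper: read `nm` output on stdin and print the names of
# undefined symbols (type "U") whose names start with "slurm_".
undefined_slurm_syms() {
    awk '$1 == "U" && $2 ~ /^slurm_/ { print $2 }'
}

# Intended use on a system with SLURM installed (path is an assumption
# taken from the ldd output in this message):
#   nm -D /usr/lib64/libpmi.so.0 | undefined_slurm_syms
```

If that prints names such as slurm_get_kvs_comm_set, something else in the link (here libslurm) has to define them. Running "readelf -d" on the library and looking at its NEEDED entries would show whether libpmi records libslurm as a dependency; whether that changed between SLURM versions is a guess, not something confirmed in this thread.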
Option #2 ==============================
Leaving LDFLAGS alone and adding "-L/usr/lib64 -lpmi -lslurm" to my
mpicc line also works.
======================================
However, with these solutions, I'm a bit fuzzy on how exactly the
linking is done. If you build with the standard make.mvapich2.ofa,
which uses "--with-pm=mpd", this builds the "simple" PMI and adds its
object file to the libmpich.so. And then when you also link against the
SLURM PMI via LDFLAGS (or on the mpicc line) you end up with *two*
definitions for the PMI functions. For example, you can see that my
binary has access to two definitions for PMI_Get_size():
>>: ldd ./mpiBench
libpmi.so.0 => /usr/lib64/libpmi.so.0 (0x00002aaaaacc6000)
libslurm.so.11 => /usr/lib64/libslurm.so.11 (0x00002aaaaaecb000)
libmpich.so => /usr/global/tools/mpi/chaos_3_x86_64_ib/mvapich2-1.0-osu_r2085/gnu/lib/libmpich.so (0x00002aaaab13c000)
>>: nm /usr/lib64/libpmi.so.0 | grep PMI_Get_size
00000000000034a0 T PMI_Get_size
>>: nm /usr/global/tools/mpi/chaos_3_x86_64_ib/mvapich2-1.0-osu_r2085/gnu/lib/libmpich.so | grep PMI_Get_size
00000000000fbae0 T PMI_Get_size
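With two definitions visible like this, which one a call reaches comes down to the dynamic linker's lookup order. On Linux/ELF systems the first library in the link order that defines the symbol normally wins. Below is a minimal sketch of that behavior, assuming a Linux box with cc and a linker that accepts -Wl,-rpath; the libraries libfirst/libsecond and the function which_pmi are made-up stand-ins, not anything from MVAPICH2 or SLURM:

```shell
# Sketch: build two toy shared libraries that export the same function
# and link a program against them in both orders. All names here
# (libfirst, libsecond, which_pmi) are hypothetical.
demo_link_order() {
    dir=$(mktemp -d) || return 1
    (
        cd "$dir" || exit 1
        # Two libraries, each with its own definition of which_pmi().
        printf '%s\n' 'const char *which_pmi(void) { return "first"; }'  > first.c
        printf '%s\n' 'const char *which_pmi(void) { return "second"; }' > second.c
        # A program that calls the duplicated symbol.
        printf '%s\n' '#include <stdio.h>' \
            'const char *which_pmi(void);' \
            'int main(void) { puts(which_pmi()); return 0; }' > main.c
        cc -shared -fPIC -o libfirst.so first.c   || exit 1
        cc -shared -fPIC -o libsecond.so second.c || exit 1
        # Same program, opposite library order on the link line.
        cc -o a_then_b main.c -L. -lfirst -lsecond -Wl,-rpath,"$dir" || exit 1
        cc -o b_then_a main.c -L. -lsecond -lfirst -Wl,-rpath,"$dir" || exit 1
        ./a_then_b
        ./b_then_a
    )
    status=$?
    rm -rf "$dir"
    return "$status"
}

# Run with: demo_link_order
```

On such a system this prints "first" and then "second". Calls made from inside a shared library usually follow the same lookup, unless the library was built with something like -Bsymbolic or hidden visibility, so the sketch does not by itself prove that libmpich.so never reaches its own copy.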
This mixing makes me a bit nervous, as I don't know how it determines
which one to call. It apparently does call the SLURM version, since
srun can now launch the job. Hopefully, it doesn't manage to mix calls
between the two implementations. As I was waiting for feedback, I
started investigating how to disable the "simple" PMI from being
included in the libmpich library and came up with the following hack:
Option #3 ==============================
I copied the "src/pm/mpd" directory into a "src/pm/slurm" and then
modified the "mpich2prereq" file to look like the following:
>>: cat mpich2prereq
#! /bin/sh
# We'll be using SLURM to launch processes, no need for a process
# manager or PMI implementation in MPI
MPID_NO_PM=yes
MPID_NO_PMI=yes
# (Selecting multiple PM's may require incompatible PMI implementations
# (e.g., MPD and SMPD).
if [ -z "$PM_REQUIRES_PMI" ] ; then
echo "NO PMI REQUIRED FOR SLURM"
fi
Then, I changed the "--with-pm=mpd" to "--with-pm=slurm" in the
make.mvapich2.ofa file. Finally, I added "-L/usr/lib64 -lpmi -lslurm"
to LDFLAGS to pick up the SLURM PMI. With this, I was able to build the
MPI without linking in the simple PMI. The binary has access to one and
only one set of PMI definitions in this case.
======================================
From Pavan's email, it sounds like maybe the latest mpich2 does
something along these lines?
-Adam
Jonathan Perkins wrote:
> Adam Moody wrote:
>
>> Hi,
>> I'm trying to build MVAPICH-2 to use SLURM's PMI, which can be linked
>> in via something like "-L/usr/lib64 -lpmi". Anyone know what I have
>> to do during the configure / make to get this to happen?
>>
>> I'm using the make.mvapich.ofa script, but it keeps selecting its
>> mpd / simple pmi implementation. The various things I've tried to
>> set either blow up during the configure or build with the simple pmi.
>> Thanks,
>> -Adam
>> _______________________________________________
>> mvapich-discuss mailing list
>> mvapich-discuss at cse.ohio-state.edu
>> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>
>
> Adam:
> Hi, it sounds like you want mpicc to automatically include the slurm
> pmi library linking options as you specified.
>
> If this is **not** the case then you can simply build using
> make.mvapich2.ofa in the normal way and then build your mpi
> application using something like
>
> mpicc -L/usr/lib64 -lpmi mpiapp.c -o mpiapp
>
> If you would like mpicc to do this step for you, you should be able to
> export the MPI_LDFLAGS in the make.mvapich2.ofa script before
> configure is invoked. Please let us know if this works for you.
>