[Mvapich-discuss] MVAPICH2 2.3.7-1 potentially incorrect dependency on libibmad.so / building for PMIx and PMI2 both?

Ryan Novosielski novosirj at rutgers.edu
Tue Nov 21 23:31:45 EST 2023


Hi there,

A few questions:

1. I’m building MVAPICH2 2.3.7-1 on CentOS 7.9 with the inbox drivers, and I’m running into a problem: if libibmad-devel/infiniband-diags-devel is installed when I build, MVAPICH2 then won’t run properly on compute nodes that do not have this package (they instead have both libibmad.so.5.5.0 and libibmad.so.5, provided by libibmad/infiniband-diags). As far as I can tell, this is new behavior, and is probably not intentional. If I remove that package, I get a build that does not have that dependency, but it’s not entirely clear to me whether it’s using InfiniBand/RDMA at that point. In any case, what can be done about it?
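
For comparing the two builds, a minimal sketch of how the recorded dependency could be checked, assuming readelf from binutils is available. The helper names needed_libs and links_against are hypothetical, and the helpers read `readelf -d` output on stdin rather than assuming where the MPI library lives:

```shell
# Print the sonames an ELF file records as NEEDED.
# Expects `readelf -d <file>` output on stdin.
needed_libs() {
    sed -n 's/.*(NEEDED).*\[\(.*\)\].*/\1/p'
}

# Succeed if any NEEDED soname starts with the given name (e.g. libibmad).
# Also expects `readelf -d <file>` output on stdin.
links_against() {
    needed_libs | grep -q "^$1"
}
```

Something like `readelf -d <prefix>/lib/libmpi.so | links_against libibmad && echo "needs libibmad"` would then show whether a given build picked up the dependency; the library path under the prefix is an assumption and may differ.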

2. My build is the following:

../mvapich2-2.3.7-1/configure --with-hwloc=2 --with-pmi=pmi2 --with-pmi=pmix --with-pmix=/opt/ohpc/admin/pmix --with-pm=slurm --prefix=/opt/sw/packages/gcc-4_8/mvapich2/2.3.7-1

As far as I can tell, --with-pmi=pmi2 is getting ignored here, and if I try srun --mpi=pmi2, I get a segfault (--mpi=pmix/pmix_v4 works). Is there actually a way to build a copy that will work with either PMI2 or PMIx? It would be helpful for nudging users over the fence.
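
That --with-pmi=pmi2 is ignored would be consistent with how autoconf-generated configure scripts generally treat a repeated option: each occurrence assigns the same shell variable, so the last one wins. The sketch below only illustrates that behavior; it is not MVAPICH2’s configure code, and parse_with_pmi is a hypothetical name:

```shell
# Illustration only: mimic an autoconf-style option loop, where every
# --with-pmi=VALUE overwrites the same variable and the last value survives.
parse_with_pmi() {
    with_pmi=""
    for arg in "$@"; do
        case "$arg" in
            --with-pmi=*) with_pmi="${arg#--with-pmi=}" ;;
        esac
    done
    printf '%s\n' "$with_pmi"
}
```

Called with the two options in the order used above, `parse_with_pmi --with-pmi=pmi2 --with-pmi=pmix` keeps only the second value.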

3. Also, I have both hwloc 1.11.8 and 2.9.0 installed. It’s not clear to me whether one is better to use than the other. I imagine the newer is preferred, but at our site it’s loaded via a module, so it’s a little less convenient.

Thanks!

--
#BlackLivesMatter
____
|| \\UTGERS,     |---------------------------*O*---------------------------
||_// the State  |         Ryan Novosielski - novosirj at rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
||  \\    of NJ  | Office of Advanced Research Computing - MSB A555B, Newark
     `'
