[Mvapich-discuss] MVAPICH2 2.3.7-1 potentially incorrect dependency on libibmad.so / building for PMIx and PMI2 both?
Ryan Novosielski
novosirj at rutgers.edu
Tue Nov 21 23:31:45 EST 2023
Hi there,
A few questions:
1. I’m building MVAPICH2 2.3.7-1 on CentOS 7.9 with the inbox drivers, and I’m running into a problem: if libibmad-devel/infiniband-diags-devel is installed at build time, MVAPICH2 then won’t run properly on compute nodes that do not have this package (they instead have both libibmad.so.5.5.0 and libibmad.so.5, provided by libibmad/infiniband-diags). As far as I can tell, this is new behavior, and is probably not intentional. If I remove that package, I get a build that does not have that dependency, but it’s not entirely clear to me whether it’s using InfiniBand/RDMA at that point. In any case, what can be done about it?
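For reference, the dependency can be checked along these lines. This is only a minimal sketch: the list_needed helper is hypothetical, and the location of libmpi.so under the install prefix is an assumption about where the library ends up.

```shell
#!/bin/sh
# list_needed: read `readelf -d` output on stdin and print the sonames
# from its (NEEDED) entries, one per line.
list_needed() {
  sed -n 's/.*(NEEDED).*\[\(.*\)\].*/\1/p'
}

# Assumed location of the MPI library under the install prefix.
LIB=/opt/sw/packages/gcc-4_8/mvapich2/2.3.7-1/lib/libmpi.so

if [ -r "$LIB" ]; then
  # Show only the libibmad entry, if there is one.
  readelf -d "$LIB" | list_needed | grep ibmad \
    || echo "no libibmad entry in $LIB"
fi
```

Running ldd on the same file on a compute node should show any library that cannot be resolved there as "not found".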
2. My build is the following:
../mvapich2-2.3.7-1/configure --with-hwloc=2 --with-pmi=pmi2 --with-pmi=pmix --with-pmix=/opt/ohpc/admin/pmix --with-pm=slurm --prefix=/opt/sw/packages/gcc-4_8/mvapich2/2.3.7-1
As far as I can tell, --with-pmi=pmi2 is getting ignored here, and if I try srun --mpi=pmi2, I get a segfault (pmix/pmix_v4 works). Is there actually a way to build a single copy that will work with either PMI2 or PMIx? It would be helpful for nudging users over the fence.
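As far as I know, an autoconf-generated configure keeps only the last value when an option is repeated, which would explain why --with-pmi=pmi2 has no effect in the command above. The fallback I can think of is two separate builds, one per PMI flavor. A rough sketch, assuming a per-flavor prefix suffix (hypothetical, as is the configure_cmd helper) and otherwise the same options as above; it only prints the commands rather than running them:

```shell
#!/bin/sh
# Sketch only: one configure command (and install prefix) per PMI flavor.
SRC=../mvapich2-2.3.7-1
BASE=/opt/sw/packages/gcc-4_8/mvapich2   # prefix root from the command above

# configure_cmd: print the configure command for one flavor (pmi2 or pmix).
configure_cmd() {
  flavor=$1
  extra=""
  if [ "$flavor" = pmix ]; then
    # Only the PMIx build needs to know where PMIx lives.
    extra=" --with-pmix=/opt/ohpc/admin/pmix"
  fi
  # The "-$flavor" prefix suffix is a made-up naming convention.
  echo "$SRC/configure --with-hwloc=2 --with-pmi=$flavor$extra --with-pm=slurm --prefix=$BASE/2.3.7-1-$flavor"
}

for flavor in pmi2 pmix; do
  configure_cmd "$flavor"
done
```

Each command would be run from its own build directory, and users would then pick the matching install when launching with srun --mpi=pmi2 or srun --mpi=pmix.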
3. Also, I have both hwloc 1.11.8 and 2.9.0 installed. It’s not clear to me whether one is better to use than the other. I imagine the newer is preferred, but at our site it’s loaded via a module, so it’s a little less convenient.
Thanks!
--
#BlackLivesMatter
____
|| \\UTGERS, |---------------------------*O*---------------------------
||_// the State | Ryan Novosielski - novosirj at rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
|| \\ of NJ | Office of Advanced Research Computing - MSB A555B, Newark
`'