[mvapich-discuss] MVAPICH-GDR 2.3.3: Bug using Multiple Nodes

Herten, Andreas a.herten at fz-juelich.de
Thu Jan 23 08:59:12 EST 2020


Dear all,

As Hari already mentioned, the MPI_Allreduce() bug reported earlier was fixed in the latest build of the RPM. Thanks again for the swift response!

Unfortunately, continuing with our test case at hand, we encountered another, quite serious bug: we cannot launch an MPI program on more than one node. `srun --nodes 1 ./test` works, but `srun --nodes 2 ./test` does not.

As before, please find a description of the problem in this Gist, including a reproducer:
	https://gist.github.com/AndiH/cf1c0ec5110170526ad345c0ce82f74b#mvapich2-gdr-multi-node-mpi-bug
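For reference, and as a minimal sketch only (the actual reproducer with all details is in the Gist above), a test program of roughly this shape is enough to exercise the multi-node launch path:

	#include <mpi.h>
	#include <stdio.h>

	/* Minimal multi-node smoke test: initialize MPI and report each
	   rank's host. A sketch only, not the exact reproducer from the Gist. */
	int main(int argc, char **argv)
	{
	    int rank, size, len;
	    char name[MPI_MAX_PROCESSOR_NAME];

	    MPI_Init(&argc, &argv);
	    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
	    MPI_Comm_size(MPI_COMM_WORLD, &size);
	    MPI_Get_processor_name(name, &len);
	    printf("Rank %d of %d on %s\n", rank, size, name);
	    MPI_Finalize();
	    return 0;
	}

Built with the MVAPICH2-GDR mpicc, such a program runs under `srun --nodes 1` but fails under `srun --nodes 2`.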

Please also make sure to have a look at the note at the end of the README regarding our OFED stack update next week.

Best,

-Andreas
—
NVIDIA Application Lab
Jülich Supercomputing Centre
Forschungszentrum Jülich, Germany
+49 2461 61 1825
