[mvapich-discuss] MVAPICH 2-GDR Issue with MPI_Allreduce

Panda, Dhabaleswar panda at cse.ohio-state.edu
Mon Jan 13 22:35:50 EST 2020


Hi Andreas,

Thanks for the update. We are taking a look at it and will keep you updated.

Thanks,

DK

________________________________________
From: mvapich-discuss-bounces at cse.ohio-state.edu <mvapich-discuss-bounces at mailman.cse.ohio-state.edu> on behalf of Herten, Andreas <a.herten at fz-juelich.de>
Sent: Monday, January 13, 2020 10:49 AM
To: mvapich-discuss at cse.ohio-state.edu
Cc: Breuer, Thomas; Markus Schmitt; Hater, Thorsten
Subject: Re: [mvapich-discuss] MVAPICH 2-GDR Issue with MPI_Allreduce

Dear all,

I’ve tested the minimal reproducer with MVAPICH-GDR 2.3.3 on our JUWELS machine. In general, it looks better, but it is still not functioning properly.
I updated my write-up from last time with a section on MVAPICH-GDR 2.3.3 at the bottom:
https://gist.github.com/AndiH/b929b50b4c8d25137e0bfee25db63791#experiments-with-mvapich-233-gdr

Best,

-Andreas
—
NVIDIA Application Lab
Jülich Supercomputing Centre
Forschungszentrum Jülich, Germany
+49 2461 61 1825

On 08.01.2020 at 11:33, Herten, Andreas <a.herten at fz-juelich.de> wrote:

Dear MV2 Support,

We see an issue when calling MPI_Allreduce on GPU memory buffers. The reduction produces wrong results, or the program even segfaults.
I wrote up a more detailed description including a minimal reproducing example and some of our experiments here:
https://gist.github.com/AndiH/b929b50b4c8d25137e0bfee25db63791

Is this a bug?

New year’s greetings,

-Andreas
—
NVIDIA Application Lab
Jülich Supercomputing Centre
Forschungszentrum Jülich, Germany
+49 2461 61 1825




