[mvapich-discuss] MPI_Iallreduce() Segfault Over 512 Processes

Derek Gaston friedmud at gmail.com
Thu Sep 29 03:14:06 EDT 2016


Hello all... I'm running into a segfault with MPI_Iallreduce().  It
segfaults when using more than 512 processes (yes, the cutoff is exactly
512: it works at 512 processes and segfaults at 513!).
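For anyone who wants to bisect the threshold inside a single job, a sketch
along these lines should do it (this is just an idea, not the exact code I
ran; it shrinks the communicator with MPI_Comm_split, and the 508..516
window is arbitrary -- launch with at least 516 ranks):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv)
{
  MPI_Init(&argc, &argv);

  int rank, size;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  /* Walk the communicator size across the suspected threshold. */
  for (int n = 508; n <= 516 && n <= size; n++)
  {
    MPI_Comm sub;
    int color = (rank < n) ? 0 : MPI_UNDEFINED; /* first n ranks participate */
    MPI_Comm_split(MPI_COMM_WORLD, color, rank, &sub);

    if (sub != MPI_COMM_NULL)
    {
      double r = 1.2, o;
      MPI_Request req;

      MPI_Iallreduce(&r, &o, 1, MPI_DOUBLE, MPI_SUM, sub, &req);
      MPI_Wait(&req, MPI_STATUS_IGNORE);

      MPI_Comm_free(&sub);
    }

    if (rank == 0)
      printf("MPI_Iallreduce OK at %d ranks\n", n);
  }

  MPI_Finalize();

  return 0;
}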

It feels like MVAPICH is switching algorithms at that size or
something... and the one it's switching to isn't happy!
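One way to test that theory (just a sketch; I haven't tried it yet) would
be to swap the blocking call into the test program at the bottom of this
email, replacing the MPI_Iallreduce/MPI_Wait pair:

  /* Same reduction, but via the blocking call; if this survives at
   * 513 ranks, the bug is specific to the nonblocking path. */
  MPI_Allreduce(&r, &o, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);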

I'm on an SGI ICE-X cluster with Mellanox ConnectX-3 (MT27500 family) FDR
InfiniBand cards.

My test application is down at the bottom of this email.  Using it, I've
found that MVAPICH2/2.0.1 and MVAPICH2/2.1 both segfault...
while MVAPICH2/1.9 does NOT.  I haven't tried 2.2 yet, but I'll try to do
that tomorrow.

Any advice?  Maybe there's a compile switch we missed or a runtime option I
should try?
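(One thing the user guide mentions that might help diagnose this:
setting MV2_DEBUG_SHOW_BACKTRACE=1 should print a backtrace from the
failing ranks, assuming I'm reading the docs right.  If there's a better
knob for collective algorithm selection, please point me at it.)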

Thanks for any help!

Derek


#include <mpi.h>

int main(int argc, char** argv)
{
  MPI_Init(&argc, &argv);

  double r = 1.2;   /* local contribution */
  double o;         /* reduced result */

  MPI_Request req;
  MPI_Status  stat;

  /* Nonblocking allreduce immediately followed by a wait: this is
   * the call that segfaults at 513+ processes. */
  MPI_Iallreduce(&r, &o, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD, &req);

  MPI_Wait(&req, &stat);

  MPI_Finalize();

  return 0;
}
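For reference, I build and launch the test with the stock wrappers from
each MVAPICH2 install, along the lines of (the filename is arbitrary):

  mpicc test.c -o test
  mpirun -np 513 ./test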