[mvapich-discuss] mvapich2 and RDMA CM Address error

Bryan Putnam bfp at purdue.edu
Wed Nov 4 16:09:47 EST 2009


With both mvapich2-1.4rc2 and mvapich2-1.4, we've been seeing the following 
error when attempting to use a large number of processors, for example 
256 processors (8 processors on each of 32 nodes).

coates-a029 1005% mpiexec -np 256 ./helloc
Hello from process 0!
Hello from process 1!
Hello from process 2!
Hello from process 3!
Hello from process 4!
Hello from process 5!
Hello from process 6!
Hello from process 7!
Hello from process 8!
Hello from process 9!
Hello from process 10!
Hello from process 11!
Hello from process 12!
[247] Abort: RDMA CM Address error: rdma cma event 1, error -110
 at line 341 in file rdma_cm.c


I just wanted to check whether this is a known problem with a solution.

Thanks!
Bryan

