[mvapich-discuss] Problem with mvapich2 using CUDA

Michael Haidl michael.haidl at gmx.de
Tue Feb 7 12:56:55 EST 2012


I have the following problem:

I am running a simulation with mvapich2 and CUDA support on 2 nodes. 
Each node has 5 GPUs. The nodes are connected via InfiniBand. With a 
small problem size the simulation must send 1.2 MB from process to 
process every loop (~76 loops per second). This works! If I increase the 
problem size, which also increases the amount of data transferred (now: 
15.2 MB per loop with ~3 loops per second), I get the following:

[4] Abort: Cuda Stream Creation failed
  at line 73 in file ibv_cuda_stream.c

This is reproducible after ~740 loops.
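
For reference, here is a rough back-of-the-envelope calculation from the figures above, assuming the loop rates stay roughly constant (the variable and function names are only illustrative). It shows that the failing configuration actually moves less data per second than the working one, so the abort may depend on the message size or on the number of transfers rather than on sustained throughput; that is only a guess on my part.

```python
def data_rate_mb_s(mb_per_loop, loops_per_s):
    """Per-process data rate implied by a message size and a loop rate."""
    return mb_per_loop * loops_per_s

# Figures quoted in this message.
small_rate = data_rate_mb_s(1.2, 76)   # small problem size (works)
large_rate = data_rate_mb_s(15.2, 3)   # large problem size (aborts)

loops_to_failure = 740                 # approximate loop count at the abort
seconds_to_failure = loops_to_failure / 3
mb_before_failure = loops_to_failure * 15.2

print(f"small problem: {small_rate:.1f} MB/s")
print(f"large problem: {large_rate:.1f} MB/s")
print(f"time until abort: ~{seconds_to_failure:.0f} s")
print(f"data sent before abort: ~{mb_before_failure:.0f} MB")
```

So the abort shows up after roughly four minutes and about 11 GB of transferred data per process.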

I also tried MV2_CUDA_EVENT_SYNC, with the same result, except that 
Event Creation failed instead of Stream Creation.

My start-up command looks like this:
mpirun_rsh -hostfile hosts -np 10 MV2_USE_CUDA=1 MV2_ENABLE_AFFINITY=1 
./sim.x --mpi

Any advice would be highly appreciated.

Michael Haidl