[mvapich-discuss] Error registering memory with CUDA

Adam T. Moody moody20 at llnl.gov
Fri Jul 19 20:49:18 EDT 2013


Hello MVAPICH team,
Someone is running on a system using MVAPICH2-1.9 with CUDA enabled, but 
about 90% of his runs fail with the following error:

[edge42:mpi_rank_0][ibv_cuda_register] src/mpid/ch3/channels/mrail/src/gen2/ibv_cuda_util.c:704: cudaHostRegister Failed
[edge42:mpi_rank_1][ibv_cuda_register] src/mpid/ch3/channels/mrail/src/gen2/ibv_cuda_util.c:704: cudaHostRegister Failed
[edge63:mpi_rank_2][ibv_cuda_register] src/mpid/ch3/channels/mrail/src/gen2/ibv_cuda_util.c:704: cudaHostRegister Failed
[edge63:mpi_rank_3][ibv_cuda_register] src/mpid/ch3/channels/mrail/src/gen2/ibv_cuda_util.c:704: cudaHostRegister Failed

Have you seen this before?  Do you know why it might happen?
Thanks,
-Adam