[mvapich-discuss] Error registering memory with CUDA

sreeram potluri potluri at cse.ohio-state.edu
Fri Jul 19 21:46:18 EDT 2013


Hi Adam,

I have seen this error before when a user tries to share a GPU between two
processes while the GPU is set to thread-exclusive or process-exclusive
compute mode. Can you check with the user whether this is the case?
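
One way to check is to print the compute mode of each GPU on the affected
nodes. The program below is only a minimal sketch, assuming the CUDA runtime
API (CUDA 5.0 or later, for cudaDeviceGetAttribute) is installed there; the
file name and the mode_name helper are made up for illustration and are not
part of MVAPICH2.

```cuda
// check_compute_mode.cu (hypothetical file name)
// Prints the compute mode of every GPU visible to this process.
#include <cstdio>
#include <cuda_runtime.h>

// Illustrative helper: map the cudaComputeMode value to a readable label.
static const char *mode_name(int mode) {
    switch (mode) {
    case cudaComputeModeDefault:          return "default (shared)";
    case cudaComputeModeExclusive:        return "thread exclusive";
    case cudaComputeModeProhibited:       return "prohibited";
    case cudaComputeModeExclusiveProcess: return "process exclusive";
    default:                              return "unknown";
    }
}

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount: %s\n", cudaGetErrorString(err));
        return 1;
    }
    for (int dev = 0; dev < count; ++dev) {
        int mode = -1;
        // Query the attribute directly instead of going through cudaDeviceProp.
        err = cudaDeviceGetAttribute(&mode, cudaDevAttrComputeMode, dev);
        if (err != cudaSuccess) {
            fprintf(stderr, "GPU %d: %s\n", dev, cudaGetErrorString(err));
            continue;
        }
        printf("GPU %d: compute mode = %s\n", dev, mode_name(mode));
    }
    return 0;
}
```

If this reports an exclusive mode on a node where two ranks are placed on the
same GPU, that would be consistent with the failure described above.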

This can also happen in other cases, such as when the devices have not been
initialized properly using deviceQuery. However, I suspect the compute-mode
issue is the cause here.
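
To see the underlying CUDA error outside of MPI, a small standalone test can
call cudaHostRegister itself and print the error string. This is a rough
sketch under my own assumptions (device 0, a 1 MiB page-aligned buffer, a
Linux node with posix_memalign); it is not how MVAPICH2 registers its
buffers.

```cuda
// host_register_test.cu (hypothetical file name)
// Tries to pin a host buffer with cudaHostRegister and reports any error.
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Illustrative helper: print the failing call and the CUDA error string.
static int report(const char *what, cudaError_t err) {
    fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
    return 1;
}

int main() {
    const size_t bytes = 1 << 20;  // assumed test size: 1 MiB

    cudaError_t err = cudaSetDevice(0);  // assumed device index
    if (err != cudaSuccess) return report("cudaSetDevice", err);

    // cudaFree(0) forces context creation, so an exclusive-mode conflict
    // shows up here rather than inside cudaHostRegister.
    err = cudaFree(0);
    if (err != cudaSuccess) return report("context creation", err);

    void *buf = NULL;
    if (posix_memalign(&buf, 4096, bytes) != 0) {
        fprintf(stderr, "posix_memalign failed\n");
        return 1;
    }

    err = cudaHostRegister(buf, bytes, cudaHostRegisterPortable);
    if (err != cudaSuccess) {
        free(buf);
        return report("cudaHostRegister", err);
    }

    printf("cudaHostRegister succeeded\n");
    cudaHostUnregister(buf);
    free(buf);
    return 0;
}
```

Starting two copies of this at the same time on one node roughly mimics two
ranks sharing a GPU; if the second copy fails while the first is running, the
compute mode is the likely culprit.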

Best
Sreeram Potluri

On Fri, Jul 19, 2013 at 8:49 PM, Adam T. Moody <moody20 at llnl.gov> wrote:

> Hello MVAPICH team,
> Someone is running on a system using MVAPICH2-1.9 with CUDA enabled, but
> he is sometimes (90% of his runs) failing with the following error.
>
> [edge42:mpi_rank_0][ibv_cuda_register] src/mpid/ch3/channels/mrail/src/gen2/ibv_cuda_util.c:704:
> cudaHostRegister Failed
> [edge42:mpi_rank_1][ibv_cuda_register] src/mpid/ch3/channels/mrail/src/gen2/ibv_cuda_util.c:704:
> cudaHostRegister Failed
> [edge63:mpi_rank_2][ibv_cuda_register] src/mpid/ch3/channels/mrail/src/gen2/ibv_cuda_util.c:704:
> cudaHostRegister Failed
> [edge63:mpi_rank_3][ibv_cuda_register] src/mpid/ch3/channels/mrail/src/gen2/ibv_cuda_util.c:704:
> cudaHostRegister Failed
>
> Have you seen this before?  Do you know why it might happen?
> Thanks,
> -Adam