[mvapich-discuss] (no subject)

khaled hamidouche hamidouc at cse.ohio-state.edu
Fri Nov 27 08:32:54 EST 2015


Hi John,

Sorry to hear that you are facing an issue with MV2-GDR. To help us debug
it:

1) Can you please try the latest MV2-GDR 2.2a and let us know whether the
issue still occurs?
2) Can you please provide a reproducer so that we can debug the issue
locally?

Thanks

On Fri, Nov 27, 2015 at 5:13 AM, John Donners <john.donners at surfsara.nl>
wrote:

> Dear developers,
>
> I downloaded and installed the mvapich2-gdr-2.1 RPM.
> I'm getting an error in MPI_Finalize when running an application of one
> of our users:
>
> [gcn1:mpi_rank_0][ibv_cuda_unregister]
> src/mpid/ch3/channels/mrail/src/gen2/ibv_cuda_util.c:1543:
> cudaHostUnegister Failed: Invalid argument (22)
>
> and a backtrace of the resulting core dump gives:
>
> (gdb) bt
> #0  0x00002b58cf546eba in ?? () from /usr/lib64/libcuda.so.1
> #1  0x00002b58cf555954 in ?? () from /usr/lib64/libcuda.so.1
> #2  0x00002b58cf4c9b1f in ?? () from /usr/lib64/libcuda.so.1
> #3  0x00002b58cf48ffb2 in cuStreamDestroy_v2 () from /usr/lib64/libcuda.so.1
> #4  0x00002b58ccd2e4f0 in ?? () from /hpc/sw/cuda/7.0.28//lib64/libcudart.so.7.0
> #5  0x00002b58ccd6332d in cudaStreamDestroy () from /hpc/sw/cuda/7.0.28//lib64/libcudart.so.7.0
> #6  0x00002b58cd54bd31 in deallocate_cuda_rndv_streams () from /hpc/sw/mvapich2-gdr-2.1-cuda70-intel/lib64/libmpi.so.12
> #7  0x00002b58cd548677 in cuda_cleanup () from /hpc/sw/mvapich2-gdr-2.1-cuda70-intel/lib64/libmpi.so.12
> #8  0x00002b58cd4cd3c7 in MPID_Finalize () from /hpc/sw/mvapich2-gdr-2.1-cuda70-intel/lib64/libmpi.so.12
> #9  0x00002b58cd426be2 in PMPI_Finalize () from /hpc/sw/mvapich2-gdr-2.1-cuda70-intel/lib64/libmpi.so.12
> #10 0x00000000004093f9 in main (argc=1, argv=0x7fff67704158) at lbe.c:1301
>
> At first sight, this looks like an issue with the memory cleanup in mvapich2.
> Let me know if I can help with further investigation of this issue.
>
> With regards,
> John
>
> HPC Center SURFsara, Amsterdam