[mvapich-discuss] problem with mvapich2.ofa + CUDA

Matthew Koop koop at cse.ohio-state.edu
Thu Mar 13 13:27:16 EDT 2008


Brian,

You can try compiling MVAPICH2 with -DDISABLE_PTMALLOC added to the CFLAGS
in the make.mvapich2.ofa script. It may be that the ptmalloc-based malloc
library bundled with MVAPICH2 is interfering with the cudaMalloc() calls
you are making.

Let us know if this helps. Also, let us know if you have any reproducers
that we can look at.
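
A reproducer could start from something like the sketch below. It is
only a starting point under stated assumptions, not a known failing
case: the function names (try_gpu_alloc, try_gpu_alloc_in_thread) are
hypothetical, the library name "libcudart.so" is a guess at what the
CUDA runtime is called on your system, and cudaError_t is treated as a
plain int. The CUDA runtime is loaded with dlopen() and the allocation
is repeated from a spawned thread, to mirror the sequence described in
the original report. MPI is left out so the file builds on its own.

```c
/* Sketch of a reproducer helper; names here are hypothetical. */
#include <dlfcn.h>
#include <pthread.h>
#include <stddef.h>
#include <stdio.h>

/* Returned when the allocation could not even be attempted. */
#define REPRO_UNAVAILABLE (-1)

/* Assumption: cudaError_t (an enum) is ABI-compatible with int. */
typedef int (*cuda_malloc_fn)(void **, size_t);
typedef int (*cuda_free_fn)(void *);

/* Loads the CUDA runtime with dlopen() and tries one cudaMalloc().
 * Returns REPRO_UNAVAILABLE if the runtime or its symbols are missing,
 * otherwise the value returned by cudaMalloc() (0 is cudaSuccess). */
int try_gpu_alloc(size_t bytes)
{
    /* Assumption: the runtime is installed under this name. */
    void *lib = dlopen("libcudart.so", RTLD_NOW | RTLD_GLOBAL);
    if (lib == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return REPRO_UNAVAILABLE;
    }

    cuda_malloc_fn gpu_malloc = NULL;
    cuda_free_fn gpu_free = NULL;
    *(void **)(&gpu_malloc) = dlsym(lib, "cudaMalloc");
    *(void **)(&gpu_free) = dlsym(lib, "cudaFree");
    if (gpu_malloc == NULL || gpu_free == NULL) {
        fprintf(stderr, "cudaMalloc/cudaFree not found\n");
        dlclose(lib);
        return REPRO_UNAVAILABLE;
    }

    void *dev = NULL;
    int rc = gpu_malloc(&dev, bytes);
    if (rc == 0)
        gpu_free(dev);
    /* The library handle is left open on purpose, as a long-running
     * application would keep the runtime loaded. */
    return rc;
}

struct alloc_job {
    size_t bytes;
    int rc;
};

static void *alloc_worker(void *arg)
{
    struct alloc_job *job = arg;
    job->rc = try_gpu_alloc(job->bytes);
    return NULL;
}

/* Same allocation attempt, made from a freshly spawned thread. */
int try_gpu_alloc_in_thread(size_t bytes)
{
    struct alloc_job job = { bytes, REPRO_UNAVAILABLE };
    pthread_t tid;
    if (pthread_create(&tid, NULL, alloc_worker, &job) != 0)
        return REPRO_UNAVAILABLE;
    pthread_join(tid, NULL);
    return job.rc;
}
```

To use it, call try_gpu_alloc() right after MPI_Init_thread() in your
own main(), then call try_gpu_alloc_in_thread() after the point where
the application opens its libraries, and compare the two return values.
Linking will likely need -ldl and -lpthread in addition to whatever the
MPI compiler wrapper adds.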

Thanks,

Matt

On Tue, 11 Mar 2008, Brian Budge wrote:

> Hi all -
>
> I have an application that uses MPI over InfiniBand and also uses
> CUDA on NVIDIA graphics cards.  The program can be configured with or
> without MPI (without MPI, it is limited to a single node), and it can
> run with or without GPUs.
>
> The problem appears when I am using MPI over IB together with GPUs:
> essentially, allocation of GPU memory fails in this configuration.  If
> I use mvapich2 over TCP, the problem doesn't show up (even though I am
> running IP over IB).  Likewise, if I don't use GPUs, my program works
> fine.
>
> A bit more detail about the failing allocation:  The function is
> cudaMalloc(), available through the CUDA runtime libraries.  These
> calls succeed up to a certain stage in my program, which happens to
> be after several dynamic libraries are opened via dlopen() and after
> a thread is spawned.
>
> I am running mvapich2 in multithreaded mode, calling MPI_Init_thread()
> soon after main() is entered.
>
> I have tried some fairly minimal reproduction cases, and I can't seem
> to make them fail.  I may have to try something a bit more
> complicated.  However, in the meantime, can anyone suggest what might
> be broken?  Perhaps I've misconfigured mvapich with IB?
>
> I'm running Gentoo Linux with kernel version 2.6.24, and InfiniBand is
> built into the kernel along with the Mellanox drivers (I have
> InfiniHost cards).
>
> Thanks for any suggestions,
>   Brian