[mvapich-discuss] problems with MPI + GPU

Brian Budge brian.budge at gmail.com
Wed Jan 9 11:25:37 EST 2008


Hi DK -

I just rebuilt mvapich2 with tcp instead of ofa, and now my program reliably
executes.  I'll post something to the OFED list if I can find it.
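
For anyone who wants to put numbers on "fails about 75% of the time" versus "reliably executes", here is a minimal sketch of a harness that launches the same command repeatedly and reports the fraction of failed runs. It is an illustration, not something from the original runs: the function name failure_rate, the run count, and the timeout are all hypothetical, and it assumes the application exits with a non-zero status when GPU initialization fails.

```python
import subprocess


def failure_rate(cmd, runs=20, timeout=60):
    """Run `cmd` `runs` times and return the fraction of runs that failed.

    A run counts as failed if it exits with a non-zero status or does not
    finish within `timeout` seconds. This assumes (hypothetically) that the
    program under test signals a failed GPU initialization through its
    exit code.
    """
    failures = 0
    for _ in range(runs):
        try:
            result = subprocess.run(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout,
            )
            if result.returncode != 0:
                failures += 1
        except subprocess.TimeoutExpired:
            # A hung run is treated the same as a failed one.
            failures += 1
    return failures / runs


# Hypothetical usage, with a placeholder launch command and binary name:
#   rate = failure_rate(["mpiexec", "-n", "2", "./my_app"], runs=40)
#   print(f"{rate:.0%} of runs failed")
```

Running it once against each build (ofa and tcp) with the same run count would give two failure rates that can be compared directly.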

Thanks,
  Brian

On Jan 8, 2008 9:11 PM, Dhabaleswar Panda <panda at cse.ohio-state.edu> wrote:

> Hi Brian,
>
> > Hi all -
> >
> > Sorry for all the traffic, but I'm getting very close to being able to
> > reliably run my application with mvapich2.
>
> Good to know this.
>
> > The problem I am having now is with GPUs.   I am running an application
> > which uses GPUs and the CUDA programming environment to accelerate
> > computation.  It's exciting stuff, and depending on the problem, I see 2
> > to 6x speedup (I am running a ray tracing type application).  Everything
> > works if I run without MPI, but if I run with mvapich2, my GPU
> > initialization fails about 75% of the time, making my runs quite
> > unreliable.  In the 25% when the device initializes, everything else
> > works fine.
>
> Unfortunately, we have not tested MVAPICH2 + IB (OFED) + GPU (with CUDA).
> If anybody else on this list has experience running MVAPICH2 in this
> mode, please share it.
>
> You can also post a note regarding this to the OFED general list.
>
> > Now, I'm not sure what could possibly cause this, and I could see this
> > problem cropping up due to any of the following factors:
> >
> > 1) bug in mvapich2
> > 2) bug in CUDA
> > 3) bug in OFED IB stuff
> >
> > Does anyone have any ideas how to even begin tracking this down?  Could
> > it be something like infiniband device initialization walking into
> > NVIDIA's memory space?
> >   I'm grasping at straws here ;)
>
> Can you run basic MPICH2 (from Argonne) with Ethernet + GPU (with CUDA)?
> This will help isolate issues specific to IB/OFED and provide more
> insight into this problem.
>
> Thanks,
>
> DK
>
> > Thanks,
> >   Brian
> >
>
>