[mvapich-discuss] problems with MPI + GPU
Dhabaleswar Panda
panda at cse.ohio-state.edu
Wed Jan 9 00:11:16 EST 2008
Hi Brian,
> Hi all -
>
> Sorry for all the traffic, but I'm getting very close to being able to
> reliably run my application with mvapich2.
Good to know this.
> The problem I am having now is with GPUs. I am running an application
> which uses GPUs and the CUDA programming environment to accelerate
> computation. It's exciting stuff, and depending on the problem, I see 2 to
> 6x speedup (I am running a ray tracing type application). Everything works
> if I run without MPI, but if I run with mvapich2, my GPU initialization
> fails about 75% of the time, making my runs quite unreliable. In the 25%
> of runs where the device does initialize, everything else works fine.
Unfortunately, we have not tested MVAPICH2 + IB (OFED) + GPU (with CUDA).
If anybody else on this list has experience running MVAPICH2 in this
mode, please share it.
You can also post a note about this to the OFED general list.
> Now, I'm not sure what could possibly cause this, and I could see this
> problem cropping up due to any of the following factors:
>
> 1) bug in mvapich2
> 2) bug in CUDA
> 3) bug in OFED IB stuff
>
> Does anyone have any ideas how to even begin tracking this down? Could it
> be something like infiniband device initialization walking into NVIDIA's
> memory space?
> I'm grasping at straws here ;)
Can you run basic MPICH2 (from Argonne) with Ethernet + GPU (with CUDA)?
This will show whether the problem is specific to IB/OFED and provide more
insight into it.
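One way to make such a comparison concrete is to measure the failure rate of
each configuration over many launches. The following is only a minimal sketch
in Python, not something from the application itself: it assumes the program
exits with a non-zero status when GPU initialization fails, and the helper
name `failure_rate` is made up for illustration.

```python
import subprocess


def failure_rate(cmd, runs=20, timeout=300):
    """Run `cmd` (a list of strings) `runs` times and return the
    fraction of runs that failed.

    Assumption: the program reports a failed GPU initialization through
    a non-zero exit status. A run that exceeds `timeout` seconds is
    also counted as a failure.
    """
    failures = 0
    for _ in range(runs):
        try:
            result = subprocess.run(
                cmd,
                timeout=timeout,
                stdout=subprocess.DEVNULL,  # discard application output
                stderr=subprocess.DEVNULL,
            )
            if result.returncode != 0:
                failures += 1
        except subprocess.TimeoutExpired:
            failures += 1
    return failures / runs
```

For example, calling `failure_rate(["./raytracer"])` and then
`failure_rate(["mpiexec", "-n", "4", "./raytracer"])` under each MPI build
(the binary name and process count here are hypothetical placeholders) would
show whether the roughly 75% failure rate follows the MPI library, the
interconnect, or neither.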
Thanks,
DK
> Thanks,
> Brian
>