[mvapich-discuss] GPU affinity and clusters with multi-GPU nodes

Rustico, Eugenio eugenio.rustico at baw.de
Tue Apr 29 04:14:14 EDT 2014


Sure:

$ mpirun -outfile-pattern $LOGFILE -prepend-rank -hostfile hostfile_gpusph -np $NPROCS $COMMAND
(hostfile_gpusph contains 8 hosts connected with IB)

$ ll $(which mpirun)
mpirun -> mpiexec.hydra

$ grep MV2 $LOGFILE
[1] NetworkManager: MV2_COMM_WORLD_LOCAL_RANK is 0
[2] NetworkManager: MV2_COMM_WORLD_LOCAL_RANK is 0
[3] NetworkManager: MV2_COMM_WORLD_LOCAL_RANK is 0
[0] NetworkManager: MV2_COMM_WORLD_LOCAL_RANK is 0

Please note that even if MV2_COMM_WORLD_LOCAL_RANK were non-zero, it would be
pretty difficult to set the CUDA device before MPI is initialized, due to the
current encapsulation. Also, the list of CUDA devices is user-defined through
the command line.
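
Just for context, what we would do if the variable were usable is roughly the
following (a sketch only; the fallback to device 0 and the names are mine, not
our actual code):

  /* Sketch: pick the GPU from the launcher-provided local rank before
   * MPI is initialized. In our runs the variable is always "0", so all
   * ranks on a node would end up on the same device. */
  #include <stdlib.h>
  #include <cuda_runtime.h>
  #include <mpi.h>

  int main(int argc, char **argv)
  {
      const char *lrank = getenv("MV2_COMM_WORLD_LOCAL_RANK");
      int dev = lrank ? atoi(lrank) : 0;  /* fallback to device 0 is arbitrary */
      cudaSetDevice(dev);                 /* before MPI_Init, since 1.9 does not
                                             support selecting the device later */

      int provided;
      MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
      /* ... application ... */
      MPI_Finalize();
      return 0;
  }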

Staging the transfers on the host makes it work. I also wonder whether
MV2_ENABLE_AFFINITY has any influence on this. If it is of any help, MPI is
initialized with

  MPI_Init_thread(NULL, NULL, MPI_THREAD_MULTIPLE, &result);
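
For completeness, the host staging I mentioned boils down to something like
this (again only a sketch; exchange_with_peer, the buffer names and the float
type are placeholders, not our actual code):

  /* Sketch of the host-staging workaround: instead of passing device
   * pointers to MPI, copy each buffer through host memory around the
   * transfer. */
  #include <mpi.h>
  #include <cuda_runtime.h>

  static void exchange_with_peer(const float *d_send, float *d_recv,
                                 float *h_send, float *h_recv,
                                 int count, int peer)
  {
      cudaMemcpy(h_send, d_send, count * sizeof(float), cudaMemcpyDeviceToHost);
      MPI_Sendrecv(h_send, count, MPI_FLOAT, peer, 0,
                   h_recv, count, MPI_FLOAT, peer, 0,
                   MPI_COMM_WORLD, MPI_STATUS_IGNORE);
      cudaMemcpy(d_recv, h_recv, count * sizeof(float), cudaMemcpyHostToDevice);
  }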

Thank you,
Eugenio Rustico


> Jonathan Perkins <perkinjo at cse.ohio-state.edu> wrote on 28 April 2014 at
> 20:08:
>
>
> Thanks for the note. It's surprising that MV2_COMM_WORLD_LOCAL_RANK
> is always being detected as 0. Can you please share how you are
> launching the jobs (which launcher are you using in particular)?
>
> On Mon, Apr 28, 2014 at 9:07 AM, Rustico, Eugenio
> <eugenio.rustico at baw.de> wrote:
> > Hello,
> >
> > I work on a cluster of 2-GPU nodes running MVAPICH2 1.9. I have one thread
> > per device, and arbitrary pairs of devices need to exchange data over
> > the network. Device buffer pointers are passed to MPI directly.
> >
> > If I run a 4-GPU simulation over 2 nodes, no error is encountered. The same
> > holds if I run a single-GPU, multi-node simulation with up to 8 nodes.
> > However, as soon as I run a multi-GPU simulation over 3 or more nodes (so
> > 3 x 2, 4 x 2 GPUs, and so on), it crashes with:
> >
> > [MPIDI_CH3I_MRAILI_Process_cuda_finish]
> > src/mpid/ch3/channels/mrail/src/gen2/ibv_cuda_rndv.c:865: cudaEventRecord
> > failed
> >
> > I read that setting the CUDA device after MPI_Init() is supported only from
> > version 2.0 on, and when I evaluate MV2_COMM_WORLD_LOCAL_RANK it is always 0.
> > My guess is that the problem is wrong GPU affinity, i.e. MVAPICH tries to use
> > the wrong GPU.
> >
> > Is there any way to use multiple GPUs with version 1.9, e.g. by setting an
> > environment variable? Otherwise, I guess I will have to stage the transfers
> > on the host and add a cudaMemcpy() after each transfer.
> >
> > Thanks,
> > Eugenio Rustico
> >
>
>
>
> --
> Jonathan Perkins
> http://www.cse.ohio-state.edu/~perkinjo
>
>