[mvapich-discuss] GPU affinity and clusters with multi-GPU nodes

Jonathan Perkins perkinjo at cse.ohio-state.edu
Wed Apr 30 11:51:57 EDT 2014


We discussed this issue offline. In this application, a single MPI process
tries to use multiple GPU devices, which is not currently supported in
MVAPICH2. We are closing this request and updating the list for everyone's
information.
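
For completeness, the supported pattern is one MPI process per GPU, with the
device selected from the launcher-provided local rank before MPI_Init(). A
minimal sketch follows (the modulo mapping of local rank to device is an
assumption about how ranks should map to the GPUs on a node):

  /* Sketch: pick the GPU before MPI_Init() so that MVAPICH2's CUDA
   * support binds to the same device the application uses. */
  #include <stdlib.h>
  #include <mpi.h>
  #include <cuda_runtime.h>

  int main(int argc, char **argv)
  {
      int local_rank = 0, dev_count = 0;
      const char *lr = getenv("MV2_COMM_WORLD_LOCAL_RANK");
      if (lr != NULL)
          local_rank = atoi(lr);

      cudaGetDeviceCount(&dev_count);
      if (dev_count > 0)
          cudaSetDevice(local_rank % dev_count);  /* assumed mapping */

      MPI_Init(&argc, &argv);
      /* ... application code: device buffers can now be passed to MPI ... */
      MPI_Finalize();
      return 0;
  }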

On Tue, Apr 29, 2014 at 4:14 AM, Rustico, Eugenio
<eugenio.rustico at baw.de> wrote:
> Sure:
>
> $ mpirun -outfile-pattern $LOGFILE -prepend-rank -hostfile hostfile_gpusph
> -np $NPROCS $COMMAND
> (hostfile_gpusph contains 8 hosts connected with IB)
>
> $ ll $(which mpirun)
> mpirun -> mpiexec.hydra
>
> $ grep MV2 $LOGFILE
> [1] NetworkManager: MV2_COMM_WORLD_LOCAL_RANK is 0
> [2] NetworkManager: MV2_COMM_WORLD_LOCAL_RANK is 0
> [3] NetworkManager: MV2_COMM_WORLD_LOCAL_RANK is 0
> [0] NetworkManager: MV2_COMM_WORLD_LOCAL_RANK is 0
>
> Please note that even if MV2_COMM_WORLD_LOCAL_RANK were non-zero, it would be
> quite difficult to set the CUDA device before MPI is initialized, due to
> the current encapsulation. Also, the CUDA device list is user-defined
> via the command line.
>
> Staging the transfers on host makes it work (a simplified sketch of that
> path follows below). I also wonder whether MV2_ENABLE_AFFINITY has any
> influence on this. If it is of any help, MPI is initialized with
>
>   MPI_Init_thread(NULL, NULL, MPI_THREAD_MULTIPLE, &result);
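>
> For reference, the host-staged path is roughly the following (simplified and
> illustrative; the helper and buffer names are hypothetical, and mpi.h and
> cuda_runtime.h are assumed to be included):
>
>   /* Exchange nbytes between device buffers via host staging buffers. */
>   void exchange_staged(void *d_send, void *d_recv,
>                        void *h_send, void *h_recv,
>                        size_t nbytes, int peer)
>   {
>       cudaMemcpy(h_send, d_send, nbytes, cudaMemcpyDeviceToHost);
>       MPI_Sendrecv(h_send, (int) nbytes, MPI_BYTE, peer, 0,
>                    h_recv, (int) nbytes, MPI_BYTE, peer, 0,
>                    MPI_COMM_WORLD, MPI_STATUS_IGNORE);
>       cudaMemcpy(d_recv, h_recv, nbytes, cudaMemcpyHostToDevice);
>   }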
>
> Thank you,
> Eugenio Rustico
>
>
>> On 28 April 2014 at 20:08, Jonathan Perkins <perkinjo at cse.ohio-state.edu>
>> wrote:
>>
>>
>> Thanks for the note. It's surprising that MV2_COMM_WORLD_LOCAL_RANK
>> is always being detected as 0. Can you please share how you are
>> launching the jobs (which launcher are you using in particular)?
>>
>> On Mon, Apr 28, 2014 at 9:07 AM, Rustico, Eugenio
>> <eugenio.rustico at baw.de> wrote:
>> > Hello,
>> >
>> > I work on a cluster of 2-GPU nodes running MVAPICH2 1.9. I have one
>> > thread per device, and arbitrary pairs of devices need to exchange data
>> > over the network. Device buffer pointers are passed to MPI calls directly.
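>> >
>> > For example, a send looks essentially like this (names are illustrative):
>> >
>> >   /* device pointer handed straight to MPI (CUDA-aware path) */
>> >   MPI_Isend(d_buf, count, MPI_FLOAT, peer, tag, MPI_COMM_WORLD, &req);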
>> >
>> > If I run a 4-GPU simulation over 2 nodes, no error occurs. The same holds
>> > for a single-GPU-per-node simulation with up to 8 nodes. However, as soon
>> > as I run a multi-GPU simulation over 3 or more nodes (so 3 * 2 GPUs,
>> > 4 * 2 GPUs, and so on), it crashes with:
>> >
>> > [MPIDI_CH3I_MRAILI_Process_cuda_finish]
>> > src/mpid/ch3/channels/mrail/src/gen2/ibv_cuda_rndv.c:865:
>> > cudaEventRecord
>> > failed
>> >
>> > I read that setting the CUDA device after MPI_Init() is supported only
>> > from version 2.0 on, and when I evaluate MV2_COMM_WORLD_LOCAL_RANK it is
>> > always 0. My guess is that the problem is wrong GPU affinity, i.e.
>> > MVAPICH2 tries to use the wrong GPU.
>> >
>> > Is there any way to use multiple GPUs with version 1.9, e.g. by setting
>> > an environment variable? Otherwise, I guess I will have to stage the
>> > transfers on the host and add a cudaMemcpy() after each transfer.
>> >
>> > Thanks,
>> > Eugenio Rustico
>> >
>>
>>
>>
>> --
>> Jonathan Perkins
>> http://www.cse.ohio-state.edu/~perkinjo
>>
>>
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mailman.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>



-- 
Jonathan Perkins
http://www.cse.ohio-state.edu/~perkinjo


