[mvapich-discuss] MVAPICH 2.0a

Sreeram Potluri potluri at cse.ohio-state.edu
Thu Sep 12 11:00:35 EDT 2013


Hi Oliver,

1) You are right regarding the first feature. It enables applications to
select the GPU device either before or after MPI_Init. This was added in
response to several requests from application developers who have to do
some arbitration after MPI_Init before they can select a GPU.
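
For example, here is a minimal sketch of the reordered initialization you
describe. This is illustrative, not from our documentation; it assumes
CUDA Fortran (the cudafor module) and a simple rank-modulo-device mapping:

program select_after_init
    use mpi
    use cudafor
    implicit none
    integer :: ierr, myid, ndev, mydev

    ! initialize MPI first; the library defers GPU-specific setup
    call MPI_Init(ierr)
    call MPI_Comm_rank(MPI_COMM_WORLD, myid, ierr)

    ! any application-specific arbitration can happen here,
    ! before a device is selected

    ! then select the device, e.g. rank modulo the number of devices
    ierr = cudaGetDeviceCount(ndev)
    mydev = mod(myid, ndev)
    ierr = cudaSetDevice(mydev)

    ! ... GPU buffers can now be used in MPI calls ...

    call MPI_Finalize(ierr)
end program select_after_init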

2) With MVAPICH2 2.0a, if you do not want to initialize the device at a
process, simply do not select one in the application (i.e., do not call
cudaSetDevice or make any other CUDA call that implicitly initializes a
context). The MPI library will not perform any GPU-specific initialization
or operations until the application has selected and initialized a device.
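
For illustration, a short sketch in the style of your code (use_gpu is a
hypothetical, application-defined test, not an MVAPICH2 routine):

! only ranks that will actually use the GPU select a device; ranks
! that skip this make no CUDA calls, so no context is created and
! the MPI library performs no GPU-specific work on their behalf
if (use_gpu(myid)) then
    ierr = cudaSetDevice(mydev)
    call acc_set_device_num(mydev, acc_device_nvidia)
    call acc_init(acc_device_nvidia)
end if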

3) Regarding the second feature, jobs can now run across a mix of nodes
with and without GPUs. Earlier versions of MVAPICH2 required that every
process have access to a GPU, i.e., that all nodes have a GPU. With
MVAPICH2 2.0a, we have removed this requirement.

I hope this answers your questions.

Best
Sreeram Potluri


On Thu, Sep 12, 2013 at 8:54 AM, Oliver Fuhrer (MeteoSwiss) <
oliver.fuhrer at ginko.ch> wrote:

> Dear MVAPICH2 team,
>
> I am interested in two features of MVAPICH2 2.0a:
> – (NEW) Dynamic CUDA initialization. Support GPU device selection after
> MPI Init
> – (NEW) Support for running on heterogeneous clusters with GPU and non-GPU
> nodes
>
> My questions would be the following:
>
> 1) From our Fortran code we currently do the following:
>
> ! get total number of ranks and current rank from environment variables
> call getenv("MV2_COMM_WORLD_SIZE", snumid)
> call getenv("MV2_COMM_WORLD_RANK", smyid)
> read(smyid, *) mydev  ! derive a device index from the rank (simplified)
>
> ! set device for CUDA runtime
> ierr = cudaSetDevice(mydev)
>
> ! set device for OpenACC runtime
> call acc_set_device_num(mydev, acc_device_nvidia)
> call acc_init(acc_device_nvidia)
>
> ! initialize MPI library
> call MPI_Init(ierr)
>
> Can we simply switch the order of the MPI_Init and device initialization
> statements, and avoid using the environment variables by querying
> MPI_Comm_rank and MPI_Comm_size instead?
>
> 2) How can we avoid device initialization from a specific MPI rank?
>
> 3) What exactly does the second bullet mean? I thought that, for example,
> sending messages from a GPU sender buffer to a CPU receiver buffer on
> another rank was already possible.
>
> Kind regards,
> Oli
>
> _________
>
> Oliver Fuhrer
> Numerical Models
>
> Federal Departement of Home Affairs FDHA
> Federal Office of Meteorology and Climatology MeteoSwiss
>
> Kraehbuehlstrasse 58, P.O. Box 514, CH-8044 Zurich, Switzerland
>
> Tel. +41 44 256 93 59
> Fax +41 44 256 92 78
> oliver.fuhrer at meteoswiss.ch
> www.meteoswiss.ch - First-hand information
>