[mvapich-discuss] avoiding the QPI bus

khaled hamidouche hamidouc at cse.ohio-state.edu
Sat Apr 16 07:23:05 EDT 2016


Hi Enrico,

We will investigate what could be going on here and see how we can
provide such support.

Thanks

On Fri, Apr 15, 2016 at 10:33 AM, Enrico Calore <enrico.calore at fe.infn.it>
wrote:

> On 04/15/2016 03:51 PM, khaled hamidouche wrote:
> > Dear Enrico,
> >
> > Assuming you do not have any Host-Host communication,
>
> What do you mean by Host-Host?
>
> Actually, in our codes all of the GPUs are organized in a logical ring,
> where each one has to communicate with both its left and right neighbor;
> thus we have all kinds of communication: intra-socket, inter-socket
> and inter-node.
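
For reference, a minimal sketch of such a ring exchange with a CUDA-aware
MPI library (such as MVAPICH2-GDR) that accepts device pointers, assuming
one GPU per MPI rank; buffer names, sizes and tags are illustrative, and
per-rank GPU selection (cudaSetDevice) is omitted:

/*
 * Sketch of a CUDA-aware ring exchange: every rank sends a device buffer
 * to both its left and right neighbours.  The MPI library decides whether
 * each transfer goes over shared memory / QPI or through an HCA.
 */
#include <mpi.h>
#include <cuda_runtime.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int left  = (rank - 1 + size) % size;   /* left neighbour in the ring  */
    int right = (rank + 1) % size;          /* right neighbour in the ring */

    const int N = 1 << 20;                  /* illustrative message size   */
    double *d_send, *d_recv_left, *d_recv_right;
    cudaMalloc((void **)&d_send,       N * sizeof(double));
    cudaMalloc((void **)&d_recv_left,  N * sizeof(double));
    cudaMalloc((void **)&d_recv_right, N * sizeof(double));

    /* Send to the right while receiving from the left, then the reverse. */
    MPI_Sendrecv(d_send, N, MPI_DOUBLE, right, 0,
                 d_recv_left, N, MPI_DOUBLE, left, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    MPI_Sendrecv(d_send, N, MPI_DOUBLE, left, 1,
                 d_recv_right, N, MPI_DOUBLE, right, 1,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    cudaFree(d_send);
    cudaFree(d_recv_left);
    cudaFree(d_recv_right);
    MPI_Finalize();
    return 0;
}
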
>
> With reference to this picture:
>
> http://www.cirrascale.com/blog/wp-content/uploads/2014/08/PCIe-Block-Diagram-Typical-8-GPU-System-720x485.png
>
> For example, if GPU0 needs to exchange data with GPU4, how can we
> control which data path is used, i.e. through QPI or through the HCAs?
>
> > then to achieve your goal, just disable shared memory
> > (MV2_USE_SHARED_MEM=0). This will ensure that all communication goes
> > through the HCA and not through QPI.
>
> We already tried setting this variable, and although the code runs
> correctly, MPI communication between GPU0 (attached to CPU socket 0)
> and GPU4 (attached to CPU socket 1) does not produce any data flowing
> over the InfiniBand fabric. (Monitoring the IB switch, we see no
> traffic, so the QPI bus must have been used.)
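
One thing worth ruling out is the variable not reaching the remote ranks.
A quick sanity check (a sketch; nothing here is MVAPICH-specific beyond the
variable name quoted above) is to have every process print it back:

/*
 * Sanity-check sketch: each rank reports whether MV2_USE_SHARED_MEM is
 * set in its environment.  This only confirms the launcher exported the
 * variable; it does not reveal which data path the library chose.
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const char *shm = getenv("MV2_USE_SHARED_MEM");
    printf("rank %d: MV2_USE_SHARED_MEM=%s\n", rank, shm ? shm : "(unset)");

    MPI_Finalize();
    return 0;
}

The variable typically has to be propagated by the launcher (for example,
passed on the launch command line or exported in the job script), not just
set in the interactive shell on the submitting node.
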
>
> > But just to clarify, MV2-GDR uses automatic HCA selection to avoid
> > the GPUDirect RDMA code path through QPI entirely. So if QPI has to
> > be used, we first copy to the host and then go over QPI (which is
> > not a big limitation).
>
> Does this also hold true when the communication occurs between GPUs
> attached to different sockets (i.e. different PCIe root complexes)
> inside the same node/machine?
>
>
> Thanks and Best Regards,
>
> Enrico
>
>

