[mvapich-discuss] avoiding the QPI bus

Enrico Calore enrico.calore at fe.infn.it
Fri Apr 15 06:43:26 EDT 2016


Hi all,
in our cluster nodes we have several NVIDIA GPUs and one Mellanox IB
card attached to each of the two sockets' PCIe root complexes.
Since we know that MPI transfers across the QPI bus perform poorly, we
would like to test the performance of our multi-GPU codes in different
scenarios.

In general, in our codes each MPI process runs on a CPU core of a
specific socket and controls one GPU attached to the PCIe root complex
of that same socket.
Kernels running on the GPUs need to exchange data with other GPUs
(attached to the same PCIe root complex, to the PCIe root complex of
the other socket, or located in another compute node) using
GPUDirect RDMA.
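
To be concrete, the process-to-GPU mapping we use is roughly the
minimal sketch below (it assumes MVAPICH2's
MV2_COMM_WORLD_LOCAL_RANK environment variable is available and that
the CUDA device enumeration order matches the socket layout of our
nodes; the actual rank-to-socket binding is done by the launcher):

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include <cuda_runtime.h>

int main(int argc, char **argv)
{
    /* Pick the GPU before MPI_Init so the process drives the device
       sitting on its own socket's PCIe root complex. */
    int local_rank = 0;
    const char *env = getenv("MV2_COMM_WORLD_LOCAL_RANK"); /* set by MVAPICH2 */
    if (env)
        local_rank = atoi(env);

    int num_devices = 1;
    cudaGetDeviceCount(&num_devices);
    cudaSetDevice(local_rank % num_devices); /* assumes enumeration order
                                                follows the socket layout */

    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    printf("rank %d (node-local %d) -> GPU %d\n",
           rank, local_rank, local_rank % num_devices);
    MPI_Finalize();
    return 0;
}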

We would like to test performance in two scenarios:

Scenario A) GPUs attached to different PCIe root complexes perform
intra-node inter-socket communication through the QPI bus.

Scenario B) GPUs attached to different PCIe root complexes perform
intra-node inter-socket communication through the InfiniBand fabric
(as shown in this picture:
http://www.cirrascale.com/blog/wp-content/uploads/2014/08/PCIe-Block-Diagram-Typical-8-GPU-System-720x485.png)
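
To compare the two scenarios we would run a simple device-to-device
ping-pong between two ranks pinned to different sockets, along the
lines of the sketch below (it assumes a CUDA-aware build, i.e. device
pointers passed directly to MPI calls with MV2_USE_CUDA=1; the
rank-to-socket pinning and any path selection are left to the
launcher):

#include <stdio.h>
#include <mpi.h>
#include <cuda_runtime.h>

#define NBYTES (4 << 20)   /* 4 MiB message */
#define ITERS  100

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Each rank is assumed to have already selected the GPU on its
       own PCIe root complex, as in the mapping sketch above. */
    void *buf;
    cudaMalloc(&buf, NBYTES);

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < ITERS; i++) {
        if (rank == 0) {
            MPI_Send(buf, NBYTES, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, NBYTES, MPI_BYTE, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, NBYTES, MPI_BYTE, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(buf, NBYTES, MPI_BYTE, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t1 = MPI_Wtime();

    if (rank == 0)
        printf("avg one-way bandwidth: %.2f MB/s\n",
               2.0 * ITERS * NBYTES / (t1 - t0) / 1e6);

    cudaFree(buf);
    MPI_Finalize();
    return 0;
}

Comparing the bandwidth measured with the two ranks on the same socket
versus on different sockets should already give a hint about the path
taken, but an explicit way to control and report it would be better,
which brings me to the question below.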


Is there a way to control whether the QPI bus is used, and a way to
report whether it was used, when using the MVAPICH2-GDR MPI library?


Thanks in advance and Best Regards,

Enrico


