[mvapich-discuss] Which mvapich version to install on a GPU cluster without Mellanox OFED ?

Subramoni, Hari subramoni.1 at osu.edu
Wed Apr 18 06:22:24 EDT 2018


Hi, Yussuf.

You can use the optimized MVAPICH2-GDR for single-node application runs even if Mellanox hardware is not present. This should give you the best performance within one node.
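
If it helps, the usage pattern is simply to pass the cudaMalloc'ed pointer straight to the MPI call once CUDA support is enabled at launch (MV2_USE_CUDA=1). A rough sketch, assuming a CUDA-aware MVAPICH2-GDR build and two ranks on one node (buffer size, tag, and rank-to-GPU mapping are placeholders, not taken from your application):

/* Minimal sketch: CUDA-aware point-to-point transfer between two ranks
 * on one node.  Assumes a CUDA-aware build (e.g. MVAPICH2-GDR) launched
 * with CUDA support enabled (MV2_USE_CUDA=1). */
#include <mpi.h>
#include <cuda_runtime.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank = 0, ndev = 1;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    cudaGetDeviceCount(&ndev);
    cudaSetDevice(rank % ndev);              /* one GPU per rank */

    const int n = 1 << 20;                   /* placeholder message size */
    float *d_buf = NULL;
    cudaMalloc((void **)&d_buf, n * sizeof(float));

    /* The device pointer goes directly into MPI; no staging through
     * host memory is needed with a CUDA-aware library. */
    if (rank == 0)
        MPI_Send(d_buf, n, MPI_FLOAT, 1, 0, MPI_COMM_WORLD);
    else if (rank == 1)
        MPI_Recv(d_buf, n, MPI_FLOAT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    cudaFree(d_buf);
    MPI_Finalize();
    return 0;
}

With mpirun_rsh the runtime parameter can be passed on the command line, for example something like: mpirun_rsh -np 2 node1 node1 MV2_USE_CUDA=1 ./a.out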

Could you please let us know the answers to the following questions?

a.       What sort of interconnect does the system have?

b.      What version of OFED is available on the system?

c.       Is GDRCopy (https://github.com/NVIDIA/gdrcopy) available on the system?

This will enable us to help you better.

Best Regards,
Hari.

From: mvapich-discuss-bounces at cse.ohio-state.edu On Behalf Of Yussuf Ali
Sent: Wednesday, April 18, 2018 12:20 AM
To: mvapich-discuss at cse.ohio-state.edu <mvapich-discuss at mailman.cse.ohio-state.edu>
Subject: [mvapich-discuss] Which mvapich version to install on a GPU cluster without Mellanox OFED ?

Dear MVAPICH user group,

We have the following GPU cluster system with 8 GPUs (GeForce 1080 Ti) but without any Mellanox hardware.

Our goal is to use MPI to send data between different GPUs directly from CUDA device buffers.
Is this possible with any MVAPICH version on this particular GPU cluster?

Output from: nvidia-smi topo -m

      GPU0  GPU1  GPU2  GPU3  GPU4  GPU5  GPU6  GPU7  CPU Affinity
GPU0   X    PIX   PHB   PHB   SYS   SYS   SYS   SYS   0-7,16-23
GPU1  PIX    X    PHB   PHB   SYS   SYS   SYS   SYS   0-7,16-23
GPU2  PHB   PHB    X    PIX   SYS   SYS   SYS   SYS   0-7,16-23
GPU3  PHB   PHB   PIX    X    SYS   SYS   SYS   SYS   0-7,16-23
GPU4  SYS   SYS   SYS   SYS    X    PIX   PHB   PHB   8-15,24-31
GPU5  SYS   SYS   SYS   SYS   PIX    X    PHB   PHB   8-15,24-31
GPU6  SYS   SYS   SYS   SYS   PHB   PHB    X    PIX   8-15,24-31
GPU7  SYS   SYS   SYS   SYS   PHB   PHB   PIX    X    8-15,24-31


X    = Self
SYS  = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
PHB  = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
PXB  = Connection traversing multiple PCIe switches (without traversing the PCIe Host Bridge)
PIX  = Connection traversing a single PCIe switch
NV#  = Connection traversing a bonded set of # NVLinks
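
For reference, whether a given pair of these GPUs can use CUDA peer-to-peer access over PCIe can be queried with cudaDeviceCanAccessPeer; a minimal sketch (device numbering assumed to match nvidia-smi):

/* Minimal sketch: query CUDA peer-to-peer access for every GPU pair.
 * Pairs connected through a PCIe switch or host bridge (PIX/PHB)
 * typically report access; pairs separated by the SMP interconnect
 * (SYS) usually do not. */
#include <stdio.h>
#include <cuda_runtime.h>

int main(void)
{
    int ndev = 0;
    cudaGetDeviceCount(&ndev);

    for (int i = 0; i < ndev; ++i) {
        for (int j = 0; j < ndev; ++j) {
            if (i == j)
                continue;
            int can = 0;
            cudaDeviceCanAccessPeer(&can, i, j);
            printf("GPU%d -> GPU%d : peer access %s\n",
                   i, j, can ? "supported" : "not supported");
        }
    }
    return 0;
}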

Thank you for your help,
Yussuf