[mvapich-discuss] [SECURITY WARNING: FREE E-MAIL] Re: Performance of CUDA Managed Memory and DeviceMemory for GDR 2.3a
Yussuf Ali
Yussuf.ali at jaea.go.jp
Fri Jan 12 02:08:04 EST 2018
Hi Ammar,
Thank you for your answer!
It is a x86 system with two nodes.
The output of “nvidia-smi topo -m” on both nodes is as follows:
|Node1|
---
# nvidia-smi topo -m
GPU0 GPU1 GPU2 GPU3 mlx5_0 mlx5_1 CPU Affinity
GPU0 X NV1 NV1 NV2 PIX SOC 0-13
GPU1 NV1 X NV2 NV1 PIX SOC 0-13
GPU2 NV1 NV2 X NV1 SOC PIX 14-27
GPU3 NV2 NV1 NV1 X SOC PIX 14-27
mlx5_0 PIX PIX SOC SOC X SOC
mlx5_1 SOC SOC PIX PIX SOC X
Legend:
X = Self
SOC = Connection traversing PCIe as well as the SMP link between CPU sockets (e.g. QPI)
PHB = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
PXB = Connection traversing multiple PCIe switches (without traversing the PCIe Host Bridge)
PIX = Connection traversing a single PCIe switch
NV# = Connection traversing a bonded set of # NVLinks
---
|Node2|
---
# nvidia-smi topo -m
GPU0 GPU1 GPU2 GPU3 mlx5_0 mlx5_1 CPU Affinity
GPU0 X NV1 NV1 NV2 PIX SOC 0-13
GPU1 NV1 X NV2 NV1 PIX SOC 0-13
GPU2 NV1 NV2 X NV1 SOC PIX 14-27
GPU3 NV2 NV1 NV1 X SOC PIX 14-27
mlx5_0 PIX PIX SOC SOC X SOC
mlx5_1 SOC SOC PIX PIX SOC X
---
Thank you for your help,
Yussuf
From: Ammar Ahmad Awan
Sent: Friday, January 12, 2018 12:48 AM
To: Yussuf Ali
Cc: mvapich-discuss at cse.ohio-state.edu
Subject: [SECURITY WARNING: FREE E-MAIL] Re: [mvapich-discuss] Performance of CUDA Managed Memory and DeviceMemory for GDR 2.3a
Hi Yussuf,
Can you please share the details of your system? Is this an OpenPOWER or an x86 system?
It will be helpful if you can share the output of 'nvidia-smi topo -m' as well.
Regards,
Ammar
On Thu, Jan 11, 2018 at 2:18 AM, Yussuf Ali <Yussuf.ali at jaea.go.jp> wrote:
Dear MVAPICH2 developers and users,
I measured the intra-node performance of our GPU cluster system (4 x NVIDIA Tesla P100-SXM2-16GB per node, CUDA 8.0) with the OSU bi-directional bandwidth benchmark (osu_bibw) using the current MVAPICH2-GDR 2.3a release.
I executed the benchmark for:
Device Memory <-> Device Memory
and
Managed Memory <-> Managed Memory
The following environment variables were set in the PBS script for both benchmarks:
_______________________________________________
export MV2_USE_CUDA=1
export MV2_GPUDIRECT_GDRCOPY_LIB=./libgdrapi.so
export MV2_USE_GPUDIRECT=1
export MV2_GPUDIRECT_GDRCOPY=1
export MV2_USE_GPUDIRECT_GDRCOPY=1
export MV2_CUDA_IPC=1
export MV2_CUDA_ENABLE_MANAGED=1
export MV2_CUDA_MANAGED_IPC=1
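For context, a sketch of how these variables were combined with the benchmark invocation (the hostfile name, benchmark path, and use of mpirun_rsh below are illustrative placeholders, not our exact script):

```shell
#!/bin/bash
# Illustrative launch sketch -- paths and hostfile are placeholders.
export MV2_USE_CUDA=1
export MV2_GPUDIRECT_GDRCOPY_LIB=./libgdrapi.so
export MV2_USE_GPUDIRECT=1
export MV2_USE_GPUDIRECT_GDRCOPY=1
export MV2_CUDA_IPC=1
export MV2_CUDA_ENABLE_MANAGED=1
export MV2_CUDA_MANAGED_IPC=1

# osu_bibw takes the buffer type of sender and receiver as arguments:
# D = device memory (cudaMalloc), M = managed memory (cudaMallocManaged).
mpirun_rsh -np 2 -hostfile ./hosts ./osu_bibw D D
mpirun_rsh -np 2 -hostfile ./hosts ./osu_bibw M M
```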
I obtained the following results:
Size (bytes)   M<->M (MB/s)   D<->D (MB/s)
1                   3.1            1.1
2                   6.1            2.2
4                  12.3            4.4
8                  24.6            8.9
16                 49.3           17.4
32                 95.3           17.2
64                182.0           34.0
128               373.7           67.3
256               663.5          130.9
512             1,211.0          250.0
1,024           1,927.6          406.9
2,048           2,490.1          653.1
4,096           3,116.4          488.6
8,192           5,528.9          481.6
16,384          8,980.7        2,528.6
32,768          1,118.2        6,553.0
65,536          2,178.6       12,729.1
131,072         4,026.9       18,738.3
262,144         6,930.5       26,631.6
524,288        10,566.6       28,645.9
1,048,576       9,229.6       32,114.8
2,097,152       8,908.8       32,776.5
4,194,304       8,818.7       33,884.9
It seems that for message sizes up to 16,384 bytes, Managed Memory performs better than Device Memory,
while for message sizes of 32,768 bytes and larger, Device Memory achieves higher bandwidth.
Is there a way to tune Managed Memory so that it matches Device Memory performance for message sizes
of 32,768 bytes and larger? For convenience, we would like to use CUDA Managed Memory.
Thank you for your help,
Yussuf
_______________________________________________
mvapich-discuss mailing list
mvapich-discuss at cse.ohio-state.edu
http://mailman.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss