[mvapich-discuss] nv_peer_mem and gdrcopy
Le, Viet Duc
vdle at moasys.com
Wed Aug 12 04:40:33 EDT 2020
Hi,
I would appreciate some clarification regarding the installation and
usage of mvapich2-gdr.
1. NVIDIA Peer Mem.
http://mvapich.cse.ohio-state.edu/userguide/gdr/#_system_requirements
NVIDIA Peer Memory is listed as a requirement. We ran tests without
nv_peer_mem.ko loaded, and mvapich2-gdr emitted neither a warning nor an
error message.
The library itself is not described in detail by NVIDIA.
- Does mvapich2-gdr quietly fall back to generic mvapich2 in the
absence of nv_peer_mem?
- Are there MV2_* variables besides MV2_SHOW_ENV_INFO that give more
diagnostic messages?
- Could you update the link to the list of devices supporting
GPUDirect? It simply redirects to the Mellanox homepage.
2. GDRCopy Interoperability:
http://mvapich.cse.ohio-state.edu/userguide/gdr/#_strongly_recommended_system_features
GDRCopy works as a standalone library without nv_peer_mem.ko. For
instance, its bundled tests (sanity, copybw, copylat) produced the
expected output on our system.
The following are printed via MV2_SHOW_ENV_INFO:
MV2_USE_GDRCOPY : 2
MV2_GDRCOPY_LIMIT : 8192
MV2_GDRCOPY_NAIVE_LIMIT: 8192
The above output does not indicate whether GDRCopy was actually used.
- Is there a way to confirm whether GDRCopy is used by mvapich2? A
warning message or, preferably, outright termination when it cannot be
used would be helpful.
3. Loopback feature:
The following are printed via MV2_SHOW_ENV_INFO:
MV2_USE_GPUDIRECT_LOOPBACK : 1
MV2_USE_GPUDIRECT_LOOPBACK_LIMIT : 8192
MV2_USE_GPUDIRECT_LOOPBACK_NAIVE_LIMIT : 8192
As I understand it, the following performance ordering is implied:
mvapich2 < mvapich2-gdr + loopback < mvapich2-gdr + loopback + gdrcopy
Our impression is that loopback requires nv_peer_mem.ko to function.
- Is there a way to distinguish whether loopback or gdrcopy is used?
- Could you share some references on the loopback feature?
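Absent a diagnostic flag, one indirect way to tell the two paths apart would be to disable them one at a time and compare timings. The following is only a minimal sketch in Python. It assumes that setting MV2_USE_GDRCOPY or MV2_USE_GPUDIRECT_LOOPBACK to 0 disables the respective feature (an assumption; the values printed above are 2 and 1), and the benchmark command is left to the caller. The names feature_matrix and run_with are hypothetical helpers, not part of mvapich2.

```python
import os
import subprocess
from itertools import combinations

# Variable names are taken from the MV2_SHOW_ENV_INFO output quoted above.
# Assumption: the value "0" disables the corresponding feature.
FEATURES = {
    "gdrcopy": "MV2_USE_GDRCOPY",
    "loopback": "MV2_USE_GPUDIRECT_LOOPBACK",
}


def feature_matrix():
    """Return (label, env overrides) pairs, one per combination of disabled features."""
    configs = []
    names = sorted(FEATURES)
    for k in range(len(names) + 1):
        for disabled in combinations(names, k):
            label = ",".join(f"no-{n}" for n in disabled) or "default"
            configs.append((label, {FEATURES[n]: "0" for n in disabled}))
    return configs


def run_with(overrides, command):
    """Run a caller-supplied benchmark command with the overrides added to the environment."""
    env = dict(os.environ, **overrides)
    return subprocess.run(command, env=env, capture_output=True, text=True)


if __name__ == "__main__":
    # Only list the configurations; launching the benchmark is left to the caller.
    for label, overrides in feature_matrix():
        print(label, overrides)
```

If timings differ between the "default" and "no-gdrcopy" runs for small messages, that would suggest GDRCopy is in use, though it would not prove it.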
The bottom line is that the module works, but which features are
actually in use remains unclear to us. As a result, we cannot establish
a baseline for benchmarking purposes.
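As a starting point for such a baseline, here is a minimal sketch in Python that records whether the relevant kernel modules are loaded before a run. It assumes a Linux host with a readable /proc/modules and that the GDRCopy kernel driver is named gdrdrv. It reports module presence only; it does not show which path mvapich2-gdr takes at runtime.

```python
from pathlib import Path

# nv_peer_mem: GPUDirect RDMA peer-memory module discussed above.
# gdrdrv: assumed name of the GDRCopy kernel driver.
MODULES = ("nv_peer_mem", "gdrdrv")


def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def module_status(proc_modules_text, wanted=MODULES):
    """Map each wanted module name to True if it is loaded, else False."""
    present = loaded_modules(proc_modules_text)
    return {name: name in present for name in wanted}


if __name__ == "__main__":
    path = Path("/proc/modules")
    text = path.read_text() if path.exists() else ""
    for name, is_loaded in module_status(text).items():
        print(f"{name}: {'loaded' if is_loaded else 'not loaded'}")
```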
Regards.
Viet-Duc