[Mvapich-discuss] Handling MVAPICH 3.0 full subscription warning
You, Zhi-Qiang
zyou at osc.edu
Thu Oct 31 12:11:04 EDT 2024
Hi Nat,
Thank you for the suggestion. I have a few questions:
1. Does this message indicate that oversubscription is occurring, or is it simply a warning that appears every time a full-node job is run? In one user’s case, I did not observe any oversubscription, although the warning was present.
2. UCX is also the default for OpenMPI, but I did not see a similar warning when running a full-node job with OpenMPI. Why does this only happen with MVAPICH?
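One way to look into the first question is to collect the CPU affinity of every rank on a node and check whether any core appears in more than one rank's set. The sketch below is only an illustration of that check, not something MVAPICH provides: `shared_cores` and `current_affinity` are hypothetical helper names, `os.sched_getaffinity` is Linux-only, and the check covers rank-to-core binding only. It says nothing about the extra UCX thread that the warning mentions, which would share cores with its own rank.

```python
import os
from collections import Counter
from typing import Iterable, Mapping, Set


def current_affinity() -> Set[int]:
    """Return the cores the calling process may run on (Linux only)."""
    return set(os.sched_getaffinity(0))


def shared_cores(affinity_by_rank: Mapping[int, Iterable[int]]) -> Set[int]:
    """Return the cores that appear in the affinity set of more than one rank.

    affinity_by_rank maps a rank number to the cores that rank is bound to,
    for ranks running on the same node.
    """
    counts = Counter()
    for cores in affinity_by_rank.values():
        # Deduplicate per rank so a core listed twice by one rank counts once.
        counts.update(set(cores))
    return {core for core, n in counts.items() if n > 1}


if __name__ == "__main__":
    # Hypothetical bindings for two ranks on one node.
    print(shared_cores({0: {0, 1}, 1: {1, 2}}))
```

In a real job, each rank would report `current_affinity()` along with its hostname, and the per-node results would be passed to `shared_cores`. An empty result means no two ranks are bound to the same core.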
Thank you,
ZQ
From: Shineman, Nat <shineman.5 at osu.edu>
Date: Wednesday, October 16, 2024 at 2:16 PM
To: mvapich-discuss at lists.osu.edu <mvapich-discuss at lists.osu.edu>, You, Zhi-Qiang <zyou at osc.edu>
Subject: Re: Handling MVAPICH 3.0 full subscription warning
Hi ZQ,
You are probably seeing degraded performance because you are still running the application at full subscription while asking MVAPICH to reserve 2 cores per process. The warning should more accurately say to cap your runs at half subscription and set the listed environment variable, which would prevent you from oversubscribing cores.
However, if you are seeing satisfactory performance with oversubscribed cores at full subscription, please feel free to ignore the warning.
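The half-subscription arithmetic can be sketched as follows. This is a minimal illustration that assumes `MVP_THREADS_PER_PROCESS=2` reserves two cores for each process, as described above. The function names and the 48-core node are hypothetical and are not part of MVAPICH.

```python
def max_ranks_per_node(cores_per_node: int, threads_per_process: int = 2) -> int:
    """Largest number of ranks per node that leaves each rank its reserved cores."""
    if cores_per_node < 1 or threads_per_process < 1:
        raise ValueError("cores_per_node and threads_per_process must be positive")
    return cores_per_node // threads_per_process


def is_oversubscribed(ranks_per_node: int, cores_per_node: int,
                      threads_per_process: int = 2) -> bool:
    """True when the reserved cores for all ranks exceed the cores on the node."""
    return ranks_per_node * threads_per_process > cores_per_node


if __name__ == "__main__":
    cores = 48  # hypothetical node size
    print(max_ranks_per_node(cores))        # 24 ranks at half subscription
    print(is_oversubscribed(48, cores))     # True: full subscription, 2 cores each
    print(is_oversubscribed(24, cores))     # False: half subscription, 2 cores each
```

Under these assumptions, a full-node launch with the variable set asks for twice as many cores as the node has, which matches the degraded performance reported in the original message.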
Thanks,
Nat
________________________________
From: Mvapich-discuss <mvapich-discuss-bounces at lists.osu.edu> on behalf of You, Zhi-Qiang via Mvapich-discuss <mvapich-discuss at lists.osu.edu>
Sent: Wednesday, October 16, 2024 11:50
To: mvapich-discuss at lists.osu.edu <mvapich-discuss at lists.osu.edu>
Subject: [Mvapich-discuss] Handling MVAPICH 3.0 full subscription warning
Hi,
We encountered the following warning message while running a full-node MPI job with MVAPICH 3.0:
[][mvp_generate_implicit_cpu_mapping] WARNING: You appear to be running at full subscription for this job. UCX spawns an additional thread for each process which may result in oversubscribed cores and poor performance. Please consider reserving at least 2 cores per node for the additional threads, enabling SMT, or setting MVP_THREADS_PER_PROCESS=2 to ensure that sufficient resources are available.
The suggestion to set MVP_THREADS_PER_PROCESS=2 not only fails to improve performance but actually degrades it. Can this warning message be safely ignored, or is there any action I need to take to address it?
Best,
ZQ