[Mvapich-discuss] Excessive Memory Usage By Vbufs

Derek Gaston friedmud at gmail.com
Thu Jun 17 10:42:15 EDT 2021


I got an odd message that seemed like my email bounced.  So, just sending
this again to make sure it goes through.

---------- Forwarded message ---------
From: Derek Gaston <friedmud at gmail.com>
Date: Wed, Jun 16, 2021 at 3:33 PM
Subject: Excessive Memory Usage By Vbufs
To: <mvapich-discuss at cse.ohio-state.edu>


Hello all,

We're trying to track down an issue that we can see with MVAPICH 2.3.5, but
not with OpenMPI.

What's happening is that sending _many_ small messages with isend or issend
causes the memory allocated by allocate_vbuf_pool to grow incredibly large,
and that memory is not released until MPI_Finalize.  My suspicion is that
the messages are small enough that eager sends are creating temporary
buffers that are not being freed once the send completes (it seems like
those buffers should be freed by the corresponding MPI_Wait).

To test this out I wrote a tiny C++ program that you can find here:
https://gist.github.com/friedmud/9533d5997f06414c25f8c5c57a1eaf37
(it needs a C++11-compliant compiler to build).

The configuration parameters are all at the top - and what it does is send
an array of doubles to every other process on COMM_WORLD.  Nothing
earth-shattering.
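
In case the gist doesn't come through, here's a rough sketch of the kind of
test it runs (illustrative only - it's not the actual gist code, and the
message size / round count here are placeholders, not the configuration used
for the numbers below): every rank isends an array of doubles to every other
rank, posts matching receives, and waits on everything.

// vbuf_test_sketch.cpp -- illustrative sketch, not the real test program.
// Build (example): mpicxx -std=c++11 vbuf_test_sketch.cpp -o vbuf_test_sketch
#include <mpi.h>
#include <vector>

int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);

  int rank = 0, size = 0;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  const int num_doubles = 1000;  // message size in doubles (placeholder)
  const int num_rounds  = 100;   // number of repetitions (placeholder)

  std::vector<double> send_buf(num_doubles, static_cast<double>(rank));
  std::vector<std::vector<double>> recv_bufs(size,
                                             std::vector<double>(num_doubles));

  for (int round = 0; round < num_rounds; ++round) {
    std::vector<MPI_Request> requests;
    requests.reserve(2 * (size - 1));

    // Post a receive from every other rank.
    for (int src = 0; src < size; ++src) {
      if (src == rank) continue;
      MPI_Request req;
      MPI_Irecv(recv_bufs[src].data(), num_doubles, MPI_DOUBLE, src, 0,
                MPI_COMM_WORLD, &req);
      requests.push_back(req);
    }

    // Send the same small array of doubles to every other rank.
    for (int dest = 0; dest < size; ++dest) {
      if (dest == rank) continue;
      MPI_Request req;
      MPI_Isend(send_buf.data(), num_doubles, MPI_DOUBLE, dest, 0,
                MPI_COMM_WORLD, &req);
      requests.push_back(req);
    }

    // Everything completes here, so in theory any internal eager-send
    // buffers could be released (or reused) after this point.
    MPI_Waitall(static_cast<int>(requests.size()), requests.data(),
                MPI_STATUSES_IGNORE);
  }

  MPI_Finalize();  // this is where the vbuf memory actually goes away
  return 0;
}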

You can see the results below from running on 576 procs (using Gperftools
instrumentation to check the memory usage of one process).  For message
sizes of less than 2000 doubles (under 16 KB), allocate_vbuf_pool is using a
large amount of memory.  Once the message size goes over 2000 doubles, the
memory drops back down (my theory: because the buffer is then used directly
instead of being copied into a temporary buffer for eager sending).

Note that the memory is reported just before and just after MPI_Finalize.
Finalize seems to release all of the memory... so it's not being "lost"...
it's just not being freed once the send is done (and maybe not being reused
well enough?).
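
For reference, here is a simplified sketch of how memory can be read at
those points with gperftools' MallocExtension (illustrative only - not
necessarily the exact instrumentation used for the numbers below).  The
per-function attribution in the "Top function" column comes from gperftools
heap profiling, which isn't shown here.

// measure_sketch.cpp -- simplified sketch of the memory reporting.
// Build (example): mpicxx -std=c++11 measure_sketch.cpp -ltcmalloc
#include <mpi.h>
#include <cstdio>
#include <gperftools/malloc_extension.h>

static double CurrentAllocatedMB() {
  size_t bytes = 0;
  // Standard tcmalloc property: bytes currently in use by the application.
  MallocExtension::instance()->GetNumericProperty(
      "generic.current_allocated_bytes", &bytes);
  return static_cast<double>(bytes) / (1024.0 * 1024.0);
}

int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);
  int rank = 0;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  double initial = CurrentAllocatedMB();    // "Initial" column

  // ... the isend/irecv/waitall loop from the earlier sketch goes here ...

  double before = CurrentAllocatedMB();     // "Before MPI_Finalize" column
  MPI_Finalize();
  double final_mb = CurrentAllocatedMB();   // "Final" column

  if (rank == 0)
    std::printf("initial %.1f MB, before finalize %.1f MB, final %.1f MB\n",
                initial, before, final_mb);
  return 0;
}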

Any suggestions here?

Thanks!

Derek



MPI type   Num procs (sent-received)   Message size (doubles)   Initial (MB)   Before MPI_Finalize (MB)   Final (MB)   Top function
MVAPICH    576 (57500)                 100                      0              48.4                       0            allocate_vbuf_pool
MVAPICH    576 (57500)                 1000                     0              534.1                      0            allocate_vbuf_pool
MVAPICH    576 (57500)                 10000                    0              68                         0            MPIU_Handle_indirect_init
MVAPICH    576 (57500)                 100000                   0              68.1                       0            MPIU_Handle_indirect_init

