[mvapich-discuss] PSM netmod does not release vbufs in large RMA data transfers

Min Si msi at anl.gov
Fri Sep 2 13:57:38 EDT 2016


It would also be good to check for internal resource leaks by using the 
--enable-g configure option before future releases. It can report most 
MPI-internal resource leaks, such as unreleased Request objects, at 
MPI_Finalize.
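
For example (a toy illustration; the exact report depends on the MPICH 
version and the --enable-g level chosen, e.g. --enable-g=all), a debug 
build can flag a program that never completes one of its requests:

    /* leak.c - the MPI_Isend request below is never waited on or
     * freed, so it is still live at MPI_Finalize; an --enable-g
     * build of the library can report it there. */
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, sbuf = 42, rbuf = 0;
        MPI_Request req;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            /* send to self; the request is intentionally leaked */
            MPI_Isend(&sbuf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &req);
            MPI_Recv(&rbuf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            /* missing: MPI_Wait(&req, ...) or MPI_Request_free(&req) */
        }

        MPI_Finalize();   /* unreleased Request reported here */
        return 0;
    }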

Min

On 9/2/16 11:52 AM, Mingzhe Li wrote:
> Hi Min,
>
> Thank you for the detailed information. We will come up with a proper 
> fix and make it available with the next release.
>
> Thanks,
> Mingzhe
>
> On Fri, Sep 2, 2016 at 12:36 PM, Min Si <msi at anl.gov> wrote:
>
>     Hi Mingzhe,
>
>     After discussing with the MPICH group, we think the patch for the
>     Put/Acc part is hacky.
>
>     - It should use *MPID_Request_release* instead of
>     *MPIU_Object_release_ref*, because ref_count is supposed to be a
>     "private" variable of the request allocation code and should not
>     be updated directly anywhere else. In fact, the latest MPICH has
>     removed MPIU_Object_release_ref entirely. For the Put/Acc rndv
>     message, the released ref_count can be regarded as the reference
>     held by psm_1sided_putpkt/psm_1sided_accumpkt, so we can *release*
>     it after the work is done in those functions.
>
>     - The above is only the minimal code change that makes it work,
>     but it is still not the right approach: for rndv Put/Acc, the
>     request is always allocated with a ref_count of 2 that is
>     immediately reduced to 1, which adds unnecessary instructions. The
>     right fix would be to create the request with a ref_count of 1 in
>     the first place, but that requires changing psm_create_req or
>     adding another request allocation function (see the sketch below).
>
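>     To illustrate the difference outside the MVAPICH code base, here
>     is a minimal, self-contained sketch of the two allocation
>     strategies (toy names, not the real MPICH/PSM API):
>
>         /* toy reference-counted request */
>         #include <stdlib.h>
>
>         typedef struct { int ref_count; } toy_req;
>
>         static toy_req *req_alloc(int initial_refs)
>         {
>             toy_req *r = malloc(sizeof *r);
>             r->ref_count = initial_refs;
>             return r;
>         }
>
>         /* the only place ref_count is touched (the analogue of
>          * MPID_Request_release) */
>         static void req_release(toy_req *r)
>         {
>             if (--r->ref_count == 0)
>                 free(r);
>         }
>
>         int main(void)
>         {
>             /* minimal fix: allocate with 2 refs, drop one at once */
>             toy_req *a = req_alloc(2);
>             req_release(a);   /* wasted decrement         */
>             req_release(a);   /* actual completion; freed */
>
>             /* right fix: allocate with exactly the refs needed */
>             toy_req *b = req_alloc(1);
>             req_release(b);   /* completion; freed        */
>             return 0;
>         }
>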
>     I cannot find the time to write the proper patch for it, but could
>     you please fix this the right way before adding it to a future
>     release of MVAPICH?
>
>     Thanks,
>     Min
>
>
>     On 9/1/16 6:56 PM, Mingzhe Li wrote:
>>     Hi Min,
>>
>>     Thank you for your detailed analysis and the patch. We will take
>>     the patch and it will be available with the next release.
>>
>>     Thanks,
>>     Mingzhe
>>
>>     On Thu, Sep 1, 2016 at 6:36 PM, Min Si <msi at anl.gov> wrote:
>>
>>         Hi,
>>
>>         I have observed heavy memory consumption in the PSM netmod
>>         when doing RMA communication with large messages.
>>
>>         After looking into the PSM netmod code, I found the cause:
>>         the first request in the *rndv* protocol is never actually
>>         released after the data transfer completes. For example, in
>>         a rndv PUT, the first request is for the packet header and
>>         the second is for the rndv data (see function
>>         psm_1sided_putpkt). The second request can be released in
>>         rma_list_gc in ch3u_rma_sync.c, but the first one is never
>>         exposed to CH3 and cannot be released in
>>         psm_process_completion, because its ref_count is not 0.
>>
>>         Consequently, the vbuf allocated for the first request can
>>         never be freed. Once the available vbufs in the pool are
>>         used up, new vbufs are allocated (64 * 16KB at a time). That
>>         is why I observed very heavy memory usage in the
>>         osu_put_bw/osu_get_bw benchmarks: every message size
>>         executes 64 times, so each new message size allocates
>>         another 64*loop vbufs whenever it goes through the rndv
>>         protocol (>16KB).
>>
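>>         For a rough sense of the scale (the 16KB vbuf size and the
>>         64 transfers per iteration are from above; the loop count
>>         here is purely illustrative):
>>
>>             #include <stdio.h>
>>
>>             int main(void)
>>             {
>>                 const long vbuf_size = 16 * 1024; /* bytes per vbuf */
>>                 const long window    = 64;  /* transfers per iter   */
>>                 const long loop      = 100; /* illustrative value   */
>>
>>                 /* each rndv transfer strands one header-request
>>                  * vbuf, so one message size strands window * loop
>>                  * vbufs that are never returned to the pool */
>>                 long stranded = window * loop;
>>                 printf("stranded: %ld vbufs = %ld MB\n", stranded,
>>                        stranded * vbuf_size / (1024 * 1024));
>>                 return 0;
>>             }
>>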
>>         I have attached a patch based on MVAPICH2-2.2rc1 to fix this
>>         issue in PUT/ACC/GET/GET_ACC.
>>         - For Put/Acc, I think the ref_count should be decreased to
>>         1 in the rndv branch, since only the PSM layer checks it;
>>         the request can then be released in psm_process_completion.
>>         - For Get/Get_Acc, I think the first request needs to be
>>         completed in psm_getresp_rndv_complete (ref_count--, and
>>         completion counter = 0), so that it can be correctly
>>         released in the CH3 function rma_list_gc (see the sketch
>>         below).
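>>
>>         A self-contained sketch of the intended Get/Get_Acc flow
>>         (toy names mirroring the description above, not the actual
>>         MVAPICH code):
>>
>>             #include <stdlib.h>
>>
>>             typedef struct {
>>                 int cc;        /* completion counter; 0 = done */
>>                 int ref_count; /* references; 0 = can be freed */
>>             } toy_req;
>>
>>             /* what psm_getresp_rndv_complete should do, per the
>>              * description above: mark the request complete and
>>              * drop the PSM layer's reference */
>>             static void complete_and_unref(toy_req *r)
>>             {
>>                 r->cc = 0;
>>                 r->ref_count--;
>>             }
>>
>>             /* the CH3 sweep (rma_list_gc in the real code) can
>>              * then free a complete request once the last
>>              * reference is dropped */
>>             static void gc_release(toy_req *r)
>>             {
>>                 if (r->cc == 0 && --r->ref_count == 0)
>>                     free(r);
>>             }
>>
>>             int main(void)
>>             {
>>                 toy_req *hdr = malloc(sizeof *hdr);
>>                 hdr->cc = 1;        /* rndv data still in flight */
>>                 hdr->ref_count = 2; /* PSM layer + CH3 RMA list  */
>>
>>                 complete_and_unref(hdr); /* rndv data arrived */
>>                 gc_release(hdr);         /* CH3 frees it      */
>>                 return 0;
>>             }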
>>
>>         Thanks,
>>         Min
>>
