[mvapich-discuss] Are mallopt calls needed here

Mark Beall mbeall at simmetrix.com
Thu Jan 3 09:32:19 EST 2019


Hi Hari,

We will discuss that here, but I don’t think we have a good way to reproduce the problem (except in our application). Do you have any suggestions on how we might at least gather some information that would be helpful for you?

We ran into the exact same problem with older versions of Open MPI, which also used ptmalloc; turning that off also resolved the problem (although I don’t know whether the flag that turns it off also turns off Open MPI’s equivalent of your registration cache). As an aside, we ended up using MVAPICH because Open MPI would then leave all of the processes stuck in a wait for hours before it finally let them continue running.

Based on the stack traces of the process where it was stuck (which I can’t find right now, but I will send later), it was getting stuck in actual ptmalloc code, so our best guess is that ptmalloc doesn’t like the memory usage patterns of our application (which are probably quite different from those of most MPI applications).

Unfortunately, building with --disable-registration-cache also caused the application to fail reliably when we ran more than one process per node. This didn’t happen without the flag (nor with Open MPI). The error was always along the lines of:
[cc067:mpi_rank_62][handle_cqe] Send desc error in msg to 53, wc_opcode=0
[cc067:mpi_rank_62][handle_cqe] Msg from 53: wc.status=12, wc.wr_id=0x14d208b0, wc.opcode=0, vbuf->phead->type=0 = MPIDI_CH3_PKT_EAGER_SEND
[cc067:mpi_rank_62][handle_cqe] src/mpid/ch3/channels/mrail/src/gen2/ibv_channel_manager.c:548: [] Got completion with error 12, vendor code=0x81, dest rank=53
: No such file or directory (2)
[cli_62]: aborting job:
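
For reference, the wc.status=12 in the log above is a value of enum ibv_wc_status from libibverbs, where 12 is IBV_WC_RETRY_EXC_ERR (transport retry counter exceeded, i.e. the sender never got an ACK or NAK from the remote side). The following is only a minimal sketch for decoding that number without linking against libibverbs: the table mirrors the first 14 enumerators of <infiniband/verbs.h>, and the helper names wc_status_name() and wc_status_is() are hypothetical, not part of MVAPICH2 or libibverbs.

```c
#include <stddef.h>
#include <string.h>

/* Mirror of the first 14 values of enum ibv_wc_status, in declaration order.
 * This is a hand-copied table for illustration, not the library's own. */
static const char *const wc_status_names[] = {
    "IBV_WC_SUCCESS",          /*  0 */
    "IBV_WC_LOC_LEN_ERR",      /*  1 */
    "IBV_WC_LOC_QP_OP_ERR",    /*  2 */
    "IBV_WC_LOC_EEC_OP_ERR",   /*  3 */
    "IBV_WC_LOC_PROT_ERR",     /*  4 */
    "IBV_WC_WR_FLUSH_ERR",     /*  5 */
    "IBV_WC_MW_BIND_ERR",      /*  6 */
    "IBV_WC_BAD_RESP_ERR",     /*  7 */
    "IBV_WC_LOC_ACCESS_ERR",   /*  8 */
    "IBV_WC_REM_INV_REQ_ERR",  /*  9 */
    "IBV_WC_REM_ACCESS_ERR",   /* 10 */
    "IBV_WC_REM_OP_ERR",       /* 11 */
    "IBV_WC_RETRY_EXC_ERR",    /* 12 */
    "IBV_WC_RNR_RETRY_EXC_ERR" /* 13 */
};

/* Hypothetical helper: map a numeric wc.status to its enumerator name.
 * Values outside the mirrored range yield "unknown". */
const char *wc_status_name(int status)
{
    size_t n = sizeof wc_status_names / sizeof wc_status_names[0];
    if (status < 0 || (size_t)status >= n)
        return "unknown";
    return wc_status_names[status];
}

/* Hypothetical helper: nonzero if the status decodes to the given name. */
int wc_status_is(int status, const char *name)
{
    return strcmp(wc_status_name(status), name) == 0;
}
```

When libibverbs is available, its own ibv_wc_status_str() is the better choice for producing a readable message; the table above is only meant for reading log lines like the ones quoted here.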

I will see if we have any more info we can give you. 

Thanks,

mark


> On Jan 3, 2019, at 8:25 AM, Subramoni, Hari <subramoni.1 at osu.edu> wrote:
> 
> Hi, Mark.
> 
> No, the mallopt calls are not strictly necessary here.
> 
> However, the issue with registration cache is quite surprising. We have not had any of our users report such a behavior before. Would it be possible to have a reproducer so that we can attempt to reproduce it locally? We would like to identify and fix the issue.
> 
> Thx,
> Hari.
> 
> -----Original Message-----
> From: mvapich-discuss <mvapich-discuss-bounces at cse.ohio-state.edu> On Behalf Of Mark Beall
> Sent: Tuesday, January 1, 2019 10:34 AM
> To: mvapich-discuss at cse.ohio-state.edu <mvapich-discuss at mailman.cse.ohio-state.edu>
> Subject: [mvapich-discuss] Are mallopt calls needed here
> 
> Hi,
> 
> In ch3_init.c (and similarly in ib_init.c) there is the following code:
> 
> #if !defined(DISABLE_PTMALLOC)
>        if (mvapich2_minit()) {
>            if (pg_rank == 0) {
>                MPL_error_printf("WARNING: Error in initializing MVAPICH2 ptmalloc library."
>                "Continuing without InfiniBand registration cache support.\n");
>            }
>            mv2_MPIDI_CH3I_RDMA_Process.has_lazy_mem_unregister = 0;
>        }
> #else /* !defined(DISABLE_PTMALLOC) */
>        mallopt(M_TRIM_THRESHOLD, -1);
>        mallopt(M_MMAP_MAX, 0);
>        mv2_MPIDI_CH3I_RDMA_Process.has_lazy_mem_unregister = 0;
> #endif /* !defined(DISABLE_PTMALLOC) */
> 
> 
> The question is whether it is ok to not have the mallopt() calls there. 
> 
> Presumably everything works without them since mallopt() is not called if mvapich2_minit() returns an error.
> 
> We must build with --disable-registration-cache to disable ptmalloc, since ptmalloc causes one part of our code to take 10 hours instead of 30 seconds (I’m not exaggerating).
> 
> We would also prefer that the mallopt() calls not be there, since they cause undesirable behavior for our application.
> 
> I’m guessing that at one point DISABLE_PTMALLOC didn’t also imply disabling the registration cache (and so the mallopt() calls were needed), but now it does (since it sets has_lazy_mem_unregister = 0), and thus they really aren’t needed anymore.
> 
> Thanks,
> 
> mark
> 
> 
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mailman.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
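
For readers following the thread, here is a minimal sketch of what the two mallopt() calls in the quoted #else branch do, assuming glibc on Linux. Setting M_TRIM_THRESHOLD to -1 effectively stops free() from trimming the top of the heap back to the OS, and setting M_MMAP_MAX to 0 stops malloc() from servicing large requests with mmap(), so freed memory tends to stay mapped in the process. The function names below are hypothetical and are not taken from MVAPICH2.

```c
#include <malloc.h>  /* mallopt, M_TRIM_THRESHOLD, M_MMAP_MAX (glibc-specific) */
#include <stdlib.h>  /* malloc, free */

/* Hypothetical wrapper around the two calls from the #else branch.
 * mallopt() returns 1 on success and 0 on error, so this returns 1 only
 * if both settings were accepted. Assumes glibc. */
int apply_mallopt_settings(void)
{
    int ok_trim = mallopt(M_TRIM_THRESHOLD, -1); /* never trim the heap top */
    int ok_mmap = mallopt(M_MMAP_MAX, 0);        /* no mmap-backed chunks   */
    return ok_trim && ok_mmap;
}

/* Hypothetical probe: allocate nbytes, touch both ends, and free the block.
 * Returns 1 if the allocation worked. With the settings above, a large
 * request is carved from the heap rather than from a separate mmap region. */
int large_alloc_roundtrip(size_t nbytes)
{
    unsigned char *p;
    int ok;

    if (nbytes == 0)
        return 0;
    p = malloc(nbytes);
    if (p == NULL)
        return 0;
    p[0] = 1;
    p[nbytes - 1] = 2;
    ok = (p[0] == 1 && p[nbytes - 1] == 2);
    free(p);
    return ok;
}
```

Because these settings keep freed memory inside the process, they can raise the resident footprint of an application that allocates and frees large buffers, which is one plausible reason an application might prefer the calls to be absent when the registration cache is disabled anyway.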
