[mvapich-discuss] [../src/mpid/ch3/channels/mrail/src/gen2/vbuf.c 397] Cannot register vbuf region
Jeff Hammond
jeff.science at gmail.com
Tue Dec 17 08:49:28 EST 2013
Now all my jobs die with this error:
[0->108] send desc error, wc_opcode=0
[0->108] wc.status=12, wc.wr_id=0x67084770, wc.opcode=0,
vbuf->phead->type=17 = MPIDI_CH3_PKT_GET_RNDV
[vs86:mpi_rank_330][MPIDI_CH3I_MRAILI_Cq_poll]
../src/mpid/ch3/channels/mrail/src/gen2/ibv_channel_manager.c:621: []
Got completion with error 12, vendor code=0x81, dest rank=108
: No such file or directory (2)
Given that all the errors are annotated with [0->N] for some N, I have
a pretty good idea what is causing this in the application (everyone
blasts rank 0 with MPI_Fetch_and_op). Any suggestions on how to fix
this? Is there a way to throttle the RMW storm? It appears to be
doing a rendezvous already, but if there is a way to turn on additional
send-side flow control, that might help.
Jeff
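One userspace throttle for this kind of RMW storm is to cap the number of outstanding atomics and force local completion every N operations. A minimal sketch of the batching pattern, with hypothetical `issue`/`flush` callbacks standing in for MPI_Fetch_and_op and MPI_Win_flush_local (the MPI calls themselves are not exercised here):

```python
# Sketch of send-side throttling: issue at most `batch` operations
# before forcing local completion. `issue` and `flush` are hypothetical
# stand-ins for MPI_Fetch_and_op on the target and MPI_Win_flush_local
# on its window; flushing drains in-flight descriptors (and vbufs)
# before more work is posted.
def batched_rmw(nops, batch, issue, flush):
    inflight = 0
    for i in range(nops):
        issue(i)
        inflight += 1
        if inflight == batch:
            flush()          # wait for local completion of the batch
            inflight = 0
    if inflight:
        flush()              # drain the tail

# Example: 10 ops in batches of 4 -> flushes after ops 4, 8, and 10.
flushes = []
batched_rmw(10, 4, issue=lambda i: None, flush=lambda: flushes.append(1))
```

The batch size trades latency for pressure on the target: smaller batches keep fewer descriptors in flight at rank 0 at the cost of more waiting on the origin side.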
On Mon, Dec 16, 2013 at 12:19 AM, Deva <devendar.bureddy at gmail.com> wrote:
> Can you try MV2_USE_LAZY_MEM_UNREGISTER=0? This should reduce memory
> registration overhead to some extent.
>
> -Devendar
>
>
> On Sun, Dec 15, 2013 at 6:52 PM, Jeff Hammond <jeff.science at gmail.com>
> wrote:
>>
>> So there's nothing I can do in userspace? I've requested the
>> sysadmins change the IB settings, but since the machine I'm using
>> shares its IB network with the GPFS servers for Mira
>> [https://www.alcf.anl.gov/mira], they might balk at it.
>>
>> Jeff
>>
>> On Sun, Dec 15, 2013 at 3:42 PM, Deva <devendar.bureddy at gmail.com> wrote:
>> > Jeff,
>> >
>> > This could be related to OFED memory registration limits (log_num_mtt,
>> > log_mtts_per_seg). A similar issue was discussed here:
>> >
>> > http://mailman.cse.ohio-state.edu/pipermail/mvapich-discuss/2013-February/004261.html.
>> > Can you verify this solution?
>> >
>> > A few details from the user guide on these OFED parameters:
>> >
>> > http://mvapich.cse.ohio-state.edu/support/user_guide_mvapich2-2.0b.html#x1-1130009.1.1
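For reference, the ceiling those two parameters impose works out to (2^log_num_mtt) * (2^log_mtts_per_seg) * page_size bytes of registrable memory per HCA. A quick sanity check of that arithmetic, with illustrative parameter values (not read from any actual system):

```python
# Maximum registrable memory under the Mellanox MTT parameters:
#   max_reg = (2**log_num_mtt) * (2**log_mtts_per_seg) * page_size
# The parameter values below are illustrative examples only.
def max_registrable_bytes(log_num_mtt, log_mtts_per_seg, page_size=4096):
    return (1 << log_num_mtt) * (1 << log_mtts_per_seg) * page_size

gib = max_registrable_bytes(20, 3) / 2**30
print(gib)  # 32.0 GiB with log_num_mtt=20, log_mtts_per_seg=3, 4 KiB pages
```

If a node's total registration demand (application windows plus the vbuf pools) exceeds this ceiling, ibv_reg_mr starts failing exactly as in the vbuf.c error above, so raising log_num_mtt or log_mtts_per_seg is the usual fix.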
>> >
>> >
>> > -Devendar
>> >
>> > On Sun, Dec 15, 2013 at 9:39 AM, Jeff Hammond <jeff.science at gmail.com>
>> > wrote:
>> >>
>> >> I am running NWChem using ARMCI over MPI-3 RMA
>> >> [http://git.mpich.org/armci-mpi.git/shortlog/refs/heads/mpi3rma]. Two
>> >> attempts to run a relatively large job failed as follows:
>> >>
>> >> [../src/mpid/ch3/channels/mrail/src/gen2/vbuf.c 397] Cannot register vbuf region
>> >> [vs9:mpi_rank_350][get_vbuf]
>> >> ../src/mpid/ch3/channels/mrail/src/gen2/vbuf.c:798: VBUF reagion
>> >> allocation failed. Pool size 640
>> >> : Cannot allocate memory (12)
>> >>
>> >> [../src/mpid/ch3/channels/mrail/src/gen2/vbuf.c 397] Cannot register vbuf region
>> >> [vs28:mpi_rank_8][get_vbuf]
>> >> ../src/mpid/ch3/channels/mrail/src/gen2/vbuf.c:798: VBUF reagion
>> >> allocation failed. Pool size 4736
>> >> : Cannot allocate memory (12)
>> >>
>> >> NWChem is attempting to allocate a relatively large amount of memory
>> >> using MPI_Win_allocate, so it doesn't surprise me that this happens.
>> >> However, it is not entirely clear if the problem is that generic
>> >> memory allocation has failed, i.e. malloc (or equivalent) returned
>> >> NULL, or if something related to IB has been exhausted, e.g. ibv_reg_mr
>> >> has failed.
>> >>
>> >> If this is not just a simple out-of-memory error, can you suggest
>> >> environment variables or source changes (in ARMCI-MPI, not MVAPICH2)
>> >> that might alleviate these problems? I don't know that the installed
>> >> Linux has large page support and I can't readily request a new OS
>> >> image, but I can switch machines if this is likely to have a positive
>> >> impact.
>> >>
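One thing that can be checked from userspace is the locked-memory rlimit each rank inherits, since ibv_reg_mr pins pages and fails once RLIMIT_MEMLOCK is exhausted. A small check, assuming a Unix host (the interpretation in the comments is an assumption, not something established in this thread):

```python
import resource

# Memory registration pins pages, so the locked-memory rlimit caps how
# much ibv_reg_mr can register for this process. IB clusters normally
# want "unlimited"; a small finite value would explain ENOMEM from
# registration long before malloc itself fails.
soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)

def fmt(v):
    return "unlimited" if v == resource.RLIM_INFINITY else "%d bytes" % v

print("RLIMIT_MEMLOCK: soft=%s hard=%s" % (fmt(soft), fmt(hard)))
```

Running this under the launcher (not in a login shell) matters, since remote-started ranks can inherit a different limit than an interactive session.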
>> >> These are the MVAPICH installation details:
>> >>
>> >> $ /home/jhammond/TUKEY/MPI/mv2-trunk-gcc/bin/mpichversion
>> >> MVAPICH2 Version: 2.0b
>> >> MVAPICH2 Release date: unreleased development copy
>> >> MVAPICH2 Device: ch3:mrail
>> >> MVAPICH2 configure: CC=gcc CXX=g++ --enable-fc FC=gfortran
>> >> --enable-f77 F77=gfortran --with-pm=hydra --enable-mcast
>> >> --enable-static --prefix=/home/jhammond/TUKEY/MPI/mv2-trunk-gcc
>> >> MVAPICH2 CC: gcc -DNDEBUG -DNVALGRIND -O2
>> >> MVAPICH2 CXX: g++ -DNDEBUG -DNVALGRIND -O2
>> >> MVAPICH2 F77: gfortran -O2
>> >> MVAPICH2 FC: gfortran -O2
>> >>
>> >> I looked at the code and it seems that there might be a way to fix
>> >> this, but obviously I'll have to wait for you all to do it.
>> >>
>> >>     /*
>> >>      * It will often be possible for higher layers to recover
>> >>      * when no vbuf is available, but waiting for more descriptors
>> >>      * to complete. For now, just abort.
>> >>      */
>> >>     if (NULL == free_vbuf_head)
>> >>     {
>> >>         if (allocate_vbuf_region(rdma_vbuf_secondary_pool_size) != 0) {
>> >>             ibv_va_error_abort(GEN_EXIT_ERR,
>> >>                 "VBUF reagion allocation failed. Pool size %d\n",
>> >>                 vbuf_n_allocated);
>> >>         }
>> >>     }
>> >>
>> >> Thanks!
>> >>
>> >> Jeff
>> >>
>> >> --
>> >> Jeff Hammond
>> >> jeff.science at gmail.com
>> >>
>> >> _______________________________________________
>> >> mvapich-discuss mailing list
>> >> mvapich-discuss at cse.ohio-state.edu
>> >> http://mailman.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>> >
>> > --
>> > -Devendar
>>
>> --
>> Jeff Hammond
>> jeff.science at gmail.com
>
> --
> -Devendar
--
Jeff Hammond
jeff.science at gmail.com