[mvapich-discuss] vbuf pool allocation failure

Adam Coates acoates at cs.stanford.edu
Mon Feb 11 16:11:54 EST 2013


Great;  thank you!!

On Mon, Feb 11, 2013 at 3:58 PM, Devendar Bureddy
<bureddy at cse.ohio-state.edu> wrote:
> Hi Adam
>
> Good to know that it fixed your issue. I don't have any particular
> reason for altering log_mtts_per_seg. I guess changing either
> parameter should be fine.  The formula to compute the limit is
> something like
>
> 2^log_num_mtt x 2^log_mtts_per_seg x PAGE_SIZE
>
> I think you should be fine with log_num_mtt = 24 on your 128 GB systems.
>
> -Devendar
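
[For posterity: that limit can be sanity-checked with a short script along
these lines. This is a sketch only; it assumes 4 KB pages and that a
log_mtts_per_seg reading of 0 means the built-in default of 3:]

  mtt=$(cat /sys/module/mlx4_core/parameters/log_num_mtt)
  seg=$(cat /sys/module/mlx4_core/parameters/log_mtts_per_seg)
  [ "$seg" -eq 0 ] && seg=3    # a reading of 0 appears to mean the built-in default of 3
  page=$(getconf PAGE_SIZE)    # typically 4096
  echo "registration limit: $(( (1 << (mtt + seg)) * page / 1024 / 1024 / 1024 )) GB"

[With log_num_mtt = 24, log_mtts_per_seg = 3, and 4 KB pages this works out to
512 GB, comfortably above the 128 GB of RAM plus 16 GB of GPU memory per node.]
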
> On Mon, Feb 11, 2013 at 3:20 PM, Adam Coates <acoates at cs.stanford.edu> wrote:
>> Hi Devendar,
>>
>> Thanks for the reply;  we might have it fixed over here, but just to
>> sanity check (and to populate your discussion list for posterity):
>>
>> Some outputs from us:
>>
>> $ mpiname -a
>> MVAPICH2 1.9a2 Thu Nov  8 11:43:52 EST 2012 ch3:mrail
>> ..<snip>..
>> Configuration
>> --with-cuda=/usr/local/cuda-5.0 --enable-cuda
>>
>> $ ulimit -l
>> unlimited
>> [We're not using PBS, etc., so that should be good.]
>>
>> Following other notes on the list + your suggestion, we set
>> log_num_mtt=24 for the mlx4_core module and reloaded the driver, which
>> appears to have fixed our issue.
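
[A sketch of how such a module-parameter change is typically made persistent;
the conf-file name below is just an example, and the reload step varies with
the OFED install (a plain modprobe -r/modprobe cycle may also work if nothing
else holds the module):]

  echo "options mlx4_core log_num_mtt=24" | sudo tee -a /etc/modprobe.d/mlx4_core.conf
  sudo /etc/init.d/openibd restart    # reload the IB stack so the new option takes effect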
>>
>> Is there a reason to prefer altering log_mtts_per_seg?  The current value is:
>> $ more /sys/module/mlx4_core/parameters/log_mtts_per_seg
>> 0
>>
>> which I assume means it's defaulting to 3.  Our nodes have 128GB of
>> memory, and 4 GPUs (16GB GPU memory total), so my guess is that the
>> factor of 16 increase gained by altering log_num_mtt does not
>> completely fix the issue.
>>
>> Thanks a lot for your help.
>>
>> Best,
>> Adam
>>
>> On Mon, Feb 11, 2013 at 2:46 PM, Devendar Bureddy
>> <bureddy at cse.ohio-state.edu> wrote:
>>> Hi Brody
>>>
>>> It seems it is hitting a limit on the amount of memory that can be
>>> registered with the HCA.  Can you provide the following details?
>>>
>>> - Is lockable memory set to unlimited on compute nodes?
>>> $ ulimit -l
>>> unlimited
>>>
>>> - How much RAM do these nodes have? Can you also check the OFED
>>> parameter log_mtts_per_seg? With most standard OFED installations,
>>> the default value of this parameter is '3'.  If your system has more
>>> than 16GB of RAM, you need to set this parameter to '4' or more.
>>>
>>> $ more /sys/module/mlx4_core/parameters/log_mtts_per_seg
>>> 3
>>>
>>> - What is the size of the cudaHostRegister() buffer you mentioned?
>>>
>>> - What version of MVAPICH2 are you using, and with what configuration options?
>>>
>>> -Devendar
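
[A compact way to gather those details on each compute node; the sysfs paths
are the standard mlx4_core locations:]

  ulimit -l                                               # locked-memory limit
  free -g                                                 # total RAM
  cat /sys/module/mlx4_core/parameters/log_num_mtt
  cat /sys/module/mlx4_core/parameters/log_mtts_per_seg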
>>>
>>> On Mon, Feb 11, 2013 at 2:01 PM, Brody Huval <brodyh at stanford.edu> wrote:
>>>> Hi,
>>>>
>>>> Our job is running on 64 GPUs (64 MPI nodes) in a small cluster with
>>>> ConnectX-3 IB adapters.  We've been running into abort() calls that
>>>> bring down the system after perhaps 10 or 15 minutes of running with
>>>> the following error:
>>>>
>>>> [src/mpid/ch3/channels/mrail/src/gen2/vbuf.c 540] Cannot register vbuf region
>>>> [8] Abort: vbuf pool allocation failed at line 607 in file
>>>> src/mpid/ch3/channels/mrail/src/gen2/vbuf.c
>>>>
>>>> Unfortunately, MV2_DEBUG_SHOW_BACKTRACE hasn't shown us anything useful
>>>> here, so we're still hunting for the call that's triggering this.  We
>>>> have tried a suggested solution from the archives, setting
>>>> MV2_USE_LAZY_MEM_UNREGISTER to 0, but this leads to an immediate
>>>> crash.  Our code does not make significant use of pinned memory,
>>>> though in the one place that we do use it, it is done with
>>>> cudaHostRegister(), and this buffer is not touched by MPI.
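
[For reference, one common way to pass those runtime parameters with
MVAPICH2's mpirun_rsh is directly on the command line; the hostfile and
executable names here are placeholders:]

  $ mpirun_rsh -np 64 -hostfile hosts MV2_DEBUG_SHOW_BACKTRACE=1 MV2_USE_LAZY_MEM_UNREGISTER=0 ./app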
>>>>
>>>> This problem cropped up just recently as we've moved to larger problem
>>>> sizes (and thus larger message sizes).  Previous runs with smaller
>>>> models have worked just fine.
>>>>
>>>> Do you have advice on how to find the problem, or a possible solution? Thank you in advance for any help.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> mvapich-discuss mailing list
>>>> mvapich-discuss at cse.ohio-state.edu
>>>> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>>>
>>>
>>>
>>> --
>>> Devendar
>
>
>
> --
> Devendar
