[mvapich-discuss] VBUF Abort reached in job

Devendar Bureddy bureddy at cse.ohio-state.edu
Thu May 9 16:14:53 EDT 2013


On Thu, May 9, 2013 at 4:03 PM, Thompson, Matt (GSFC-610.1)[SCIENCE SYSTEMS
AND APPLICATIONS INC] <matthew.thompson at nasa.gov> wrote:

> On 05/09/2013 12:44 PM, Devendar Bureddy wrote:
>
>> Hi Matt
>>
>>  <snip>
>
>
>> - How much physical memory do these nodes have?
>>
>
> 24 GB per node.
>
>
>> - Can you report the following values on one of the compute nodes?
>>
>> $ cat /sys/module/mlx4_core/parameters/log_mtts_per_seg
>>
>
> This reports as "3".
>
>> $ cat /sys/module/mlx4_core/parameters/log_num_mtt
>>
>
> This reports as "0". Does this mean it defaults to something other than
> "0" or is max_reg_mem really 8*PAGE_SIZE?


Yes, it defaults to "20". The default OFED settings should allow up to 16 GB
of memory to be registered with the HCA. On your system, you can set this to
21 or 22.
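
For reference, here is a rough sketch of the arithmetic behind the question
above, assuming the relation max_reg_mem = 2^log_num_mtt * 2^log_mtts_per_seg
* PAGE_SIZE that the "8*PAGE_SIZE" estimate implies. The helper name
max_reg_mem_bytes is made up for illustration, and the substitution of 20 for
a reported 0 simply follows the default mentioned above. The effective
default differs between OFED releases, so the printed estimate may not match
the 16 GB figure; treat it as an approximation only.

```shell
#!/bin/sh
# Sketch only: estimate the registerable-memory limit from mlx4_core parameters.
# Assumed relation (implied by the thread, not taken from driver source):
#   max_reg_mem = 2^log_num_mtt * 2^log_mtts_per_seg * PAGE_SIZE

# Hypothetical helper: $1 = log_num_mtt, $2 = log_mtts_per_seg, $3 = page size (bytes)
max_reg_mem_bytes() {
    echo $(( (1 << $1) * (1 << $2) * $3 ))
}

params=/sys/module/mlx4_core/parameters
if [ -r "$params/log_num_mtt" ] && [ -r "$params/log_mtts_per_seg" ]; then
    num_mtt=$(cat "$params/log_num_mtt")
    mtts_per_seg=$(cat "$params/log_mtts_per_seg")
    # A reported 0 means the driver chose its own default; 20 is assumed here.
    if [ "$num_mtt" -eq 0 ]; then
        num_mtt=20
    fi
    bytes=$(max_reg_mem_bytes "$num_mtt" "$mtts_per_seg" "$(getconf PAGE_SIZE)")
    echo "estimated max_reg_mem: $(( bytes >> 30 )) GiB"
fi
```

Changing the value is normally done with a module option, for example a line
such as "options mlx4_core log_num_mtt=22" in a file under /etc/modprobe.d/,
followed by a driver reload. The exact file name and reload procedure are
site-specific, so check with your system administrators first.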

-Devendar


>
>
>> I think this issue could be similar to the one explained in the above list
>> thread. OFED has parameters that limit the amount of memory that can be
>> registered with the HCA. The following user guide FAQ entry has a few more
>> details regarding this:
>> http://mvapich.cse.ohio-state.edu/support/user_guide_mvapich2-1.9.html#x1-1130009.1.1
>>
>
> Thanks, I'll forward this on to my colleagues here.
>
>
>> -Devendar
>>
>> --
>> Devendar
>>
>
>
> --
> Matt Thompson, PhD     SSAI, Sr Software Test Engr
> NASA GSFC, Global Modeling and Assimilation Office
> Code 610.1, 8800 Greenbelt Rd, Greenbelt, MD 20771
> Phone: 301-614-6712              Fax: 301-614-6246
>



-- 
Devendar