[mvapich-discuss] MV2_USE_LAZY_MEM_UNREGISTER and memory usage?

Xie Min xmxmxie at gmail.com
Mon Feb 9 09:20:03 EST 2009


We originally used HPCC 1.0.0, but we have just tried HPCC 1.3.1 and it
seems to have the same problem.

We have attached two hpccinf.txt files for 64 HPCC tasks: with
hpccinf.txt.13 the "RES" of each task is about 1.3GB, while with
hpccinf.txt.16 it is about 1.6/1.7GB. Would you please try them on your
systems (with MV2_USE_LAZY_MEM_UNREGISTER=1)? Thanks.
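To compare "RES" between runs without watching "top" by hand, something like the following could be used. It is only a sketch: it assumes a Linux /proc filesystem and assumes that the VmRSS field of /proc/<pid>/status is the figure "top" shows as RES. The function names are made up for illustration.

```python
import re


def parse_vmrss_kib(status_text):
    """Return the VmRSS value in KiB from /proc/<pid>/status text, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_gib(pid):
    """Read the resident set size of a running process, in GiB (Linux only)."""
    with open(f"/proc/{pid}/status") as status_file:
        kib = parse_vmrss_kib(status_file.read())
    return None if kib is None else kib / (1024 * 1024)
```

Calling read_rss_gib() on each hpcc process ID at intervals, once with MV2_USE_LAZY_MEM_UNREGISTER=1 and once with 0, would give numbers that can be compared directly.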

BTW, the OFED version we used is 1.3.1. Each node has 16GB of physical
memory, and we use 8 nodes for the 64 tasks.
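For reference, the per-node arithmetic behind these numbers can be sketched as below. It only multiplies the "RES" figures quoted in this thread by the number of tasks per node; it does not account for the kernel, file system caches, or any memory not counted in "RES". The function name is hypothetical.

```python
def per_node_footprint_gb(tasks, nodes, gb_per_task):
    """Sum of the reported per-task RES on one node, in GB."""
    tasks_per_node = tasks // nodes
    return tasks_per_node * gb_per_task


NODE_MEMORY_GB = 16  # physical memory per node, as stated above

for gb_per_task in (1.3, 1.7):
    used = per_node_footprint_gb(64, 8, gb_per_task)
    print(f"{gb_per_task} GB/task -> {used:.1f} GB of {NODE_MEMORY_GB} GB per node")
```

So the 1.3GB case accounts for about 10.4GB of the 16GB on each node, and the 1.7GB case for about 13.6GB.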



2009/2/7 Matthew Koop <koop at cse.ohio-state.edu>:
>
> Thanks for the additional information. I've tried here with HPCC 1.3.1 and
> I haven't been able to see any difference in the 'RES' or 'VIRT' memory
> while running.
>
> Would it be possible to send me your hpccinf.txt file so I can try to
> reproduce the problem more closely? We also have AS5 with kernel 2.6.18.
>
> Thanks,
>
> Matt
>
> On Thu, 5 Feb 2009, Xie Min wrote:
>
>> We use Redhat AS5 with kernel 2.6.18 and Lustre 1.6.6, and we have not
>> modified the kernel source.
>>
>> We tested HPCC on two clusters.
>> In the first cluster, each node boots over IB and has no hard disk, so
>> there is NO swap space. We run 64 HPCC tasks on 8 nodes (so each CPU
>> core in a node runs one HPCC task). When each HPCC task uses
>> 1.2/1.3G of memory, it is killed by the OS with an "Out of memory"
>> error. But with MV2_USE_LAZY_MEM_UNREGISTER=0, each task can use 1.7G
>> of memory and run successfully.
>>
>> In the other cluster, each node has a hard disk and boots from local
>> disk, so it HAS swap space. We run 64 HPCC tasks on 8 nodes there too.
>> When each HPCC task uses 1.3G of memory, "top" shows that swap starts
>> being used after HPCC has been running for a while, and the node becomes
>> very slow and stops responding to keyboard input. But with
>> MV2_USE_LAZY_MEM_UNREGISTER=0, each task can be scaled up to 1.7G of
>> memory and run successfully.
>>
>> I also tried another set of mvapich2 parameters:
>> MV2_USE_LAZY_MEM_UNREGISTER=1 and MV2_NDREG_ENTRIES=8. With this
>> configuration, HPCC is still killed by the OS with an "Out of memory"
>> error when the memory scale of each task is set to 1.3GB.
>>
>> 2009/2/5 Matthew Koop <koop at cse.ohio-state.edu>:
>> > Hi,
>> >
>> > What OS/distro are you running? Have you made any changes from the
>> > base, such as page size?
>> >
>> > I'm taking a look at this issue on our machine as well, although I'm not
>> > seeing the memory change that you reported.
>> >
>> > Matt
>> >
>> >
>>
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: hpccinf.txt.13
Type: application/octet-stream
Size: 1429 bytes
Desc: not available
Url : http://mail.cse.ohio-state.edu/pipermail/mvapich-discuss/attachments/20090209/c3aaf3d2/hpccinf.txt.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: hpccinf.txt.16
Type: application/octet-stream
Size: 1429 bytes
Desc: not available
Url : http://mail.cse.ohio-state.edu/pipermail/mvapich-discuss/attachments/20090209/c3aaf3d2/hpccinf.txt-0001.obj

