[mvapich-discuss] (no subject)
Roland Fehrenbacher
Roland.Fehrenbacher at transtec.de
Thu Mar 23 13:04:27 EST 2006
>>>>> "Sayantan" == Sayantan Sur <surs at cse.ohio-state.edu> writes:
Sayantan> Hi,
Sayantan> * On Mar 7 Troy Telford <ttelford at linuxnetworx.com> wrote:
>> On Thu, 23 Mar 2006 02:13:27 -0700, Roland Fehrenbacher
>> <Roland.Fehrenbacher at transtec.de> wrote:
>>
>> > >> When running HPL, hpl.dat can contain multiple problem sizes.
>> > >> * xhpl reads the config file, and runs one problem size
>> > >> (until it completes the problem). * After the previous
>> > >> problem is finished, hpl will then start execution on the
>> > >> next problem size.
>> >
>> > Sayantan> Thanks for the explanation. I understand what you are
>> > Sayantan> saying.
>> >
>> > I'm facing the same problem here with xhpl and mvapich 0.9.7. The
>> > strange thing is that this problem didn't happen with versions at
>> > least up to 0.9.5 (I haven't tested 0.9.6). Is the mechanism
>> > mentioned below a performance optimization that entered the code
>> > after 0.9.5? I just checked, and I had already set
>> > -DLAZY_MEM_UNREGISTER in my 0.9.5 version.
>>
>> I've actually been able to replicate it with 0.9.5. (And even 0.9.4)
Sayantan> Troy, thanks for verifying this behavior with 0.9.5 and
Sayantan> other previous releases. As mentioned in the previous
Sayantan> email, this happens because of the way malloc needs to be
Sayantan> configured in order to cache registration entries.
Sayantan> Roland, this optimization has been in MVAPICH for quite a
Sayantan> long time (ever since the earliest releases). In our recent
Sayantan> release, 0.9.7, the registration caching algorithm was
Sayantan> optimized, and a user-configurable limit on the maximum
Sayantan> number of registered pages has been available since 0.9.6.
Sayantan> However, the basic mechanism (i.e., configuring malloc)
Sayantan> remains the same. Could you describe the experimental setup
Sayantan> in which you saw different behavior with 0.9.5?
I have used 0.9.5 with IBGD 1.8.0, kernel 2.6.14, and definitely
didn't have this problem. We use xhpl regularly for stress testing on
many nodes.
Now I'm using 0.9.7 with IBGD 1.8.2, kernel 2.6.15.
Roland