[mvapich-discuss] Same nodes different time?

Daniel WEI lakeat at gmail.com
Thu Mar 27 20:56:31 EDT 2014


Measurement is implemented in my C++ code, using clock() from <ctime>, for example:

clock_t start, end;
double elapsed;

start = clock();
... /* Do the work. */
end = clock();
/* Note: clock() measures this process's CPU time, not wall time. */
elapsed = ((double) (end - start)) / CLOCKS_PER_SEC;
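
For comparison, here is a minimal sketch of the same measurement done
with MPI_Wtime instead (wall-clock time rather than CPU time); the work
section and the names here are only placeholders, not my actual code:

#include <mpi.h>
#include <cstdio>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    double t0 = MPI_Wtime();      /* wall-clock seconds */
    /* ... do the work ... */
    double t1 = MPI_Wtime();

    /* The slowest rank sets the job's wall time, so report the max. */
    double local = t1 - t0, global;
    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0)
        std::printf("elapsed (max over ranks): %.3f s\n", global);

    MPI_Finalize();
    return 0;
}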

I have tried both 5~20 minute jobs and 0.5~3 hour jobs, and they all
show differences. Let's say the first is JOB-A and the latter is JOB-B.
At first I was testing JOB-B, and I found differences even though the
hosts were the same (just the order of the hosts was different). So I
then started testing a smaller job, JOB-A, today, and I fixed the order
of the hosts by manually creating a hostfile. Even with the same order
of hosts, the results are still different.

I don't understand what you meant by "warm up", "startup/wrapup", etc.
In my case, the "reading in" of the velocity field and pressure field
can occasionally differ hugely (37 seconds in one case, 3 seconds in
another).
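
If it helps, here is a rough sketch of how I could isolate and repeat
the timing of that read-in phase (the barrier makes all ranks start
together, and the first repetition serves as a warm-up; the read itself
is just a placeholder):

#include <mpi.h>
#include <cstdio>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int reps = 5;
    for (int i = 0; i < reps; ++i) {
        MPI_Barrier(MPI_COMM_WORLD);   /* start every rank together */
        double t0 = MPI_Wtime();
        /* ... read the velocity and pressure fields ... */
        double t1 = MPI_Wtime();
        if (rank == 0)
            std::printf("rep %d: %.3f s%s\n", i, t1 - t0,
                        i == 0 ? " (warm-up, discard)" : "");
    }

    MPI_Finalize();
    return 0;
}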

I guess Tony's point makes sense, that the problem is in the switches.
But I am not sure.

Zhigang Wei
----------------------
*University of Notre Dame*


On Thu, Mar 27, 2014 at 7:58 PM, Gus Correa <gus at ldeo.columbia.edu> wrote:

> On 03/27/2014 05:58 PM, Daniel WEI wrote:
>
>>
>> On Thu, Mar 27, 2014 at 5:45 PM, Tony Ladd <tladd at che.ufl.edu
>> <mailto:tladd at che.ufl.edu>> wrote:
>>
>>     So your performance can vary depending on what else is going on with
>>     the other nodes in the system
>>
>>
>> Thank you Tony. I see.
>>
>> (1) But how much variance?! My results show some very disturbing
>> differences: in one case, initializing takes 37 s, in another 5 s,
>> and in yet another 2 s!!!
>> (2) What can I, or anybody else, do to reduce this variance? (There
>> are 16 cores per node, so nobody else should have been using the
>> nodes I was allocated; this seems to be guaranteed.)
>> (3) My goal is to compare the Intel compiler's -O3 and -O2 when
>> building my CFD code, in terms of speed, but if my performance varies
>> even for the same case on the same hosts, how can I trust my results
>> anymore?
>> Zhigang Wei
>> ----------------------
>> /University of Notre Dame/
>>
>>
> Hi Zhigang
>
> What time are you measuring?
> Wall time from the job scheduler for the whole job?
> Wall time for the application only (say, with the Unix time utility
> or MPI_Wtime)?
> Something else?
>
> Have you tried to run your test simulations for a longer time (several
> minutes, perhaps an hour, not just a few seconds)
> to see if the outcome shows less spread?
> Say, you could increase the number of time steps to 100x
> or perhaps 10,000x what you are currently using,
> depending of course on the max walltime allowed by your cluster queue.
>
> My wild guess is that with short-lived simulations
> what may count most is the job or application
> startup and wrapup times, which may vary significantly in a cluster,
> especially in a big cluster, overwhelming and obscuring your program's
> execution time.
> Most MPI and benchmark implementations recommend
> that you "warm up" your own tests/benchmarks
> for a time long enough to reduce such startup/wrapup effects.
>
> My two cents,
> Gus Correa
>
>
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mailman.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>