[mvapich-discuss] Same nodes different time?

Daniel WEI lakeat at gmail.com
Thu Mar 27 21:45:32 EDT 2014


Hari,


I think I have already done this: in the same case folder, I used SGE to
submit the job script, and after it finished, I submitted it again. Note
that the set of CPUs I can access at my school is fixed, so nobody else
is using them. The job is launched with mpirun -hostfile hosts -np 320
blahblah

The file is written only once, at the end of the simulation, so this does
not affect my time measurement. The input file is read only once, at the
beginning.





Zhigang Wei
----------------------
*University of Notre Dame*


On Thu, Mar 27, 2014 at 9:34 PM, Hari Subramoni <subramoni.1 at osu.edu> wrote:

> Hello Daniel,
>
> Can you try two back-to-back runs on the same set of hosts and see if
> there is any variance in performance? To be clear, this is what I mean:
>
> If interactive mode
> --------------------------
> 1. Request a set of nodes from the scheduler
> 2. Run application; Note time
> 3. Run application; Note time
>
> In batch mode
> ---------------------
> Create a shell script which runs the application twice, like the one
> below, and submit it to SGE:
>
> #!/bin/bash
> for i in `seq 1 2`
> do
>     # replace the mpirun line with your actual run command;
>     # `time` writes the elapsed wall time of each run to stderr
>     ( time mpirun -hostfile hosts -np 320 ./app ) 2> run_$i.time
> done
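>
> (You would submit such a script with qsub, e.g. "qsub run_twice.sh";
> the script name here is arbitrary.)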
>
> You mention "reading in of the velocity field and pressure field". Does
> this involve any file system operation (like reading a file, writing to
> a file, etc.)? If you're touching the file system, the performance can
> vary widely, and this does not have anything to do with the MPI library.
>
> Regards,
> Hari.
>
>
> On Thu, Mar 27, 2014 at 8:56 PM, Daniel WEI <lakeat at gmail.com> wrote:
>
>> Measurement is implemented in my C++ code, using clock() from <ctime>,
>> for example:
>>
>> clock_t start, end;  /* note: clock() measures CPU time, not wall time */
>> start = clock();
>> ... /* Do the work. */
>> end = clock();
>> double elapsed = ((double) (end - start)) / CLOCKS_PER_SEC;
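>>
>> (For wall-clock time, which is what Gus's MPI_Wtime suggestion below
>> would measure, a minimal sketch would be the following, assuming it
>> runs between MPI_Init and MPI_Finalize:)
>>
>> #include <mpi.h>
>>
>> double t_start = MPI_Wtime();   /* wall-clock seconds, arbitrary origin */
>> ... /* Do the work. */
>> double t_end = MPI_Wtime();
>> double wall = t_end - t_start;  /* elapsed wall time on this rank */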
>>
>> I have tried both 5~20 minute jobs and 0.5~3 hour jobs; they all show
>> differences. Let's call the first JOB-A and the latter JOB-B.
>> At first I was testing JOB-B, and I found a difference even though the
>> hosts were the same (only the order of hosts was different). So I then
>> tested a smaller job, JOB-A, today, and fixed the order of hosts by
>> manually creating a hostfile, and I found that even with the same order
>> of hosts, the results are still different.
>>
>> I don't understand what you meant by "warm up", "startup/wrapup", etc.
>> In my case, the "reading in" of the velocity field and pressure field
>> can occasionally differ hugely (37 seconds in one case, 3 seconds in
>> another). I guess Tony's point makes sense, that the problem is in the
>> switches. But I am not sure.
>>
>>
>>
>>
>>
>> Zhigang Wei
>> ----------------------
>> *University of Notre Dame*
>>
>>
>> On Thu, Mar 27, 2014 at 7:58 PM, Gus Correa <gus at ldeo.columbia.edu> wrote:
>>
>>> On 03/27/2014 05:58 PM, Daniel WEI wrote:
>>>
>>>>
>>>> On Thu, Mar 27, 2014 at 5:45 PM, Tony Ladd <tladd at che.ufl.edu> wrote:
>>>>
>>>>     So your performance can vary depending on what else is going on with
>>>>     the other nodes in the system
>>>>
>>>>
>>>> Thank you Tony. I see.
>>>>
>>>> (1) But how much variance?! My results show some very disturbing
>>>> differences: in one case, initializing takes 37s, in another 5s, and
>>>> in yet another 2s!
>>>> (2) What can I, or anyone else, do to reduce this variance? (There
>>>> are 16 cores/node, so nobody else should be using the nodes I was
>>>> allocated; this seems to be guaranteed.)
>>>> (3) My goal is to compare the Intel compiler's -O3 and -O2 options
>>>> when building my CFD code, in terms of speed, but if my performance
>>>> varies even with the same case and the same hosts, how can I trust my
>>>> results anymore?
>>>> Zhigang Wei
>>>> ----------------------
>>>> /University of Notre Dame/
>>>>
>>>>
>>> Hi Zhigang
>>>
>>> What time are you measuring?
>>> Wall time from the job scheduler for the whole job?
>>> Wall time for the application only (say with the Unix time utility or
>>> MPI_Wtime)?
>>> Something else?
>>>
>>> Have you tried to run your test simulations for a longer time (several
>>> minutes, one hour perhaps, not just a few seconds)
>>> to see if the outcome shows less spread?
>>> Say, you could change the number of time steps to 100x
>>> or perhaps 10,000x what you are currently using,
>>> depending of course on the max walltime allowed by your cluster queue.
>>>
>>> My wild guess is that with short-lived simulations
>>> what may count most is the job or application
>>> startup and wrapup times, which may vary significantly in a cluster,
>>> especially in a big cluster, overwhelming and obscuring your program
>>> execution time.
>>> Most MPI implementations and benchmarks recommend
>>> that you "warm up" your own tests/benchmarks
>>> for a time long enough to reduce such startup/wrapup effects.
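>>>
>>> As a rough sketch of that idea (step() below is just a stand-in for
>>> one iteration of your solver, and the step counts are arbitrary):
>>>
>>> #include <mpi.h>
>>> #include <stdio.h>
>>>
>>> /* untimed warm-up: let connections, caches, and buffers settle */
>>> for (int i = 0; i < 100; i++)
>>>     step();
>>>
>>> /* timed region: enough steps that startup effects are negligible */
>>> double t0 = MPI_Wtime();
>>> for (int i = 0; i < 10000; i++)
>>>     step();
>>> double t1 = MPI_Wtime();
>>> printf("10000 timed steps: %.3f wall-clock seconds\n", t1 - t0);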
>>>
>>> My two cents,
>>> Gus Correa