[mvapich-discuss] Faster is Slower
Sayantan Sur
surs at cse.ohio-state.edu
Mon Aug 28 12:35:19 EDT 2006
Greetings Norm,
Taylor, Norm R. wrote:
>Hi,
> I'm wondering if you can advise me on an issue I'm encountering with
>MPI+Infiniband, using MVAPICH. I'm finding that the high rate at which
>collective operations - e.g. MPI_GATHER - poll to determine if all nodes
>have entered the operation steals too many CPU cycles from other
>processes, slowing down overall performance. Is there a way I can tune
>these operations to be more CPU-efficient? I actually improve
>performance by adding a few microseconds of sleep time to the data
>transfer processes (these are the ones using MPI+Infiniband) to give
>more CPU cycles to the computational processes. This tuning is very
>specific to the problem at hand and the number of nodes in use. Tuning
>at the process level seems still inefficient - it would be better if the
>sleep time was applied inside the collective operations. Is there a way
>I can set a parameter somewhere to make that happen?
>
>
Thanks for bringing this up on the list. In fact, we have thought about
this very situation, where one process busy-polls for a considerable
duration and steals CPU cycles from other "useful" processes.
MVAPICH has a mode (called BLOCKING_SUPPORT) in which an MPI process
does not busy-poll indefinitely, but instead yields the CPU to other
processes. The user can further fine-tune the "spin-block" threshold to
get the best CPU usage/message latency tradeoff for any specific
application.
In order to activate this mode, you can follow the instructions given in
Section 4.4.1 in our user guide. Look under the bullet "Customize
MVAPICH configuration" -> Blocking Progress.
http://nowlab.cse.ohio-state.edu/projects/mpi-iba/mvapich_user_guide.html#x1-100004.4.1
In order to further fine tune your application, you can adjust the spin
count (after which the process yields the CPU) using the environment
variable VIADEV_MAX_SPIN_COUNT.
http://nowlab.cse.ohio-state.edu/projects/mpi-iba/mvapich_user_guide.html#x1-860009.30
For example, a latency-sensitive application could set this parameter
high (like 20,000-30,000) so that the application yields the CPU less
often. On the other hand, an application which wants to yield the CPU as
often as possible can set this parameter low (like 20-30).
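To make the tradeoff concrete, here is a minimal sketch of a spin-then-yield
wait loop. It is a simulation only, not MVAPICH's implementation: the
function names (wait_spin_then_yield, completes_after) are hypothetical,
and time.sleep(0) is just a stand-in for handing the CPU to another
process. It assumes the spin count works as described above, i.e. the
number of unsuccessful polls after which the process yields.

```python
import time


def wait_spin_then_yield(is_done, max_spin_count):
    """Poll is_done(); after max_spin_count failed polls in a row, yield once.

    Returns (polls, yields) so the two settings can be compared.
    Hypothetical helper for illustration, not an MVAPICH API.
    """
    polls = 0
    yields = 0
    spins = 0
    while True:
        polls += 1
        if is_done():
            return polls, yields
        spins += 1
        if spins >= max_spin_count:
            time.sleep(0)  # stand-in for yielding the CPU to other processes
            yields += 1
            spins = 0


def completes_after(n):
    """Return a predicate that becomes true on its n-th call (simulated completion)."""
    calls = 0

    def is_done():
        nonlocal calls
        calls += 1
        return calls >= n

    return is_done


# Same simulated operation, two spin thresholds.
low = wait_spin_then_yield(completes_after(1000), max_spin_count=25)
high = wait_spin_then_yield(completes_after(1000), max_spin_count=25000)
print("low threshold  (polls, yields):", low)
print("high threshold (polls, yields):", high)
```

In this simulation the operation completes on the 1000th poll, so the
threshold of 25 gives up the CPU 39 times while the threshold of 25000
never does. The low setting frees cycles for the computational processes
at the cost of extra latency each time the process has to be rescheduled.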
Please let us know if this answers your question.
Thanks,
Sayantan.
--
http://www.cse.ohio-state.edu/~surs