[mvapich-discuss] The cost of MPI_Fence()?

wei huang huanwei at cse.ohio-state.edu
Wed Mar 15 12:52:49 EST 2006


Hi Guangming,

Thanks for your interest in using MVAPICH2.

To answer your questions clearly, we need to know more about your
program. Our questions are inline below:

1) Did you compile with the ONE_SIDED flag? This flag is expected to give
the best communication/computation overlap. Also, which device are you
using: vapi, gen2, or udapl?

> I'm implementing an algorithm using MVAPICH2 0.9.2 on Infiniband. The
> program is executed as follows:
> for (i = 0; i < loops; i++) {
> 	MPI_Fence;
------------------------ maybe move this fence outside the loop?
> 	for (dest = 0; dest < p; dest++)
> 		if (id != dest)
> 			MPI_Put;
> 	computation;
> 	MPI_Fence;
> }

2) How many processes are involved in your fence operations?

3) Do you have an estimate of how long your computation phase takes?
An estimate in microseconds would be great. Also, do all the processes
perform the same amount of computation?

4) A small suggestion: you don't need two fences per loop iteration. The
fence that ends one iteration also starts the next epoch, so you can
probably move the first fence before the loop without affecting
correctness.

> The time measurement shows that most of the communication time (98%) is
> consumed by the synchronization in MPI_Fence. Each message is larger than
> 100KB.
> How can I optimize the cost of synchronization?

5) We would like to know how you measured that percentage. Could you give
us more details about your profiling?

Again, thanks for your interest.

-- Wei



