[mvapich-discuss] Performance of MPI_Iallgatherv
Zehan Cui
zehan.cui at gmail.com
Wed Apr 2 02:55:06 EDT 2014
Dear Akshay,
Thanks for your reply. Here I provide the source code of my test programs.
The input and output of Allgatherv are:
[cmy at gnode102 test_nbc]$ mpirun -np 2 -hosts gnode102,gnode103 ./Allgatherv
128 2
rank[1] success
rank[0] success
Number of processes: 2
Computation time : 8479068 us
Communication time: 399833 us
The input and output of Iallgatherv are:
[cmy at gnode102 test_nbc]$ mpirun -np 2 -hosts gnode102,gnode103
./Iallgatherv 128 2
rank[1] success
rank[0] success
Number of processes: 2
Computation time : 8482080 us
Communication time: 366492 us
Wait time: 450935 us
The system configuration is:
OS: CentOS 5.3
kernel: 2.6.18-128.el5
gcc: 4.6.1
InfiniBand: 2x 40Gbps ports
MPI configuration:
[cmy at gnode102 test_nbc]$ mpirun -info
HYDRA build details:
Version: 3.1
Release Date: Sun Mar 23 21:35:26 EDT 2014
CC: gcc
CXX: g++
F77: no
F90: no
Configure options: '--disable-option-checking'
'--prefix=/home3/cmy/czh/opt/mvapich2-2.0rc1' '--disable-f77'
'--disable-fc' '--enable-embedded-mode' '--cache-file=/dev/null'
'--srcdir=.' 'CC=gcc' 'CFLAGS= -DNDEBUG -DNVALGRIND -O2' 'LDFLAGS= '
'LIBS=-lm -libmad -lrdmacm -libumad -libverbs -ldl -lrt -lpthread '
'CPPFLAGS= -I/home3/cmy/czh/tools/mvapich2-2.0rc1/src/mpl/include
-I/home3/cmy/czh/tools/mvapich2-2.0rc1/src/mpl/include
-I/home3/cmy/czh/tools/mvapich2-2.0rc1/src/openpa/src
-I/home3/cmy/czh/tools/mvapich2-2.0rc1/src/openpa/src
-I/home3/cmy/czh/tools/mvapich2-2.0rc1/src/mpi/romio/include'
Process Manager: pmi
Launchers available: ssh rsh fork slurm ll lsf sge
manual persist
Topology libraries available: hwloc
Resource management kernels available: user slurm ll lsf sge pbs
cobalt
Checkpointing libraries available:
Demux engines available: poll select
Anything else that I need to provide?
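For completeness, the communication pattern in the attached Iallgatherv.c is roughly the following sketch (illustrative only: compute_piece(), the counts, and the displacements are placeholders, not the actual test code):

```c
/* Rough sketch of the tested pipeline: issue one MPI_Iallgatherv per
 * piece right after the computation on that piece, then wait for all
 * eight requests at the end. */
#include <mpi.h>

#define PIECES 8

extern void compute_piece(int i);   /* hypothetical per-piece computation */

void pipelined_allgatherv(const char *sendbuf, int piece_bytes,
                          char *recvbuf, const int recvcounts[],
                          const int displs[], MPI_Comm comm)
{
    MPI_Request req[PIECES];

    for (int i = 0; i < PIECES; i++) {
        compute_piece(i);
        /* recvcounts/displs would differ per piece in the real test */
        MPI_Iallgatherv(sendbuf + i * piece_bytes, piece_bytes, MPI_BYTE,
                        recvbuf, recvcounts, displs, MPI_BYTE,
                        comm, &req[i]);
    }
    MPI_Waitall(PIECES, req, MPI_STATUSES_IGNORE);
}
```

Note that the MPI standard does not guarantee asynchronous progress: without a progress thread, a pending MPI_Iallgatherv may effectively execute inside the final wait, which would match the comm_time/wait_time split reported above. Interleaving occasional MPI_Test() calls into the computation is a common way to drive progress.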
Thanks,
Zehan
On Wed, Apr 2, 2014 at 4:43 AM, Akshay Venkatesh
<akshay at cse.ohio-state.edu> wrote:
> Zehan,
>
> Can you provide a reproducer for your program (with relevant input
> parameters) where you see the degradation with MPI_Iallgatherv() use? It'd
> be helpful if you could provide details of the system and the configuration
> (compilers used and MVAPICH2 library configuration flags) you used to run
> the program.
>
> Thanks
>
>
>>
>> ---------- Forwarded message ----------
>> From: Zehan Cui <zehan.cui at gmail.com>
>> Date: Tue, Apr 1, 2014 at 3:28 AM
>> Subject: [mvapich-discuss] Performance of MPI_Iallgatherv
>> To: mvapich-discuss at cse.ohio-state.edu
>>
>>
>> Hi,
>>
>> I'm testing the non-blocking collective of MVAPICH2-2.0rc1.
>>
>> I have two nodes connected with InfiniBand, and perform an allgather on
>> 128MB of data in total.
>>
>> I split the 128MB of data into eight pieces, and perform computation and
>> MPI_Iallgatherv() on one piece of data in each iteration, hoping that the
>> MPI_Iallgatherv() of the previous iteration can be overlapped with the
>> computation of the current iteration. An MPI_Wait() is called for each
>> request at the end of the last iteration.
>>
>> However, the total communication time (including the final wait time)
>> even exceeds that of the traditional blocking MPI_Allgatherv().
>>
>>
>> Following are the test pseudo-code and results.
>>
>> ===========================
>>
>> Using MPI_Allgatherv:
>>
>> for( i=0; i<8; i++ )
>> {
>>     // computation
>>     mytime( t_begin );
>>     computation;
>>     mytime( t_end );
>>     comp_time += (t_end - t_begin);
>>
>>     // communication
>>     t_begin = t_end;
>>     MPI_Allgatherv();
>>     mytime( t_end );
>>     comm_time += (t_end - t_begin);
>> }
>>
>> result:
>> comp_time = 7,454,219 us
>> comm_time = 399,854 us
>>
>> --------------------------------------------
>>
>> Using MPI_Iallgatherv:
>>
>> for( i=0; i<8; i++ )
>> {
>>     // computation
>>     mytime( t_begin );
>>     computation;
>>     mytime( t_end );
>>     comp_time += (t_end - t_begin);
>>
>>     // communication
>>     t_begin = t_end;
>>     MPI_Iallgatherv();
>>     mytime( t_end );
>>     comm_time += (t_end - t_begin);
>> }
>>
>> // wait for the non-blocking allgathers to complete
>> mytime( t_begin );
>> for( i=0; i<8; i++ )
>>     MPI_Wait();
>> mytime( t_end );
>> wait_time = t_end - t_begin;
>>
>> result:
>> comp_time = 7,453,938 us
>> comm_time = 365,511 us
>> wait_time = 453,132 us
>>
>> ==============================
>>
>> It seems that MPI_Iallgatherv() is more blocking than MPI_Allgatherv().
>>
>> Does anyone have the same experience?
>>
>>
>> Thanks,
>> Zehan
>>
>> _______________________________________________
>> mvapich-discuss mailing list
>> mvapich-discuss at cse.ohio-state.edu
>> http://mailman.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>>
>>
>>
>
>
> --
> - Akshay
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mailman.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Iallgatherv.c
Type: text/x-csrc
Size: 3770 bytes
Desc: not available
URL: <http://mailman.cse.ohio-state.edu/pipermail/mvapich-discuss/attachments/20140402/090cb8f7/attachment-0002.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Allgatherv.c
Type: text/x-csrc
Size: 3347 bytes
Desc: not available
URL: <http://mailman.cse.ohio-state.edu/pipermail/mvapich-discuss/attachments/20140402/090cb8f7/attachment-0003.bin>