[mvapich-discuss] MVAPICH2 + BLCR performance problem on multi-core cluster

Maya Khaliullina maya.usatu at gmail.com
Mon Apr 7 11:57:01 EDT 2008


Hello,

I have a performance problem when using MVAPICH2 compiled with BLCR support
on an InfiniBand cluster with the following parameters:

Node: 2xQuad Core Intel Xeon 2.33 GHz
O/S: RHEL4.5
File System: GPFS
We are using MVAPICH2-1.0.2p1 with BLCR-0.6.5.

I've done three test runs of my program using 8 MPI processes (the
machinefiles are sketched below):
1) all 8 processes on one node
2) 4 processes on each of two nodes
3) 2 processes on each of four nodes
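
The machinefiles themselves are not shown in this message; the following is
only a sketch of what they might contain. The hostnames are placeholders and
the "host:count" syntax is an assumption about the machinefile format accepted
by the mpiexec we use:

# mf1 - all 8 processes on one node
node01:8

# mf2 - 4 processes on each of two nodes
node01:4
node02:4

# mf3 - 2 processes on each of four nodes
node01:2
node02:2
node03:2
node04:2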

Results with MVAPICH2 configured for BLCR support:

[ccs-dev at n5304]$ mpiexec -machinefile ./mf1 -np 8 ./test
Calc time: 341.3279, send/recv time = 297.817
[ccs-dev at n5304]$ mpiexec -machinefile ./mf2 -np 8 ./test
Calc time: 85.7075, send/recv time = 42.2270
[ccs-dev at n5304]$ mpiexec -machinefile ./mf3 -np 8 ./test
Calc time: 84.6182, send/recv time = 40.3554

Results with MVAPICH2 configured without BLCR support:

[ccs-dev at n5304]$ mpiexec -machinefile ./mf1 -np 8 ./test
Calc time: 51.5888, send/recv time = 8.0186
[ccs-dev at n5304]$ mpiexec -machinefile ./mf2 -np 8 ./test
Calc time: 53.6679, send/recv time = 10.1187
[ccs-dev at n5304]$ mpiexec -machinefile ./mf3 -np 8 ./test
Calc time: 63.6611, send/recv time = 20.0127
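The test program is not attached here, so for reference, below is a minimal
sketch in C of how per-phase timings of this form might be collected with
MPI_Wtime. The buffer size, iteration count, and rank pairing are illustrative
assumptions only, not the actual ./test code:

#include <mpi.h>
#include <stdio.h>

#define N     1024   /* assumed message size (doubles) */
#define ITERS 100    /* assumed iteration count */

int main(int argc, char **argv)
{
    int rank, size, i, iter;
    double sendbuf[N], recvbuf[N];
    double calc_time = 0.0, comm_time = 0.0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    for (i = 0; i < N; i++)
        sendbuf[i] = rank;

    for (iter = 0; iter < ITERS; iter++) {
        double t0, t1;
        int peer = rank ^ 1;   /* pair ranks 0-1, 2-3, ... (assumption) */

        /* time the local computation phase */
        t0 = MPI_Wtime();
        for (i = 0; i < N; i++)
            sendbuf[i] = sendbuf[i] * 0.5 + rank;
        calc_time += MPI_Wtime() - t0;

        /* time the communication phase separately */
        t1 = MPI_Wtime();
        if (peer < size)
            MPI_Sendrecv(sendbuf, N, MPI_DOUBLE, peer, 0,
                         recvbuf, N, MPI_DOUBLE, peer, 0,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        comm_time += MPI_Wtime() - t1;
    }

    if (rank == 0)
        printf("Calc time: %.4f, send/recv time = %.4f\n",
               calc_time, comm_time);

    MPI_Finalize();
    return 0;
}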

So when using MVAPICH2 configured with BLCR support, much more time is spent
on communication between processes.
Is this because shared-memory support is automatically disabled in such a
build?
If so, do you plan to support both BLCR and shared-memory communication in
future releases?
Also, are there other ways to improve the performance of an MPI program
running on a multi-core node?


Thanks.

Maya