[mvapich-discuss] Re: How to check if MVAPICH is using IB
network but not Ethernet network?
Divi Venkateswarlu
divi at ncat.edu
Sat Jun 7 09:42:24 EDT 2008
One more thing to add:
The OSU benchmark results are as follows:
[root@divilab osu_benchmarks]# mpirun_rsh -np 64 -hostfile ./host1 osu_alltoall
# OSU MPI All-to-All Personalized Exchange Latency Test v3.0
# Size Latency (us)
1 651.27
2 650.83
4 647.61
8 669.23
16 658.48
32 652.27
64 663.42
128 698.62
256 795.00
512 2298.14
1024 2894.17
2048 4510.67
4096 8636.66
8192 18353.73
16384 34942.05
32768 43737.38
65536 71297.06
131072 138771.72
262144 273233.08
524288 543174.56
1048576 1086598.00
Are these as expected?
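If it would help, I can also run the point-to-point tests from the same suite between two nodes; over 4X SDR IB the small-message osu_latency numbers should be only a few microseconds, whereas over Gigabit Ethernet they would be tens of microseconds. A minimal sketch (the host2 file and the compute-0-0/compute-0-1 node names are hypothetical; one slot per node):
# host2 lists one process slot on each of two nodes
echo compute-0-0 >  ./host2
echo compute-0-1 >> ./host2
# point-to-point latency and bandwidth between the two nodes
mpirun_rsh -np 2 -hostfile ./host2 osu_latency
mpirun_rsh -np 2 -hostfile ./host2 osu_bw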
----- Original Message -----
From: Divi Venkateswarlu
To: mvapich-discuss at cse.ohio-state.edu
Sent: Saturday, June 07, 2008 9:07 AM
Subject: [mvapich-discuss] Re: How to check if MVAPICH is using IB network but not Ethernet network?
One more thing to add:
My IB hardware: an 8-port Flextronics SDR switch with Mellanox MHES18 HCA cards.
ibchecknet shows the following:
[root@divilab bin]# ibchecknet
# Checking Ca: nodeguid 0x0002c90200244228
# Checking Ca: nodeguid 0x0002c90200244230
# Checking Ca: nodeguid 0x0002c902002740fc
# Checking Ca: nodeguid 0x0002c902002441a4
# Checking Ca: nodeguid 0x0002c902002441c4
# Checking Ca: nodeguid 0x0002c9020024422c
# Checking Ca: nodeguid 0x0002c902002441ac
# Checking Ca: nodeguid 0x0002c9020024418c
## Summary: 9 nodes checked, 0 bad nodes found
## 16 ports checked, 0 bad ports found
## 0 ports have errors beyond threshold
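I could also check the link state and rate on each node. A quick sketch (ibstat and ibv_devinfo ship with OFED; on this 4X SDR hardware the rate should show 10):
# the port should report State: Active, Physical state: LinkUp, Rate: 10
ibstat
# fuller HCA and port details (firmware version, port state, active MTU)
ibv_devinfo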
----- Original Message -----
From: Divi Venkateswarlu
To: mvapich-discuss at cse.ohio-state.edu
Sent: Saturday, June 07, 2008 9:02 AM
Subject: How to check if MVAPICH is using IB network but not Ethernet network?
Hello all:
Good morning!
I set up a 64-core cluster based on ROCKS-5.0 using eight Dell PE2900 boxes.
All are dual-processor quad-core machines.
I compiled MVAPICH-1.0 (using the Intel compiler) with the default parameters in make.mvapich.gen2.
The IB stack is OFED-1.2.5.5.
My MD program (PMEMD/AMBER) compiles with no errors against the IFORT/MKL libraries, and
I can run the code on all 64 cores, but the scaling from 16 to 32 to 64 cores is terrible. I am enclosing
the benchmarks from a test run.
# of CPUs/cores   Time (sec)   Nodes (load-balanced)   Scaling (%)
 8                82           8                       100
16                49           8                        84
32                42           8                        49
64                39           8                        26
In contrast, on a single box, I get reasonable scaling.
# of cores   Time (sec)
2            284 (100%)
4            164  (87%)
8            107  (65%)
For some reason, I suspect MPI traffic is not going over the IB network.
MVAPICH was built using make.mvapich.gen2 with F77=ifort and CC=gcc.
The output of mpif77 -link_info is:
/state/partition1/fc91052/bin/ifort -L/usr/local/ofed/lib64 -L/usr/local/mvapich/lib
-lmpich -L/usr/local/ofed/lib64 -Wl,-rpath=/usr/local/ofed/lib64 -libverbs
-libumad -lpthread -lpthread -lrt
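Just to be sure the application itself picked these up, I could check the binary directly. A minimal sketch (assuming the executable is named pmemd; adjust the path as needed):
# should show libibverbs.so and libibumad.so resolved from /usr/local/ofed/lib64
ldd ./pmemd | grep -i -e ibverbs -e ibumad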
How can I be sure that MPI traffic is going through the IB network rather than Ethernet?
Are there any specific checks I should perform?
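For example, would comparing the IB port data counters before and after a run be a reasonable check? A rough sketch (the device name mthca0 is an assumption for these InfiniHost HCAs; the counters count 4-byte words):
# snapshot the transmit counter before the run
cat /sys/class/infiniband/mthca0/ports/1/counters/port_xmit_data
#   ... run the MD job ...
# read it again afterwards; if it has barely moved, the MPI traffic is not on IB
cat /sys/class/infiniband/mthca0/ports/1/counters/port_xmit_data
# perfquery (from infiniband-diags) reports the same per-port counters
perfquery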
Thanks a lot for your help.
Divi