[mvapich-discuss] Re: How to check if MVAPICH is using IB network but not ethernetwork?
Divi Venkateswarlu
divi at ncat.edu
Sat Jun 7 09:07:12 EDT 2008
One more thing to add. My IB hardware:
8-port Flextronics SDR switch
MHES18 Mellanox HCA cards
ibchecknet shows the following:
[root@divilab bin]# ibchecknet
# Checking Ca: nodeguid 0x0002c90200244228
# Checking Ca: nodeguid 0x0002c90200244230
# Checking Ca: nodeguid 0x0002c902002740fc
# Checking Ca: nodeguid 0x0002c902002441a4
# Checking Ca: nodeguid 0x0002c902002441c4
# Checking Ca: nodeguid 0x0002c9020024422c
# Checking Ca: nodeguid 0x0002c902002441ac
# Checking Ca: nodeguid 0x0002c9020024418c
## Summary: 9 nodes checked, 0 bad nodes found
## 16 ports checked, 0 bad ports found
## 0 ports have errors beyond threshold
----- Original Message -----
From: Divi Venkateswarlu
To: mvapich-discuss at cse.ohio-state.edu
Sent: Saturday, June 07, 2008 9:02 AM
Subject: How to check if MVAPICH is using IB network but not ethernetwork?
Hello all:
Good morning!
I set up a 64-core cluster based on ROCKS-5.0 using eight Dell PE2900 boxes.
All are dual-processor quad-core machines.
I compiled MVAPICH-1.0 (using the Intel compiler) with the default parameters in make.mvapich.gen2.
The IB stack is OFED-1.2.5.5.
My MD program (PMEMD/AMBER) compiled without errors against the IFORT/MKL libraries, and
I can run the code on all 64 cores, but the scaling from 16 to 32 to 64 cores is terrible. I am enclosing
the benchmarks from a test run.
# of CPUs/cores   Time (sec)   Nodes (load-balanced)   Scaling (%)
       8              82                 8                 100
      16              49                 8                  84
      32              42                 8                  49
      64              39                 8                  26
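The Scaling (%) column above appears to be parallel efficiency relative to the 8-core run, i.e. (T_8 x 8) / (T_N x N). That reading is an assumption, but the following minimal sketch reproduces the table under it. The function name parallel_efficiency is made up for illustration.

```python
def parallel_efficiency(timings):
    """Return {cores: efficiency in %} relative to the smallest core count.

    timings maps core count -> wall-clock time in seconds.
    Efficiency is (base_time * base_cores) / (time * cores), the assumed
    definition behind the "Scaling (%)" column.
    """
    base_cores = min(timings)
    base_time = timings[base_cores]
    return {
        cores: 100.0 * (base_time * base_cores) / (seconds * cores)
        for cores, seconds in sorted(timings.items())
    }


# Timings copied from the multi-node benchmark table (8 nodes in every run).
multi_node = {8: 82, 16: 49, 32: 42, 64: 39}
efficiency = parallel_efficiency(multi_node)

for cores, pct in efficiency.items():
    print(f"{cores:>3} cores: {pct:5.1f}%")
```

Rounded to whole percentages, the results agree with the 100, 84, 49 and 26 reported in the table.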
In contrast, on a single box I get reasonable scaling.
# cores   time (sec)
   2      284 (100%)
   4      164 (87%)
   8      107 (65%)
I suspect that, for some reason, MPI traffic is not going over the IB network.
MVAPICH was built using make.mvapich.gen2 with F77=ifort and CC=gcc.
The output of mpif77 -link_info is:
/state/partition1/fc91052/bin/ifort -L/usr/local/ofed/lib64 -L/usr/local/mvapich/lib
-lmpich -L/usr/local/ofed/lib64 -Wl,-rpath=/usr/local/ofed/lib64 -libverbs
-libumad -lpthread -lpthread -lrt
How can I be sure that MPI traffic is going through the IB network rather than Ethernet?
Are there any specific checks I should perform?
Thanks a lot for your help.
Divi
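One possible check, sketched here under assumptions rather than taken from the MVAPICH documentation: the Linux InfiniBand stack normally exposes per-port traffic counters under /sys/class/infiniband/<device>/ports/<port>/counters/. If port_xmit_data and port_rcv_data barely change on a compute node during an MPI run, the traffic is probably not going over IB. The helper names read_ib_counters and counter_delta are hypothetical, and whether this sysfs layout is present on a given OFED install is an assumption to verify.

```python
import glob
import os


def read_ib_counters(sysfs_root="/sys/class/infiniband"):
    """Map 'device/port' -> (port_xmit_data, port_rcv_data) as integers.

    Returns an empty dict when no InfiniBand devices are visible under
    sysfs_root (the default path is an assumption about the system).
    """
    counters = {}
    pattern = os.path.join(sysfs_root, "*", "ports", "*", "counters")
    for counter_dir in sorted(glob.glob(pattern)):
        parts = counter_dir.split(os.sep)
        device, port = parts[-4], parts[-2]
        values = []
        for name in ("port_xmit_data", "port_rcv_data"):
            with open(os.path.join(counter_dir, name)) as fh:
                values.append(int(fh.read().strip()))
        counters[f"{device}/{port}"] = tuple(values)
    return counters


def counter_delta(before, after):
    """Per-port (xmit, rcv) difference between two snapshots."""
    return {
        key: (after[key][0] - before[key][0], after[key][1] - before[key][1])
        for key in after
        if key in before
    }


if __name__ == "__main__":
    # Print the current snapshot; take one before and one after the MPI job
    # on the same node and compare them with counter_delta().
    for port, (xmit, rcv) in read_ib_counters().items():
        print(f"{port}: xmit={xmit} rcv={rcv}")
```

The data counters are reported in units of 4-byte words, and on older HCAs they are 32-bit values that may saturate during a long run, so a short benchmark is the safer test.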