[mvapich-discuss] Re: How to check if MVAPICH is using IB network but not Ethernet network?

Dhabaleswar Panda panda at cse.ohio-state.edu
Sat Jun 7 13:59:14 EDT 2008


All-to-all is a much more complex operation, so you will not be able to
quickly tell from these numbers whether your configuration is correct.

You can run the osu_latency, osu_bw, and osu_bibw tests between two processes
(across nodes and within a node). Then you can compare your inter-node and
intra-node performance numbers with the numbers and graphs available from
the performance page of the MVAPICH web site for different platforms, IB
cards and devices.

http://mvapich.cse.ohio-state.edu/performance/

If your numbers match or come close, you can be very sure that MVAPICH is
using the IB network on your system. Otherwise, there could be some issue
with your installation.
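
For example, a quick inter-node check could look like the following (a
sketch only; the hostfile name and node names are placeholders, and I am
assuming you run from the osu_benchmarks directory as in your osu_alltoall
run):

    # hostfile2 lists one entry for each of two different nodes, e.g.:
    #   node0
    #   node1
    mpirun_rsh -np 2 -hostfile ./hostfile2 osu_latency
    mpirun_rsh -np 2 -hostfile ./hostfile2 osu_bw
    mpirun_rsh -np 2 -hostfile ./hostfile2 osu_bibw

With an SDR HCA and the gen2 device, small-message inter-node latency should
be in the single-digit microsecond range. Numbers in the tens of microseconds
or higher are more typical of Gigabit Ethernet and would suggest the traffic
is not going over IB.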

DK


On Sat, 7 Jun 2008, Divi Venkateswarlu wrote:

>      One more to add:
>
>      The OSU benchmark results are as follows:
>
> [root@divilab osu_benchmarks]# mpirun_rsh -np 64 -hostfile ./host1 osu_alltoall
> # OSU MPI All-to-All Personalized Exchange Latency Test v3.0
> # Size            Latency (us)
> 1                       651.27
> 2                       650.83
> 4                       647.61
> 8                       669.23
> 16                      658.48
> 32                      652.27
> 64                      663.42
> 128                     698.62
> 256                     795.00
> 512                    2298.14
> 1024                   2894.17
> 2048                   4510.67
> 4096                   8636.66
> 8192                  18353.73
> 16384                 34942.05
> 32768                 43737.38
> 65536                 71297.06
> 131072               138771.72
> 262144               273233.08
> 524288               543174.56
> 1048576             1086598.00
>
>   Are these as expected?
>
>
>   ----- Original Message -----
>   From: Divi Venkateswarlu
>   To: mvapich-discuss at cse.ohio-state.edu
>   Sent: Saturday, June 07, 2008 9:07 AM
>   Subject: [mvapich-discuss] Re: How to check if MVAPICH is using IB network but not Ethernet network?
>
>
>
>          one more to add:
>
>           My IB hardware:        8-port Flextronics SDR switch
>                                  Mellanox MHES18 HCA cards
>
>          ibchecknet shows the following
>
>    [root@divilab bin]# ibchecknet
>
>   # Checking Ca: nodeguid 0x0002c90200244228
>
>   # Checking Ca: nodeguid 0x0002c90200244230
>
>   # Checking Ca: nodeguid 0x0002c902002740fc
>
>   # Checking Ca: nodeguid 0x0002c902002441a4
>
>   # Checking Ca: nodeguid 0x0002c902002441c4
>
>   # Checking Ca: nodeguid 0x0002c9020024422c
>
>   # Checking Ca: nodeguid 0x0002c902002441ac
>
>   # Checking Ca: nodeguid 0x0002c9020024418c
>
>   ## Summary: 9 nodes checked, 0 bad nodes found
>   ##          16 ports checked, 0 bad ports found
>   ##          0 ports have errors beyond threshold
>
>     ----- Original Message -----
>     From: Divi Venkateswarlu
>     To: mvapich-discuss at cse.ohio-state.edu
>     Sent: Saturday, June 07, 2008 9:02 AM
>     Subject: How to check if MVAPICH is using IB network but not Ethernet network?
>
>
>
>            Hello all:
>
>            Good morning!
>            I set up a 64-core cluster based on ROCKS-5.0 using eight Dell PE2900 boxes.
>            All are dual-processor, quad-core machines.
>
>            I compiled MVAPICH-1.0 (using the Intel compiler) with the default parameters in make.mvapich.gen2.
>            The IB stack is OFED-1.2.5.5.
>
>             My MD program (PMEMD/AMBER) compiled with no errors against the IFORT/MKL libraries, and
>            I could run the code on all 64 cores, but the scaling from 16 to 32 to 64 cores is terrible. I am enclosing
>            the benchmarks from a test run.
>
>            # of CPUs/cores   Time (sec)   Nodes (load-balanced)   Scaling (%)
>                  8               82                8                  100
>                 16               49                8                   84
>                 32               42                8                   49
>                 64               39                8                   26
>
>                In contrast, on a single box, I get reasonable scaling.
>
>            # cores     Time (sec)
>               2           284    (100%)
>               4           164     (87%)
>               8           107     (65%)
>
>            For some reason, I suspect MPI traffic is not going over the IB network.
>
>                MVAPICH is built using make.mvapich.gen2 with F77=ifort and CC=gcc
>
>          mpif77 -link_info is:
>
>          /state/partition1/fc91052/bin/ifort -L/usr/local/ofed/lib64 -L/usr/local/mvapich/lib
>          -lmpich -L/usr/local/ofed/lib64 -Wl,-rpath=/usr/local/ofed/lib64 -libverbs
>         -libumad -lpthread -lpthread -lrt
>
>
>          How can I be sure that MPI traffic is going through the IB network rather than over Ethernet?
>          Are there any specific checks I should perform?
>
>          Thanks a lot for your help.
>
>           Divi
>
>
>
> ------------------------------------------------------------------------------
>
>
>   _______________________________________________
>   mvapich-discuss mailing list
>   mvapich-discuss at cse.ohio-state.edu
>   http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>


