[mvapich-discuss] Verify the application is really running

Dhabaleswar Panda panda at cse.ohio-state.edu
Tue Sep 4 02:10:13 EDT 2007


Are you sure you are using the osu_latency.c file from the mvapich web
site?  Your e-mail indicates that you are using a `latency.c' file.

FYI, the osu_latency.c benchmark (latest version v2.2) is available
from the following URL:

https://mvapich.cse.ohio-state.edu/svn/mpi/mvapich/trunk/osu_benchmarks/osu_latency.c
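
For reference, a compile-and-run sequence along the following lines should
work (this is only a sketch reusing the Topspin mpicc/mpirun_rsh paths and
the `appfile' host list from your mail; adjust them to your installation):

  $ /usr/local/topspin/mpi/mpich/bin/mpicc osu_latency.c -o osu_latency
  $ /usr/local/topspin/mpi/mpich/bin/mpirun_rsh -np 2 -hostfile appfile ./osu_latency

osu_latency takes no arguments and prints the latency (in microseconds) for
a range of message sizes. As noted elsewhere in this thread, a small-message
latency of a few microseconds means you are on native IB, while something
like 50 us suggests the traffic is going over GigE.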

DK

> 
> Hello, Jeff and Dr. Panda:
> I am getting back to you with the following results and further questions:
> 1) Latency test result:
> The latency.c I downloaded from the mvapich website fails to compile on the
> cluster; I get the following errors:
> [radioss at hpc-node-01 job1]$ /usr/local/topspin/mpi/mpich/bin/mpicc
> latency.c -o lat
> latency.c:66: error: syntax error before "for"
> latency.c:68: error: `i' undeclared here (not in a function)
> latency.c:68: warning: data definition has no type or storage class
> latency.c:69: error: syntax error before '}' token
> latency.c:73: error: `skip_large' undeclared here (not in a function)
> latency.c:73: warning: data definition has no type or storage class
> latency.c:74: error: syntax error before '}' token
> latency.c:76: warning: parameter names (without types) in function
> declaration
> latency.c:76: warning: data definition has no type or storage class
> latency.c:78: error: syntax error before "if"
> latency.c:82: error: syntax error before numeric constant
> latency.c:82: warning: data definition has no type or storage class
> latency.c:83: error: syntax error before numeric constant
> latency.c:84: warning: data definition has no type or storage class
> latency.c:86: error: initializer element is not constant
> latency.c:86: warning: data definition has no type or storage class
> latency.c:88: error: syntax error before '}' token
> latency.c:92: error: syntax error before numeric constant
> latency.c:92: warning: data definition has no type or storage class
> latency.c:98: error: `t_start' undeclared here (not in a function)
> latency.c:98: error: `loop' undeclared here (not in a function)
> latency.c:98: warning: data definition has no type or storage class
> latency.c:99: error: syntax error before string constant
> latency.c:99: warning: conflicting types for built-in function 'fprintf'
> latency.c:99: warning: data definition has no type or storage class
> latency.c:104: warning: data definition has no type or storage class
> latency.c:105: error: syntax error before "return"
> latency.c:107:2: warning: no newline at end of file
> latency.c:68: error: storage size of `r_buf' isn't known
> 
> I had to use the mpi_latency.c shipped with mvapich on the cluster and got
> the following latency test results.
> 
> [radioss at hpc-node-01 job1]$ /usr/local/topspin/mpi/mpich/bin/mpirun_rsh
> -np 2 -hostfile appfile ./lat 10000 1
> 1       6.288650
> [radioss at hpc-node-01 job1]$ /usr/local/topspin/mpi/mpich/bin/mpirun_rsh
> -np 2 -hostfile appfile ./lat 10000 4
> 4       6.410350
> For comparison, Topspin's Host-Side Drivers User Guide for Linux Release
> 3.1.0 gives the following latency test figure as an example:
> [root at qa-bc1-blade2 root]# /usr/local/topspin/mpi/mpich/bin/mpirun_ssh -np
> 2 qa-bc1-blade2 qa-bc1-blade3 /usr/local/topspin/mpi/mpich/bin/mpi_latency 10000 1
> 1 6.684000
> 2) Jeff Squyres once asked me:
> >> I have 4-cores nodes here..
> >> I would expect to run it as:
> >> /usr/local/topspin/mpi/mpich/bin/mpirun_ssh -np 2 -hostfile hosts
> 
> > ^^ Is that the right path?  Or is it "mvapich"?  Regardless, I think
> > wherever you find mpirun_ssh under /usr/local/topspin/mpi is probably
> > the right one.
> The path is right, and I am pretty sure it is mvapich because:
> i) rpm -qf /usr/local/topspin/mpi/mpich/bin/mpirun_ssh gives:
> topspin-ib-mpi-rhel4-3.2.0-118
> ii) [radioss at hpc-node-01 local]$
> /usr/local/topspin/mpi/mpich/bin/mpirun_rsh -v
> OSU MVAPICH VERSION 0.9.5-SingleRail
> 
> 3) When I try to use HP MPI 2.2.5 over the IB network, I get the following:
> [radioss at hpc-node-01 job1]$ /opt/hpmpi/bin/mpirun -stdio=i0
> -cpu_bind=cyclic -VAPI  -f appfile < PFTANKD01
> dlopen test for MPI_ICLIB_VAPI__VAPI_MAIN could not open libs in list
> libmtl_common.so   libmpga.so      libmosal.so     libvapi.so:
> /usr/local/topspin/lib64/libmosal.so: undefined symbol: pthread_create
> dlopen test for MPI_ICLIB_VAPI__VAPI_CISCO could not open libs in list
> libpthread.so     libmosal.so     libvapi.so: /usr/lib64/libpthread.so:
> invalid ELF header
> mpid: MPI BUG: VAPI requested but not available
> What does this probably indicate? Is anything wrong with the IB configuration?
> 
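> (A diagnostic sketch, not something from the HP MPI documentation: assuming
> the library paths printed in the error above, you could check what those
> files actually are, e.g.
>   file /usr/lib64/libpthread.so
>   ldd /usr/local/topspin/lib64/libmosal.so
> On many x86_64 Linux installs /usr/lib64/libpthread.so is a GNU ld linker
> script rather than an ELF shared object, which by itself would explain the
> "invalid ELF header" message from dlopen.)
> 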
> RPM packages installed there:
> [radioss at hpc-node-01 job1]$ rpm -qa|grep topspin
> topspin-ib-rhel4-3.2.0-118
> topspin-ib-mpi-rhel4-3.2.0-118
> topspin-ib-mod-rhel4-2.6.9-42.ELsmp-3.2.0-118
> 
> 4) You suggested that I use HP MPI (not native mvapich) and the OFED IB
> stack if possible.
> Now I have some questions; I hope you can give a quick comment or point me
> to some website links that I can read through:
> i) How can I verify which IB stack is being used here, OFED or the
> Cisco/Topspin IB stack? What are the advantages of the OFED IB stack over
> the Cisco/Topspin IB stack?
> ii) What are the advantages of HP MPI over "native mvapich"? What is meant
> by "native mvapich"? The one shipped with Cisco/Topspin? Is it enough to
> upgrade mvapich to the latest one available on the mvapich website?
> 
> Thanks a lot to all of you for your kind help!
> 
> Henry, Wu.
> 
> 
> 
> 
> | On Aug 29, 2007, at 12:54 PM, wgy at altair.com.cn wrote:
> |
> |> Yes, I think I used the mvapich shipped with Topspin, but I am not sure
> |> unless I know how to verify it.
> |
> | If it's in the /usr/local/topspin directory, it's the Topspin (later
> | Cisco) MVAPICH.
> |
> |> About the latency test, I downloaded
> |> https://mvapich.cse.ohio-state.edu/svn/mpi/mvapich/trunk/osu_benchmarks/osu_latency.c
> |> and will compile it to run a benchmark. Can you please tell me how I
> |> should run it? How many nodes should be used and how many CPUs should be
> |> involved?
> |
> | You typically run it with 2 MPI processes; one on each host.  It
> | measures the MPI network latency between those two hosts.
> |
> |> I have 4-cores nodes here..
> |> I would expect to run it as:
> |> /usr/local/topspin/mpi/mpich/bin/mpirun_ssh -np 2 -hostfile hosts
> |
> | ^^ Is that the right path?  Or is it "mvapich"?  Regardless, I think
> | wherever you find mpirun_ssh under /usr/local/topspin/mpi is probably
> | the right one.
> |
> |> osu_latency.o
> |
> | Is your executable really named osu_latency.o?  That's uncommon.
> | Regardless, run the executable that you got when you compiled
> | osu_latency.c with mpicc.
> |
> |> and include the following in the hosts file
> |> hpc-node-01
> |> hpc-node-02
> |
> | Sounds right.  I'm not an MVAPICH expert, though -- so I defer to the
> | maintainers here on this list for the finer details...
> |
> |> Is it right?
> |> Thanks a lot, I am really a newbie with InfiniBand...
> |
> | If this is your own system, I do want to stress that OFED is really
> | the way to go with HPC InfiniBand installations these days.  The
> | MPI's that are included are much more recent, and all new development
> | work is happening in the OFED arena.
> |
> | I recommend that you upgrade if you can.
> |
> |
> |> Henry, Wu
> |>
> |>
> |> | On Aug 29, 2007, at 12:25 PM, wgy at altair.com.cn wrote:
> |> |
> |> |> Hello, Jeff:
> |> |> The mvapich version is OSU MVAPICH 0.9.5.
> |> |> Does it mean that it is the Cisco IB stack and therefore the application
> |> |> I run with mvapich is really running over the IB network?
> |> |
> |> | The version of MVAPICH, by itself, does not mean that it is or is not
> |> | running over IB.
> |> |
> |> | What *implies* that you are running over IB is:
> |> |
> |> | - You implied that you are using the MVAPICH shipped with the Topspin
> |> | IB stack (which is not OFED).  Is that correct?
> |> | - I *believe* that the Topspin MVAPICH did not have TCP support
> |> | compiled into it (Topspin was before my time, but I am pretty sure
> |> | that the Cisco MVAPICH shipped with the Cisco IB stack does not)
> |> |
> |> | What would *prove* that you are using IB (vs. gige) is:
> |> |
> |> | - Run a simple latency test, as Dr. Panda suggested.  Your latency
> |> | should be single-digit microseconds (exact numbers depend on your
> |> | hardware -- this might be all older stuff since you mentioned
> |> | "Topspin", not "Cisco"; Topspin was acquired by Cisco quite a while
> |> | ago...).  If your latency is much higher than that (e.g., 50 us),
> |> | you're using gige.
> |> |
> |> |
> |> |
> |> |> Thanks.
> |> |>
> |> |> Henry, Wu.
> |> |> | In addition to what Dr. Panda said, Cisco recommends that all HPC
> |> |> | customers upgrade to the OFED IB driver stack if possible (some
> |> |> | customers cannot upgrade for various reasons).  FWIW: all new HPC/MPI
> |> |> | work is occurring in the OFED arena.
> |> |> |
> |> |> | I bring this up because you specifically mention Topspin Infiniband,
> |> |> | which I'm *assuming* is the Cisco IB stack (not the OFED IB stack),
> |> |> | and is therefore shipping with a somewhat older version of MVAPICH
> |> |> | that was derived from the OSU MVAPICH.  The Cisco MVAPICH should only
> |> |> | be compiled with IB support enabled; a simple latency test should
> |> |> | prove that you're running over IB and not ethernet.
> |> |> |
> |> |> | Much more recent versions of MPI implementations are included with
> |> |> | the OFED stack (Cisco provides binary distributions of OFED on
> |> |> | www.cisco.com).
> |> |> |
> |> |> |
> |> |> | On Aug 29, 2007, at 11:44 AM, Dhabaleswar Panda wrote:
> |> |> |
> |> |> |>
> |> |> |>
> |> |> |> On Wed, 29 Aug 2007 wgy at altair.com.cn wrote:
> |> |> |>
> |> |> |>> Hello, list:
> |> |> |>> It might be a silly question, but I wonder how to verify that a run
> |> |> |>> with mvapich (which comes with Topspin InfiniBand) goes over
> |> |> |>> InfiniBand, NOT the Gigabit network.
> |> |> |>> Is there an option to force mvapich to use the IB network and
> |> |> |>> otherwise just exit?
> |> |> |>
> |> |> |> MVAPICH has several underlying interfaces: Gen2, uDAPL, VAPI, TCP/IP
> |> |> |> and shared memory. Please take a look at the user guide (available
> |> |> |> from mvapich project page) to see the differences and capabilities
> |> |> |> of these interfaces. Gen2 interface (corresponding to OFED) will give
> |> |> |> you the best performance and scalability. If you have OFED stack
> |> |> |> installed, you should be able to configure mvapich to run over Gen2
> |> |> |> interface (as per the instructions indicated in the user guide).
> |> |> |> During OFED installation, you can also select mvapich from the package.
> |> |> |>
> |> |> |> On your existing installation, you can also run OSU benchmarks (such
> |> |> |> as OSU latency). If you get a latency number in the range of 2~4
> |> |> |> microsec for short messages (say 4 bytes), it is already running over
> |> |> |> the native IB.
> |> |> |>
> |> |> |> Hope this helps.
> |> |> |>
> |> |> |> DK
> |> |> |>
> |> |> |>> Thanks for your suggestion.
> |> |> |>> Rdgs.
> |> |> |>> Henry, Wu
> |> |> |>>
> |> |> |>> _______________________________________________
> |> |> |>> mvapich-discuss mailing list
> |> |> |>> mvapich-discuss at cse.ohio-state.edu
> |> |> |>> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
> |> |> |>>
> |> |> |>
> |> |> |> _______________________________________________
> |> |> |> mvapich-discuss mailing list
> |> |> |> mvapich-discuss at cse.ohio-state.edu
> |> |> |> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
> |> |> |
> |> |> |
> |> |> | --
> |> |> | Jeff Squyres
> |> |> | Cisco Systems
> |> |> |
> |> |> |
> |> |
> |> |
> |> | --
> |> | Jeff Squyres
> |> | Cisco Systems
> |> |
> |> |
> |
> |
> | --
> | Jeff Squyres
> | Cisco Systems
> |
> |
> 
> 
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
> 


