[mvapich-discuss] Verify the application is really running

Jonathan L. Perkins perkinjo at cse.ohio-state.edu
Tue Sep 4 11:27:00 EDT 2007


Can you send us the output from

/usr/local/topspin/mpi/mpich/bin/mpicc -v

This will let us know which compiler, and which version of it, you're
using.  We do not see this problem in our environment with a reasonably
recent version of gcc.
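
In case it is easier, the same wrapper should also be able to tell us
which underlying compiler it invokes.  Assuming the Topspin mpicc is the
usual MPICH-derived wrapper script (an assumption on our part), something
along these lines should work:

/usr/local/topspin/mpi/mpich/bin/mpicc -show   # print the underlying compile command without running it
gcc --version                                  # version of the system gcc, if that is what the wrapper calls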

Below I've pasted line 66 with a few lines of context around it.  There
is no for loop here, so I'm a bit confused as to why you're getting the
errors that you posted.


int main(int argc, char *argv[])
{

    int myid, numprocs, i;
    int size;
    MPI_Status reqstat;
    char *s_buf, *r_buf;
    int align_size;


Also, as a sanity check, can you download the osu_latency.c file again from
https://mvapich.cse.ohio-state.edu/svn/mpi/mvapich/trunk/osu_benchmarks/osu_latency.c

to verify that we are referencing the same file?  Errors like "`i'
undeclared here (not in a function)" usually mean the compiler is seeing
statements at file scope, which is what you would expect if the file had
been truncated or garbled in transit.  Thanks for your input; we hope
that with further information we can resolve this compilation issue.
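
If it is easier, you could also fetch a fresh copy alongside the file you
compiled and compare the two directly.  Something along these lines should
do (wget and the file names here are just an example, not a requirement):

wget -O osu_latency_fresh.c \
  https://mvapich.cse.ohio-state.edu/svn/mpi/mvapich/trunk/osu_benchmarks/osu_latency.c
diff osu_latency_fresh.c latency.c   # any output means the two files differ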


wgy at altair.com.cn wrote:
> Hello:
> I am quite sure I used the one you referred to and got the compile errors, as
> you can see from the message.  I just renamed osu_latency.c to latency.c
> while uploading.
> Thanks.
> Henry, Wu.
> 
> | Are you sure you are using the osu_latency.c file from the mvapich web
> | site?  Your e-mail indicates that you are using a `latency.c' file.
> |
> | FYI, the osu_latency.c benchmark (latest version v2.2) is available
> | from the following URL:
> |
> |
> https://mvapich.cse.ohio-state.edu/svn/mpi/mvapich/trunk/osu_benchmarks/osu_latency.c
> |
> | DK
> |
> |>
> |> Hello, Jeff and Dr. Panda:
> |> I am getting back to you with the following results and further questions:
> |> 1) latency test result:
> |> the latency.c I downloaded from the mvapich website fails to compile on the
> |> cluster, with these errors:
> |> [radioss at hpc-node-01 job1]$ /usr/local/topspin/mpi/mpich/bin/mpicc
> |> latency.c -o lat
> |> latency.c:66: error: syntax error before "for"
> |> latency.c:68: error: `i' undeclared here (not in a function)
> |> latency.c:68: warning: data definition has no type or storage class
> |> latency.c:69: error: syntax error before '}' token
> |> latency.c:73: error: `skip_large' undeclared here (not in a function)
> |> latency.c:73: warning: data definition has no type or storage class
> |> latency.c:74: error: syntax error before '}' token
> |> latency.c:76: warning: parameter names (without types) in function
> |> declaration
> |> latency.c:76: warning: data definition has no type or storage class
> |> latency.c:78: error: syntax error before "if"
> |> latency.c:82: error: syntax error before numeric constant
> |> latency.c:82: warning: data definition has no type or storage class
> |> latency.c:83: error: syntax error before numeric constant
> |> latency.c:84: warning: data definition has no type or storage class
> |> latency.c:86: error: initializer element is not constant
> |> latency.c:86: warning: data definition has no type or storage class
> |> latency.c:88: error: syntax error before '}' token
> |> latency.c:92: error: syntax error before numeric constant
> |> latency.c:92: warning: data definition has no type or storage class
> |> latency.c:98: error: `t_start' undeclared here (not in a function)
> |> latency.c:98: error: `loop' undeclared here (not in a function)
> |> latency.c:98: warning: data definition has no type or storage class
> |> latency.c:99: error: syntax error before string constant
> |> latency.c:99: warning: conflicting types for built-in function 'fprintf'
> |> latency.c:99: warning: data definition has no type or storage class
> |> latency.c:104: warning: data definition has no type or storage class
> |> latency.c:105: error: syntax error before "return"
> |> latency.c:107:2: warning: no newline at end of file
> |> latency.c:68: error: storage size of `r_buf' isn't known
> |>
> |> I had to use the mpi_latency.c shipped with mvapich on the cluster and got
> |> the following latency test results.
> |>
> |> [radioss at hpc-node-01 job1]$ /usr/local/topspin/mpi/mpich/bin/mpirun_rsh
> |> -np 2 -hostfile appfile ./lat 10000 1
> |> 1       6.288650
> |> [radioss at hpc-node-01 job1]$ /usr/local/topspin/mpi/mpich/bin/mpirun_rsh
> |> -np 2 -hostfile appfile ./lat 10000 4
> |> 4       6.410350
> |> while Topspin's Host-Side Drivers User Guide for Linux Release 3.1.0 gives
> |> the following latency test figure as an example:
> |> [root at qa-bc1-blade2 root]# /usr/local/topspin/mpi/mpich/bin/mpirun_ssh -np
> |> 2 qa-bc1-blade2 qa-bc1-blade3 /usr/local/topspin/mpi/mpich/bin/mpi_latency 10000 1
> |> 1 6.684000
> |> 2) Jeff Squyres once asked me:
> |> >> I have 4-core nodes here..
> |> >> I would expect to run it as:
> |> >> /usr/local/topspin/mpi/mpich/bin/mpirun_ssh -np 2 -hostfile hosts
> |>
> |> >^^ Is that the right path?  Or is it "mvapich"?  Regardless, I think
> |> wherever you find mpirun_ssh under /usr/local/topspin/mpi is probably
> |> the right one.
> |> the path is right, and I am pretty sure it is mvapich because:
> |> i)rpm -qf /usr/local/topspin/mpi/mpich/bin/mpirun_ssh gives:
> |> topspin-ib-mpi-rhel4-3.2.0-118
> |> ii)[radioss at hpc-node-01 local]$
> |> /usr/local/topspin/mpi/mpich/bin/mpirun_rsh -v
> |> OSU MVAPICH VERSION 0.9.5-SingleRail
> |>
> |> 3) when I try to use HP MPI 2.2.5 over the IB network I got the
> |> following:
> |> [radioss at hpc-node-01 job1]$ /opt/hpmpi/bin/mpirun -stdio=i0
> |> -cpu_bind=cyclic -VAPI  -f appfile < PFTANKD01
> |> dlopen test for MPI_ICLIB_VAPI__VAPI_MAIN could not open libs in list
> |> libmtl_common.so   libmpga.so      libmosal.so     libvapi.so:
> |> /usr/local/topspin/lib64/libmosal.so: undefined symbol: pthread_create
> |> dlopen test for MPI_ICLIB_VAPI__VAPI_CISCO could not open libs in list
> |> libpthread.so     libmosal.so     libvapi.so: /usr/lib64/libpthread.so:
> |> invalid ELF header
> |> mpid: MPI BUG: VAPI requested but not available
> |> what does this probably indicate? is anything wrong with the IB
> |> configuration?
> |>
> |> RPM packages installed there:
> |> [radioss at hpc-node-01 job1]$ rpm -qa|grep topspin
> |> topspin-ib-rhel4-3.2.0-118
> |> topspin-ib-mpi-rhel4-3.2.0-118
> |> topspin-ib-mod-rhel4-2.6.9-42.ELsmp-3.2.0-118
> |>
> |> 4) You suggested that I use HP MPI (not native mvapich) and the OFED IB
> |> stack if possible.
> |> Now I have some questions; I hope you can give a quick comment or refer me to
> |> some website link that I can read through:
> |> i) how do I verify which IB stack is used here, the OFED or Cisco/Topspin IB
> |> stack? what are the advantages of the OFED IB stack over the Cisco/Topspin IB
> |> stack?
> |> ii) what are the advantages of HP MPI over "native mvapich"? what is meant by
> |> "native mvapich"? the one shipped with Cisco/Topspin? is it enough to
> |> upgrade mvapich to the latest one available on the mvapich website?
> |>
> |> Thanks a lot to all of you for your kind help!
> |>
> |> Henry, Wu.
> |>
> |>
> |>
> |>
> |> | On Aug 29, 2007, at 12:54 PM, wgy at altair.com.cn wrote:
> |> |
> |> |> Yes, I think I used the mvapich shipped with Topspin, but I am not sure
> |> |> unless I know how to verify it.
> |> |
> |> | If it's in the /usr/local/topspin directory, it's the Topspin (later
> |> | Cisco) MVAPICH.
> |> |
> |> |> about the latency test, I downloaded
> |> |> https://mvapich.cse.ohio-state.edu/svn/mpi/mvapich/trunk/osu_benchmarks/osu_latency.c
> |> |> and will compile it to run a benchmark. Can you please tell me how I should
> |> |> run it? how many nodes should be used and how many CPUs should be involved?
> |> |
> |> | You typically run it with 2 MPI processes; one on each host.  It
> |> | measures the MPI network latency between those two hosts.
> |> |
> |> |> I have 4-core nodes here..
> |> |> I would expect to run it as:
> |> |> /usr/local/topspin/mpi/mpich/bin/mpirun_ssh -np 2 -hostfile hosts
> |> |
> |> | ^^ Is that the right path?  Or is it "mvapich"?  Regardless, I think
> |> | wherever you find mpirun_ssh under /usr/local/topspin/mpi is probably
> |> | the right one.
> |> |
> |> |> osu_latency.o
> |> |
> |> | Is your executable really named osu_latency.o?  That's uncommon.
> |> | Regardless, run the executable that you got when you compiled
> |> | osu_latency.c with mpicc.
> |> |
> |> |> and include the following in the hosts file
> |> |> hpc-node-01
> |> |> hpc-node-02
> |> |
> |> | Sounds right.  I'm not an MVAPICH expert, though -- so I defer to the
> |> | maintainers here on this list for the finer details...
> |> |
> |> |> Is it right?
> |> |> Thanks a lot, I am really a newbie with InfiniBand....
> |> |
> |> | If this is your own system, I do want to stress that OFED is really
> |> | the way to go with HPC InfiniBand installations these days.  The
> |> | MPI's that are included are much more recent, and all new development
> |> | work is happening in the OFED arena.
> |> |
> |> | I recommend that you upgrade if you can.
> |> |
> |> |
> |> |> Henry, Wu
> |> |>
> |> |>
> |> |> | On Aug 29, 2007, at 12:25 PM, wgy at altair.com.cn wrote:
> |> |> |
> |> |> |> Hello, Jeff:
> |> |> |> The mvapich version is OSU mvapich 0.9.5.
> |> |> |> does it mean that it is the Cisco IB stack and therefore the application
> |> |> |> I run with mvapich is really running over the IB network?
> |> |> |
> |> |> | The version of MVAPICH, by itself, does not mean that it is or is not
> |> |> | running over IB.
> |> |> |
> |> |> | What *implies* that you are running over IB is:
> |> |> |
> |> |> | - You implied that you are using the MVAPICH shipped with the Topspin
> |> |> |   IB stack (which is not OFED).  Is that correct?
> |> |> | - I *believe* that the Topspin MVAPICH did not have TCP support
> |> |> | compiled into it (Topspin was before my time, but I am pretty sure
> |> |> | that the Cisco MVAPICH shipped with the Cisco IB stack does not)
> |> |> |
> |> |> | What would *prove* that you are using IB (vs. gige) is:
> |> |> |
> |> |> | - Run a simple latency test, as Dr. Panda suggested.  Your latency
> |> |> | should be single-digit microseconds (exact numbers depend on your
> |> |> | hardware -- this might be all older stuff since you mentioned
> |> |> | "Topspin", not "Cisco"; Topspin was acquired by Cisco quite a while
> |> |> | ago...).  If your latency is much higher than that (e.g., 50 us),
> |> |> | you're using gige.
> |> |> |
> |> |> |
> |> |> |
> |> |> |> Thanks.
> |> |> |>
> |> |> |> Henry, Wu.
> |> |> |> | In addition to what Dr. Panda said, Cisco recommends that all HPC
> |> |> |> | customers upgrade to the OFED IB driver stack if possible (some
> |> |> |> | customers cannot upgrade for various reasons).  FWIW: all new HPC/MPI
> |> |> |> | work is occurring in the OFED arena.
> |> |> |> |
> |> |> |> | I bring this up because you specifically mention Topspin InfiniBand,
> |> |> |> | which I'm *assuming* is the Cisco IB stack (not the OFED IB stack),
> |> |> |> | and is therefore shipping with a somewhat older version of MVAPICH
> |> |> |> | that was derived from the OSU MVAPICH.  The Cisco MVAPICH should only
> |> |> |> | be compiled with IB support enabled; a simple latency test should
> |> |> |> | prove that you're running over IB and not ethernet.
> |> |> |> |
> |> |> |> | Much more recent versions of MPI implementations are included with
> |> |> |> | the OFED stack (Cisco provides binary distributions of OFED on
> |> |> |> | www.cisco.com).
> |> |> |> |
> |> |> |> |
> |> |> |> | On Aug 29, 2007, at 11:44 AM, Dhabaleswar Panda wrote:
> |> |> |> |
> |> |> |> |>
> |> |> |> |>
> |> |> |> |> On Wed, 29 Aug 2007 wgy at altair.com.cn wrote:
> |> |> |> |>
> |> |> |> |>> Hello, list:
> |> |> |> |>> It might be a silly question, but I wonder how to verify that a run
> |> |> |> |>> with mvapich (which comes with Topspin InfiniBand) is over InfiniBand,
> |> |> |> |>> NOT the Gigabit network.
> |> |> |> |>> Is there an option to force mvapich to use the IB network and
> |> |> |> |>> otherwise just exit?
> |> |> |> |>
> |> |> |> |> MVAPICH has several underlying interfaces: Gen2, uDAPL, VAPI, TCP/IP
> |> |> |> |> and shared memory. Please take a look at the user guide (available from
> |> |> |> |> the mvapich project page) to see the differences and capabilities of these
> |> |> |> |> interfaces. The Gen2 interface (corresponding to OFED) will give you
> |> |> |> |> the best performance and scalability. If you have the OFED stack installed,
> |> |> |> |> you should be able to configure mvapich to run over the Gen2 interface
> |> |> |> |> (as per the instructions indicated in the user guide). During OFED
> |> |> |> |> installation, you can also select mvapich from the package.
> |> |> |> |>
> |> |> |> |> On your existing installation, you can also run the OSU benchmarks (such
> |> |> |> |> as OSU latency). If you get a latency number in the range of 2~4 microsec
> |> |> |> |> for short messages (say 4 bytes), it is already running over the native IB.
> |> |> |> |>
> |> |> |> |> Hope this helps.
> |> |> |> |>
> |> |> |> |> DK
> |> |> |> |>
> |> |> |> |>> Thanks for your suggestion.
> |> |> |> |>> Rdgs.
> |> |> |> |>> Henry, Wu
> |> |> |> |>>
> |> |> |> |
> |> |> |> |
> |> |> |> | --
> |> |> |> | Jeff Squyres
> |> |> |> | Cisco Systems
> |> |> |> |
> |> |> |> |
> |> |> |
> |> |> |
> |> |> | --
> |> |> | Jeff Squyres
> |> |> | Cisco Systems
> |> |> |
> |> |> |
> |> |
> |> |
> |> | --
> |> | Jeff Squyres
> |> | Cisco Systems
> |> |
> |> |
> |>
> |>
> |>
> |
> |
> 
> 


-- 
Jonathan Perkins
http://www.cse.ohio-state.edu/~perkinjo

