[mvapich-discuss] viacheck.c error?

Aquarijen aquarijen at gmail.com
Wed Jan 24 09:01:23 EST 2007


Sayantan,

You rock.  I'll try your suggestions and get back to you.  I'm still
learning about IB and I appreciate your patience with such a newbie.
:)
Jen

On 1/23/07, Sayantan Sur <surs at cse.ohio-state.edu> wrote:
> Hello Jen,
>
> The OSU benchmarks should ideally be run with 2 processes. Can you try
> osu_latency, osu_bw, and osu_bibw with just 2 processes? I have a feeling
> that the cluster isn't set up quite right, otherwise these simple
> benchmarks wouldn't fail. Are other users able to run IMB on the cluster?
>
> cpilog might fail to build because MPE might not have been
> compiled in when MVAPICH was built. To enable MPE, use --with-mpe as a
> configure parameter in make.mvapich.gen2 (assuming you have downloaded
> MVAPICH-0.9.8 from our website).
>
> Similarly, for simpleio, the MPI-IO component (ROMIO) needs to be
> compiled in when building MVAPICH. Use --with-romio as a configure
> parameter.
>
> Thanks,
> Sayantan.
>
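
A minimal sketch (Python, not part of MVAPICH) of how the two suggestions
above map onto the linker output quoted further down: it scans ld's
"undefined reference" messages and groups the missing symbols by the
configure flag Sayantan names. The helper name missing_flags and the
prefix table are hypothetical and only for illustration; matching on the
MPE_ and MPI_File_ prefixes is a heuristic, not an official check.

```python
import re

# Symbol prefix -> configure flag, taken from the advice in this thread.
# Assumption: matching on prefixes is an illustrative heuristic only.
HINTS = {
    "MPE_": "--with-mpe",
    "MPI_File_": "--with-romio",
}

# ld quotes symbols as `name'; \s+ also tolerates a mail-wrapped line break.
UNDEF_RE = re.compile(r"undefined references? to\s+`(\w+)'")

def missing_flags(ld_output):
    """Map each suggested configure flag to the sorted symbols that need it."""
    found = {}
    for symbol in UNDEF_RE.findall(ld_output):
        for prefix, flag in HINTS.items():
            if symbol.startswith(prefix):
                found.setdefault(flag, set()).add(symbol)
    return {flag: sorted(syms) for flag, syms in found.items()}

sample = """\
cpilog.c: undefined reference to `MPE_Init_log'
cpilog.o(.text+0x1a2):cpilog.c: undefined reference to `MPE_Start_log'
simpleio.c: undefined reference to `MPI_File_open'
simpleio.o(.text+0x26e):simpleio.c: undefined reference to
`MPI_File_write'
"""

for flag, symbols in missing_flags(sample).items():
    print(flag, "->", ", ".join(symbols))
```

Piping the real mpicc output into missing_flags would show whether both
flags are needed for a rebuild or only one of them.
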
> Aquarijen wrote:
> > Hi Sayantan,
> >
> > Thank you for your help. :)
> >
> > A few things about my environment.  The compute nodes are 64 bit, so I
> > pointed the mvapich compilation to /usr/ofed/lib64 - I have no 32 bit
> > libs for ofed.  The compute nodes have 2 processors each.  When I have
> > tried jobs, I have submitted them through pbs (torque) and specified
> > that I want 1 processor per node - maui enforces this.  I have tried
> > all my runs with 40 nodes, one processor per node.
> >
> > cpi, cpip and hello++ run without problems.
> >
> > osu_bw fails with the error:
> > Connection closed by 172.16.4.36
> > [0] Abort: [b09n040.oic.ornl.gov:0] Got completion with error, code=1
> > at line 2355 in file viacheck.c
> > done.
> >
> > osu_bcast fails with error:
> > [39] Abort: [b09n001.oic.ornl.gov:39] Got completion with error, code=1
> > at line 2355 in file viacheck.c
> > done.
> >
> > osu_bibw fails with:
> > [0] Abort: [b09n040.oic.ornl.gov:0] Got completion with error, code=1
> > at line 2355 in file viacheck.c
> > [39] Abort: [32] Abort: [24] Abort: [38] Abort:
> > [b09n001.oic.ornl.gov:39] Got completion with error, code=12, dest rank=0
> > at line 397 in file viacheck.c
> > [b09n008.oic.ornl.gov:32] Got completion with error, code=12, dest rank=0
> > at line 397 in file viacheck.c
> > [b09n016.oic.ornl.gov:24] Got completion with error, code=12, dest rank=0
> > at line 397 in file viacheck.c
> > [b09n002.oic.ornl.gov:38] Got completion with error, code=12, dest rank=0
> > at line 397 in file viacheck.c
> > [36] Abort: [b09n004.oic.ornl.gov:36] Got completion with error, code=12, dest rank=0
> > at line 397 in file viacheck.c
> > done.
> >
> > osu_latency fails with:
> > [0] Abort: [b09n040.oic.ornl.gov:0] Got completion with error, code=1
> > at line 2355 in file viacheck.c
> > done.
> >
> > I can't compile cpilog.c, I get:
> > [2vt at b09l02 osu_benchmarks-mvapich]$ which mpicc
> > /opt/mvapich-gcc-0.9.8/bin/mpicc
> > [2vt at b09l02 osu_benchmarks-mvapich]$ mpicc cpilog.c -o cpilog
> > cpilog.o(.text+0xd2): In function `main':
> > cpilog.c: undefined reference to `MPE_Init_log'
> > cpilog.o(.text+0xd7):cpilog.c: undefined reference to `MPE_Log_get_event_number'
> > cpilog.o(.text+0xdf):cpilog.c: undefined reference to `MPE_Log_get_event_number'
> > cpilog.o(.text+0xe7):cpilog.c: undefined reference to `MPE_Log_get_event_number'
> > cpilog.o(.text+0xef):cpilog.c: undefined reference to `MPE_Log_get_event_number'
> > cpilog.o(.text+0xf7):cpilog.c: undefined reference to `MPE_Log_get_event_number'
> > cpilog.o(.text+0xff):cpilog.c: more undefined references to `MPE_Log_get_event_number' follow
> > cpilog.o(.text+0x12e): In function `main':
> > cpilog.c: undefined reference to `MPE_Describe_state'
> > cpilog.o(.text+0x143):cpilog.c: undefined reference to `MPE_Describe_state'
> > cpilog.o(.text+0x158):cpilog.c: undefined reference to `MPE_Describe_state'
> > cpilog.o(.text+0x16d):cpilog.c: undefined reference to `MPE_Describe_state'
> > cpilog.o(.text+0x1a2):cpilog.c: undefined reference to `MPE_Start_log'
> > cpilog.o(.text+0x1c0):cpilog.c: undefined reference to `MPE_Log_event'
> > cpilog.o(.text+0x1f0):cpilog.c: undefined reference to `MPE_Log_event'
> > cpilog.o(.text+0x202):cpilog.c: undefined reference to `MPE_Log_event'
> > cpilog.o(.text+0x21e):cpilog.c: undefined reference to `MPE_Log_event'
> > cpilog.o(.text+0x230):cpilog.c: undefined reference to `MPE_Log_event'
> > cpilog.o(.text+0x2da):cpilog.c: more undefined references to `MPE_Log_event' follow
> > cpilog.o(.text+0x342): In function `main':
> > cpilog.c: undefined reference to `MPE_Finish_log'
> > collect2: ld returned 1 exit status
> >
> > I also can't compile simpleio.c.  I get:
> > [2vt at b09l02 osu_benchmarks-mvapich]$ mpicc simpleio.c
> > simpleio.o(.text+0x252): In function `main':
> > simpleio.c: undefined reference to `MPI_File_open'
> > simpleio.o(.text+0x26e):simpleio.c: undefined reference to `MPI_File_write'
> > simpleio.o(.text+0x277):simpleio.c: undefined reference to `MPI_File_close'
> > simpleio.o(.text+0x2c0):simpleio.c: undefined reference to `MPI_File_open'
> > simpleio.o(.text+0x2dc):simpleio.c: undefined reference to `MPI_File_read'
> > simpleio.o(.text+0x2e5):simpleio.c: undefined reference to `MPI_File_close'
> > collect2: ld returned 1 exit status
> >
> > Intel MPI benchmarks (IMB-MPI1) fail with:
> >
> > [0] Abort: [b09n040.oic.ornl.gov:0] Got completion with error, code=4
> > at line 2355 in file viacheck.c
> > done.
> >
> >
> > I'd be happy to provide any other logs or info you think would
> > help!  Sorry it took me so long to get back to you; I had a few
> > fires to put out.  Now this is my #1 priority.
> >
> > Thanks for all your help!!!!
> > Jen
> >
> >
> >
> > On 1/19/07, Sayantan Sur <surs at cse.ohio-state.edu> wrote:
> >> Hello Jen,
> >>
> >> > I am new.  And a little frustrated. :)
> >>
> >> Thanks for your post ... Hope your problems are short-lived :-)
> >>
> >> > I have compiled/installed mvapich 0.9.8 using ofed/gen2.
> >> >
> >> > I can run cpi on all my nodes just fine.  The problem comes in when I
> >> > try to use any of the osu benchmark programs.  They seem to compile
> >> > just fine, but when I try to run osu_bcast, I get the following error:
> >> >
> >> > [39] Abort: [b09n001.oic.ornl.gov:39] Got completion with error, code=1
> >> > at line 2355 in file viacheck.c
> >> > done.
> >> >
> >> > Where is this viacheck.c?  Has anyone seen this before?  I'd be happy
> >> > to provide more details if you tell me what to provide.
> >>
> >> viacheck.c is an internal file in the MVAPICH implementation. I have a
> >> couple of questions which you could answer ...
> >>
> >> 1) On how many nodes was this run attempted? I have run osu_bcast on
> >> 64 nodes/128 processes and it seems to be OK.
> >>
> >> 2) Can you run IMB (Intel MPI benchmarks)? They will also call
> >> MPI_Bcast.
> >>
> >> 3) I'm wondering if you could run the other OSU benchmarks, such as
> >> latency, bandwidth, bi-directional bandwidth?
> >>
> >> Thanks,
> >> Sayantan.
> >>
> >> --
> >> http://www.cse.ohio-state.edu/~surs
> >>
> >
> >
>
> --
> http://www.cse.ohio-state.edu/~surs
>
>
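
For sorting through the abort messages quoted above, here is a rough
sketch (Python, not an MVAPICH tool) that tallies the "Got completion
with error, code=N" lines per code. It rests on one loud assumption: that
the printed code is the libibverbs ibv_wc_status value of the failed
completion. If that holds, the three codes seen in this thread would be
the ones in the table below; if it does not, the names are meaningless
and only the counts are useful. The function name summarize is made up
for this example.

```python
import re
from collections import Counter

# ASSUMPTION: the number after "code=" is an ibv_wc_status value from
# libibverbs. Only the three codes that appear in this thread are listed.
WC_STATUS = {
    1: "IBV_WC_LOC_LEN_ERR",
    4: "IBV_WC_LOC_PROT_ERR",
    12: "IBV_WC_RETRY_EXC_ERR",
}

# Matches e.g. "[b09n001.oic.ornl.gov:39] Got completion with error, code=12";
# \s* also tolerates a mail-wrapped line break before "code=".
ABORT_RE = re.compile(
    r"\[([\w.-]+):(\d+)\] Got completion with error,\s*code=(\d+)"
)

def summarize(log_text):
    """Count abort lines, keyed by (code, assumed status name)."""
    counts = Counter()
    for _host, _rank, code in ABORT_RE.findall(log_text):
        code = int(code)
        counts[(code, WC_STATUS.get(code, "unknown"))] += 1
    return counts

sample = """\
[0] Abort: [b09n040.oic.ornl.gov:0] Got completion with error, code=1
at line 2355 in file viacheck.c
[b09n008.oic.ornl.gov:32] Got completion with error, code=12, dest rank=0
at line 397 in file viacheck.c
[36] Abort: [b09n004.oic.ornl.gov:36] Got completion with error,
code=12, dest rank=0
"""

for (code, name), n in sorted(summarize(sample).items()):
    print(f"code={code:<3} {name:<22} x{n}")
```

Separating the code=1 abort on rank 0 from the code=12 aborts on the
other ranks may help show which failure happened first once the
2-process runs are done.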


-- 
The only people for me are the mad ones, the ones who are mad to live,
mad to talk, mad to be saved, desirous of everything at the same time,
the ones who never yawn or say a commonplace thing, but burn, burn,
burn, like fabulous yellow roman candles exploding like spiders across
the stars and in the middle you see the blue centerlight pop and
everybody goes "Awww!"-----Jack Kerouac

