[mvapich-discuss] viacheck.c error?

Abhinav Vishnu vishnu at cse.ohio-state.edu
Thu Feb 22 16:17:55 EST 2007


Hi Aquarijen,

Aquarijen wrote:
> Hi Sayantan and everyone,
>
> I had been pulled onto other projects for a while - sorry it has been
> so long for an update!  But now I'm back on this as my first priority
> to get working...  I've tried a few things.
> osu_latency, osu_bw and osu_bibw still fail with 2 processors - it is
> the same problem. :(  No user can run IMB - we get the same viacheck.c
> error for it as well.
>

Sorry to hear that you are facing these problems. Could you please
provide us with any further information you have about them?

Please find my responses to the problems you have reported inline.
> Your suggestions for cpilog and simpleio worked fine and these run
> without problems now.  Just none of the benchmarks...
>
> So I thought I would try out the new mvapich 0.9.9 beta and see how it
> went.  I am having trouble compiling it, and I think it may be a
> related problem.
> I have tried with icc and gcc.  We have gcc (GCC) 4.0.2 20051125 (Red
> Hat 4.0.2-8) and icc (ICC) 9.1 20061101.
>
> There are warnings in viainit.c and viarecv.c, but there is an error
> in viacheck.c.  Here is the error in icc:
> -------------------------------------------------------------------------------- 
>
> icc -DHAVE_CONFIG_H -I.
> -I/root/mvapich/mvapich-0.9.9-beta/mpid/ch_gen2
> -I/root/mvapich/mvapich-0.9.9-beta/include
> -I/root/mvapich/mvapich-0.9.9-beta/include
> -I/root/mvapich/mvapich-0.9.9-beta/mpid/ch_gen2
> -I/root/mvapich/mvapich-0.9.9-beta/mpid/util -DMPID_DEVICE_CODE
> -DHAVE_UNAME=1 -DHAVE_NETDB_H=1 -DHAVE_GETHOSTBYNAME=1
> -DMPID_DEBUG_NONE -DMPID_STAT_NONE  -D_GNU_SOURCE -fPIC -D_EM64T_
> -DEARLY_SEND_COMPLETION -DMEMORY_RELIABLE -DVIADEV_RPUT_SUPPORT
> -D_SMP_ -D_SMP_RNDV_ -DCH_GEN2 -D_ICC_  -I/usr/ofed/include -O3
> -DHAVE_MPICHCONF_H -I/root/mvapich/mvapich-0.9.9-beta
> -I/root/mvapich/mvapich-0.9.9-beta/mpid/ch_gen2 -I.   -c viacheck.c
> viacheck.c(1036): warning #167: argument of type "unsigned char *" is
> incompatible with parameter of type "char *"
>                  update_crc(1, v->buffer, header->dma_len),
>                                ^
>
> viacheck.c(1557): warning #188: enumerated type mixed with another type
>                          rhandle->protocol);
>                          ^
>
> viacheck.c(1749): warning #188: enumerated type mixed with another type
>                                              rhandle->protocol);
>                                              ^
>
> viacheck.c(2570): error: identifier "IBV_EVENT_CLIENT_REREGISTER" is 
> undefined
>              case IBV_EVENT_CLIENT_REREGISTER:
>                   ^
>

May I request that you provide information about the OpenFabrics Gen2
libraries you are using? Typically, the OFED libraries are installed
in /usr/local/ofed/lib or /usr/local/ofed/lib64, depending on your
architecture (32-bit vs. 64-bit), and the include files are in
/usr/local/ofed/include.
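
For example, something along these lines should show what is installed
(paths assume the default OFED prefix; note that your compile line uses
-I/usr/ofed/include, so adjust if your install lives there):

  ls /usr/local/ofed/lib64/libibverbs*
  ls /usr/local/ofed/include/infiniband/verbs.h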

As an example, on our cluster (which has OFED-1.1 installed), the
verbs.h file in that include location defines the
IBV_EVENT_CLIENT_REREGISTER event (line 200).

Can you please check the verbs.h in your include directory and let us
know whether this event is defined?
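
For instance, a quick check could be (substitute the include directory
your build actually uses, e.g. /usr/ofed/include from your compile
line):

  grep -n IBV_EVENT_CLIENT_REREGISTER /usr/local/ofed/include/infiniband/verbs.h

If grep prints nothing, your verbs.h predates this event, which would
explain the undefined-identifier error.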

Thanks much,

:- Abhinav
> cm_user.h(6): warning #864: extern inline function
> "odu_test_new_connection" was referenced but not defined
>  inline void odu_test_new_connection(void);
>              ^
>
> compilation aborted for viacheck.c (code 2)
> make[3]: *** [viacheck.o] Error 2
> Exit status from make was 2
> make[2]: *** [mpilib] Error 1
> make[1]: *** [mpi-modules] Error 2
> make: *** [mpi] Error 2
> ---------------------------------------------------------------------------------------------------------- 
>
>
> and the error in gcc:
> ---------------------------------------------------------------------------------------------------------- 
>
> gcc -DHAVE_CONFIG_H -I.
> -I/root/mvapich/mvapich-0.9.9-beta/mpid/ch_gen2
> -I/root/mvapich/mvapich-0.9.9-beta/include
> -I/root/mvapich/mvapich-0.9.9-beta/include
> -I/root/mvapich/mvapich-0.9.9-beta/mpid/ch_gen2
> -I/root/mvapich/mvapich-0.9.9-beta/mpid/util -DMPID_DEVICE_CODE
> -DHAVE_UNAME=1 -DHAVE_NETDB_H=1 -DHAVE_GETHOSTBYNAME=1
> -DMPID_DEBUG_NONE -DMPID_STAT_NONE  -fPIC -D_EM64T_
> -DEARLY_SEND_COMPLETION -DMEMORY_RELIABLE -DVIADEV_RPUT_SUPPORT
> -D_SMP_ -D_SMP_RNDV_ -DCH_GEN2   -I/usr/ofed/include -O3
> -DHAVE_MPICHCONF_H -D_GNU_SOURCE -I/root/mvapich/mvapich-0.9.9-beta
> -I/root/mvapich/mvapich-0.9.9-beta/mpid/ch_gen2 -I.  -Wall  -c
> viacheck.c
> viacheck.c: In function 'viadev_process_recv':
> viacheck.c:1036: warning: pointer targets in passing argument 2 of
> 'update_crc' differ in signedness
> viacheck.c: In function 'async_thread':
> viacheck.c:2570: error: 'IBV_EVENT_CLIENT_REREGISTER' undeclared
> (first use in this function)
> viacheck.c:2570: error: (Each undeclared identifier is reported only once
> viacheck.c:2570: error: for each function it appears in.)
> make[3]: *** [viacheck.o] Error 1
> Exit status from make was 2
> make[2]: *** [mpilib] Error 1
> make[1]: *** [mpi-modules] Error 2
> make: *** [mpi] Error 2
> ----------------------------------------------------------------------------------------------------------------- 
>
>
> What am I doing wrong?
>
> Thanks so much for any help you can give!!!!
>
> -Jen
>
>
>
> On 1/23/07, Sayantan Sur <surs at cse.ohio-state.edu> wrote:
>> Hello Jen,
>>
>> The OSU benchmarks should ideally be run with 2 processes. Can you try
>> osu_latency, osu_bw, osu_bibw with just 2 processes? I have a feeling
>> that the cluster isn't set up quite right, otherwise these simple
>> benchmarks wouldn't fail. Are other users able to run IMB on the 
>> cluster?
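>>
>> For example, with the mpirun_rsh launcher that comes with MVAPICH,
>> something like this (node names are placeholders) runs the latency
>> test with exactly two processes:
>>
>>   mpirun_rsh -np 2 node1 node2 ./osu_latency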
>>
>> cpilog might have compilation problems since MPE might not have been
>> compiled in when MVAPICH was built. To enable MPE, use --with-mpe as a
>> configure parameter in make.mvapich.gen2 (assuming you have downloaded
>> MVAPICH-0.9.8 from our website).
>>
>> Similarly, with simpleio, the MPI-IO component needs to be compiled in
>> when building MVAPICH. Use --with-romio as a configure parameter.
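>>
>> For example, a rebuild enabling both could look like this (a sketch;
>> depending on the script version, you may instead need to add these
>> flags to the configure line inside make.mvapich.gen2):
>>
>>   ./make.mvapich.gen2 --with-mpe --with-romio
>>
>> After reinstalling, recompile cpilog and simpleio with the new mpicc.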
>>
>> Thanks,
>> Sayantan.
>>
>> Aquarijen wrote:
>> > Hi Sayantan,
>> >
>> > Thank you for your help. :)
>> >
>> > A few things about my environment.  The compute nodes are 64 bit, so I
>> > pointed the mvapich compilation to /usr/ofed/lib64 - I have no 32 bit
>> > libs for ofed.  The compute nodes have 2 processors each.  When I have
>> > tried jobs, I have submitted them through pbs (torque) and specified
>> > that I want 1 processor per node - maui enforces this.  I have tried
>> > all my runs with 40 nodes, one processor per node.
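>> >
>> > In Torque terms, that request looks something like this in the job
>> > script (a sketch):
>> >
>> >   #PBS -l nodes=40:ppn=1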
>> >
>> > cpi, cpip and hello++ run without problems.
>> >
>> > osu_bw fails with the error:
>> > Connection closed by 172.16.4.36^M
>> > [0] Abort: [b09n040.oic.ornl.gov:0] Got completion with error, code=1
>> > at line 2355 in file viacheck.c
>> > done.
>> >
>> > osu_bcast fails with error:
>> > [39] Abort: [b09n001.oic.ornl.gov:39] Got completion with error, 
>> code=1
>> > at line 2355 in file viacheck.c
>> > done.
>> >
>> > osu_bibw fails with:
>> > [0] Abort: [b09n040.oic.ornl.gov:0] Got completion with error, code=1
>> > at line 2355 in file viacheck.c
>> > [39] Abort: [32] Abort: [24] Abort: [38] Abort:
>> > [b09n001.oic.ornl.gov:39] Got completion with error, code=12, dest
>> > rank=0
>> > at line 397 in file viacheck.c
>> > [b09n008.oic.ornl.gov:32] Got completion with error, code=12, dest 
>> rank=0
>> > at line 397 in file viacheck.c
>> > [b09n016.oic.ornl.gov:24] Got completion with error, code=12, dest 
>> rank=0
>> > at line 397 in file viacheck.c
>> > [b09n002.oic.ornl.gov:38] Got completion with error, code=12, dest 
>> rank=0
>> > at line 397 in file viacheck.c
>> > [36] Abort: [b09n004.oic.ornl.gov:36] Got completion with error,
>> > code=12, dest rank=0
>> > at line 397 in file viacheck.c
>> > done.
>> >
>> > osu_latency fails with:
>> > [0] Abort: [b09n040.oic.ornl.gov:0] Got completion with error, code=1
>> > at line 2355 in file viacheck.c
>> > done.
>> >
>> > I can't compile cpilog.c, I get:
>> > [2vt@b09l02 osu_benchmarks-mvapich]$ which mpicc
>> > /opt/mvapich-gcc-0.9.8/bin/mpicc
>> > [2vt@b09l02 osu_benchmarks-mvapich]$ mpicc cpilog.c -o cpilog
>> > cpilog.o(.text+0xd2): In function `main':
>> > cpilog.c: undefined reference to `MPE_Init_log'
>> > cpilog.o(.text+0xd7):cpilog.c: undefined reference to
>> > `MPE_Log_get_event_number'cpilog.o(.text+0xdf):cpilog.c: undefined
>> > reference to `MPE_Log_get_event_number'cpilog.o(.text+0xe7):cpilog.c:
>> > undefined reference to
>> > `MPE_Log_get_event_number'cpilog.o(.text+0xef):cpilog.c: undefined
>> > reference to `MPE_Log_get_event_number'cpilog.o(.text+0xf7):cpilog.c:
>> > undefined reference to
>> > `MPE_Log_get_event_number'cpilog.o(.text+0xff):cpilog.c: more
>> > undefined references to `MPE_Log_get_event_number' follow
>> > cpilog.o(.text+0x12e): In function `main':
>> > cpilog.c: undefined reference to `MPE_Describe_state'
>> > cpilog.o(.text+0x143):cpilog.c: undefined reference to
>> > `MPE_Describe_state'
>> > cpilog.o(.text+0x158):cpilog.c: undefined reference to
>> > `MPE_Describe_state'
>> > cpilog.o(.text+0x16d):cpilog.c: undefined reference to
>> > `MPE_Describe_state'
>> > cpilog.o(.text+0x1a2):cpilog.c: undefined reference to `MPE_Start_log'
>> > cpilog.o(.text+0x1c0):cpilog.c: undefined reference to `MPE_Log_event'
>> > cpilog.o(.text+0x1f0):cpilog.c: undefined reference to `MPE_Log_event'
>> > cpilog.o(.text+0x202):cpilog.c: undefined reference to `MPE_Log_event'
>> > cpilog.o(.text+0x21e):cpilog.c: undefined reference to `MPE_Log_event'
>> > cpilog.o(.text+0x230):cpilog.c: undefined reference to `MPE_Log_event'
>> > cpilog.o(.text+0x2da):cpilog.c: more undefined references to
>> > `MPE_Log_event' follow
>> > cpilog.o(.text+0x342): In function `main':
>> > cpilog.c: undefined reference to `MPE_Finish_log'
>> > collect2: ld returned 1 exit status
>> >
>> > I also can't compile simpleio.c.  I get:
>> > [2vt@b09l02 osu_benchmarks-mvapich]$ mpicc simpleio.c
>> > simpleio.o(.text+0x252): In function `main':
>> > simpleio.c: undefined reference to `MPI_File_open'
>> > simpleio.o(.text+0x26e):simpleio.c: undefined reference to
>> > `MPI_File_write'
>> > simpleio.o(.text+0x277):simpleio.c: undefined reference to
>> > `MPI_File_close'
>> > simpleio.o(.text+0x2c0):simpleio.c: undefined reference to
>> > `MPI_File_open'
>> > simpleio.o(.text+0x2dc):simpleio.c: undefined reference to
>> > `MPI_File_read'
>> > simpleio.o(.text+0x2e5):simpleio.c: undefined reference to
>> > `MPI_File_close'
>> > collect2: ld returned 1 exit status
>> >
>> > Intel MPI benchmarks (IMB-MPI1) fail with:
>> >
>> > [0] Abort: [b09n040.oic.ornl.gov:0] Got completion with error, code=4
>> > at line 2355 in file viacheck.c
>> > done.
>> >
>> >
>> > I'd be happy to provide any other logs or info you might
>> > think would help!  Sorry it took me so long for this - I had a few
>> > fires to put out.  Now, this is #1 priority.
>> >
>> > Thanks for all your help!!!!
>> > Jen
>> >
>> >
>> >
>> > On 1/19/07, Sayantan Sur <surs at cse.ohio-state.edu> wrote:
>> >> Hello Jen,
>> >>
>> >> > I am new.  And a little frustrated. :)
>> >>
>> >> Thanks for your post ... Hope your problems are short-lived :-)
>> >>
>> >> > I have compiled/installed mvapich 0.9.8 using ofed/gen2.
>> >> >
>> >> > I can run cpi on all my nodes just fine.  The problem comes in 
>> when I
>> >> > try to use any of the osu benchmark programs.  They seem to compile
>> >> > just fine, but when I try to run osu_bcast, I get the following 
>> error:
>> >> >
>> >> > [39] Abort: [b09n001.oic.ornl.gov:39] Got completion with error,
>> >> code=1
>> >> > at line 2355 in file viacheck.c
>> >> > done.
>> >> >
>> >> > Where is this viacheck.c?  Has anyone seen this before?  I'd be 
>> happy
>> >> > to provide more details if you tell me what to provide.
>> >>
>> >> viacheck.c is an internal file in the MVAPICH implementation. I 
>> have a
>> >> couple of questions which you could answer ...
>> >>
>> >> 1) On how many nodes was this run attempted? I have run osu_bcast
>> >> on 64 nodes/128 processes and it seems to be OK.
>> >>
>> >> 2) Can you run IMB (Intel MPI benchmarks)? They will also call
>> >> MPI_Bcast (see the sketch after this list).
>> >>
>> >> 3) I'm wondering if you could run the other OSU benchmarks, such as
>> >> latency, bandwidth, bi-directional bandwidth?
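>> >>
>> >> For instance, IMB can be limited to the broadcast test with
>> >> something like this (a sketch; paths and node names are
>> >> placeholders):
>> >>
>> >>   mpirun_rsh -np 2 node1 node2 ./IMB-MPI1 Bcast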
>> >>
>> >> Thanks,
>> >> Sayantan.
>> >>
>> >> --
>> >> http://www.cse.ohio-state.edu/~surs
>> >>
>> >
>> >
>>
>> -- 
>> http://www.cse.ohio-state.edu/~surs
>>
>>
>
>


