[mvapich-discuss] viacheck.c error?
    Sayantan Sur 
    surs at cse.ohio-state.edu
       
    Tue Jan 23 23:23:48 EST 2007
    
    
  
Hello Jen,
The OSU benchmarks should ideally be run with 2 processes. Can you try 
osu_latency, osu_bw and osu_bibw with just 2 processes? I have a feeling 
that the cluster isn't set up quite right; otherwise these simple 
benchmarks wouldn't fail. Are other users able to run IMB on the cluster?
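
The abort messages quoted below carry a numeric code (1, 4 and 12 in 
your runs). As a rough aid for reading them, here is a minimal sketch 
that pulls host, rank and code out of such a log. It assumes that the 
code= value is the libibverbs completion status (enum ibv_wc_status); 
the helper name decode_aborts and the regular expression are 
illustrative only, not part of MVAPICH.

```python
import re

# Assumption: "code=N" in the abort message is the numeric ibv_wc_status
# from libibverbs. Only the first 14 enum values are listed here.
IBV_WC_STATUS = {
    0: "IBV_WC_SUCCESS",
    1: "IBV_WC_LOC_LEN_ERR",
    2: "IBV_WC_LOC_QP_OP_ERR",
    3: "IBV_WC_LOC_EEC_OP_ERR",
    4: "IBV_WC_LOC_PROT_ERR",
    5: "IBV_WC_WR_FLUSH_ERR",
    6: "IBV_WC_MW_BIND_ERR",
    7: "IBV_WC_BAD_RESP_ERR",
    8: "IBV_WC_LOC_ACCESS_ERR",
    9: "IBV_WC_REM_INV_REQ_ERR",
    10: "IBV_WC_REM_ACCESS_ERR",
    11: "IBV_WC_REM_OP_ERR",
    12: "IBV_WC_RETRY_EXC_ERR",
    13: "IBV_WC_RNR_RETRY_EXC_ERR",
}

# Matches "[host:rank] Got completion with error, code=N" in unwrapped logs.
ABORT_RE = re.compile(
    r"\[([\w.-]+):(\d+)\] Got completion with error,\s*code=(\d+)"
)


def decode_aborts(log_text):
    """Return (host, rank, code, status_name) for each abort message found."""
    results = []
    for host, rank, code in ABORT_RE.findall(log_text):
        name = IBV_WC_STATUS.get(int(code), "unknown")
        results.append((host, int(rank), int(code), name))
    return results
```

Under that assumption, code=12 on ranks 24 to 39 would read as "retry 
count exceeded" towards rank 0, which would be consistent with rank 0 
having aborted first.
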
cpilog probably fails to build because MPE might not have been 
compiled in when MVAPICH was built. To enable MPE, add --with-mpe as a 
configure parameter in make.mvapich.gen2 (assuming you have downloaded 
MVAPICH-0.9.8 from our website).
Similarly, for simpleio, the MPI-IO component needs to be compiled in 
when building MVAPICH. Use --with-romio as a configure parameter.
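
Both link failures follow the same pattern: the prefix of the undefined 
symbols points at the optional component that was left out of the 
build. A small sketch of that reasoning, assuming only the two mappings 
mentioned above (the function name missing_components and the table are 
made up for illustration):

```python
import re
from collections import defaultdict

# Assumption: symbol prefixes map to optional components as described in
# this reply (MPE_* -> --with-mpe, MPI_File_* -> --with-romio).
PREFIX_TO_FLAG = {
    "MPE_": "--with-mpe",
    "MPI_File_": "--with-romio",
}

# Matches GNU ld messages such as: undefined reference to `MPE_Init_log'
UNDEF_RE = re.compile(r"undefined references? to\s+`(\w+)'")


def missing_components(linker_output):
    """Map each suggested configure flag to the undefined symbols behind it."""
    found = defaultdict(set)
    for symbol in UNDEF_RE.findall(linker_output):
        for prefix, flag in PREFIX_TO_FLAG.items():
            if symbol.startswith(prefix):
                found[flag].add(symbol)
    return {flag: sorted(symbols) for flag, symbols in found.items()}
```

Feeding it the raw output of mpicc (without the "> " quoting added by 
the mail client) should return the flag to add for each failing 
program.
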
Thanks,
Sayantan.
Aquarijen wrote:
> Hi Sayantan,
>
> Thank you for your help. :)
>
> A few things about my environment.  The compute nodes are 64 bit, so I
> pointed the mvapich compilation to /usr/ofed/lib64 - I have no 32 bit
> libs for ofed.  The compute nodes have 2 processors each.  When I have
> tried jobs, I have submitted them through pbs (torque) and specified
> that I want 1 processor per node - maui enforces this.  I have tried
> all my runs with 40 nodes, one processor per node.
>
> cpi, cpip and hello++ run without problems.
>
> osu_bw fails with the error:
> Connection closed by 172.16.4.36
> [0] Abort: [b09n040.oic.ornl.gov:0] Got completion with error, code=1
> at line 2355 in file viacheck.c
> done.
>
> osu_bcast fails with error:
> [39] Abort: [b09n001.oic.ornl.gov:39] Got completion with error, code=1
> at line 2355 in file viacheck.c
> done.
>
> osu_bibw fails with:
> [0] Abort: [b09n040.oic.ornl.gov:0] Got completion with error, code=1
> at line 2355 in file viacheck.c
> [39] Abort: [32] Abort: [24] Abort: [38] Abort:
> [b09n001.oic.ornl.gov:39] Got completion with error, code=12, dest
> rank=0
> at line 397 in file viacheck.c
> [b09n008.oic.ornl.gov:32] Got completion with error, code=12, dest rank=0
> at line 397 in file viacheck.c
> [b09n016.oic.ornl.gov:24] Got completion with error, code=12, dest rank=0
> at line 397 in file viacheck.c
> [b09n002.oic.ornl.gov:38] Got completion with error, code=12, dest rank=0
> at line 397 in file viacheck.c
> [36] Abort: [b09n004.oic.ornl.gov:36] Got completion with error,
> code=12, dest rank=0
> at line 397 in file viacheck.c
> done.
>
> osu_latency fails with:
> [0] Abort: [b09n040.oic.ornl.gov:0] Got completion with error, code=1
> at line 2355 in file viacheck.c
> done.
>
> I can't compile cpilog.c, I get:
> [2vt@b09l02 osu_benchmarks-mvapich]$ which mpicc
> /opt/mvapich-gcc-0.9.8/bin/mpicc
> [2vt@b09l02 osu_benchmarks-mvapich]$ mpicc cpilog.c -o cpilog
> cpilog.o(.text+0xd2): In function `main':
> cpilog.c: undefined reference to `MPE_Init_log'
> cpilog.o(.text+0xd7):cpilog.c: undefined reference to
> `MPE_Log_get_event_number'
> cpilog.o(.text+0xdf):cpilog.c: undefined reference to
> `MPE_Log_get_event_number'
> cpilog.o(.text+0xe7):cpilog.c: undefined reference to
> `MPE_Log_get_event_number'
> cpilog.o(.text+0xef):cpilog.c: undefined reference to
> `MPE_Log_get_event_number'
> cpilog.o(.text+0xf7):cpilog.c: undefined reference to
> `MPE_Log_get_event_number'
> cpilog.o(.text+0xff):cpilog.c: more undefined references to
> `MPE_Log_get_event_number' follow
> cpilog.o(.text+0x12e): In function `main':
> cpilog.c: undefined reference to `MPE_Describe_state'
> cpilog.o(.text+0x143):cpilog.c: undefined reference to 
> `MPE_Describe_state'
> cpilog.o(.text+0x158):cpilog.c: undefined reference to 
> `MPE_Describe_state'
> cpilog.o(.text+0x16d):cpilog.c: undefined reference to 
> `MPE_Describe_state'
> cpilog.o(.text+0x1a2):cpilog.c: undefined reference to `MPE_Start_log'
> cpilog.o(.text+0x1c0):cpilog.c: undefined reference to `MPE_Log_event'
> cpilog.o(.text+0x1f0):cpilog.c: undefined reference to `MPE_Log_event'
> cpilog.o(.text+0x202):cpilog.c: undefined reference to `MPE_Log_event'
> cpilog.o(.text+0x21e):cpilog.c: undefined reference to `MPE_Log_event'
> cpilog.o(.text+0x230):cpilog.c: undefined reference to `MPE_Log_event'
> cpilog.o(.text+0x2da):cpilog.c: more undefined references to
> `MPE_Log_event' follow
> cpilog.o(.text+0x342): In function `main':
> cpilog.c: undefined reference to `MPE_Finish_log'
> collect2: ld returned 1 exit status
>
> I also can't compile simpleio.c.  I get:
> [2vt@b09l02 osu_benchmarks-mvapich]$ mpicc simpleio.c
> simpleio.o(.text+0x252): In function `main':
> simpleio.c: undefined reference to `MPI_File_open'
> simpleio.o(.text+0x26e):simpleio.c: undefined reference to 
> `MPI_File_write'
> simpleio.o(.text+0x277):simpleio.c: undefined reference to 
> `MPI_File_close'
> simpleio.o(.text+0x2c0):simpleio.c: undefined reference to 
> `MPI_File_open'
> simpleio.o(.text+0x2dc):simpleio.c: undefined reference to 
> `MPI_File_read'
> simpleio.o(.text+0x2e5):simpleio.c: undefined reference to 
> `MPI_File_close'
> collect2: ld returned 1 exit status
>
> Intel MPI benchmarks (IMB-MPI1) fail with:
>
> [0] Abort: [b09n040.oic.ornl.gov:0] Got completion with error, code=4
> at line 2355 in file viacheck.c
> done.
>
>
> I'd be happy to provide any other logs or any other info you might
> think would help!  Sorry it took me so long for this - I had a few
> fires to put out.  Now, this is #1 priority.
>
> Thanks for all your help!!!!
> Jen
>
>
>
> On 1/19/07, Sayantan Sur <surs at cse.ohio-state.edu> wrote:
>> Hello Jen,
>>
>> > I am new.  And a little frustrated. :)
>>
>> Thanks for your post ... Hope your problems are short-lived :-)
>>
>> > I have compiled/installed mvapich 0.9.8 using ofed/gen2.
>> >
>> > I can run cpi on all my nodes just fine.  The problem comes in when I
>> > try to use any of the osu benchmark programs.  They seem to compile
>> > just fine, but when I try to run osu_bcast, I get the following error:
>> >
>> > [39] Abort: [b09n001.oic.ornl.gov:39] Got completion with error, code=1
>> > at line 2355 in file viacheck.c
>> > done.
>> >
>> > Where is this viacheck.c?  Has anyone seen this before?  I'd be happy
>> > to provide more details if you tell me what to provide.
>>
>> viacheck.c is an internal file in the MVAPICH implementation. I have a
>> couple of questions which you could answer ...
>>
>> 1) On how many nodes was this run attempted? I have run osu_bcast on
>> 64 nodes/128 processes and it seems to be OK.
>>
>> 2) Can you run IMB (Intel MPI benchmarks)? They will also call
>> MPI_Bcast.
>>
>> 3) I'm wondering if you could run the other OSU benchmarks, such as
>> latency, bandwidth, bi-directional bandwidth?
>>
>> Thanks,
>> Sayantan.
>>
>> -- 
>> http://www.cse.ohio-state.edu/~surs
>>
>
>
-- 
http://www.cse.ohio-state.edu/~surs
    
    