[mvapich-discuss] Unable to build mvapich2 1.2p1 for use with totalview

Craig Tierney Craig.Tierney at noaa.gov
Tue Sep 29 12:38:18 EDT 2009


Jonathan Perkins wrote:
> Hi Craig.  We haven't seen this type of issue come up before.  We'll
> take a look to see if we can reproduce this issue.  In the meantime can
> you try building while removing the -L/usr/lib64 and -L/lib64 options
> from your compiler variables.  If you really need these options you
> should add them to the CFLAGS variable.
> 
> Can you also try leaving the --with-pm option unset (this allows for mpd
> and mpirun_rsh).  We perform the majority of our testing with mpirun_rsh
> as it scales and performs better than the other pm options.
> 
> Let us know if any of these actions allows your build to proceed
> successfully.
> 

Sorry about the delay.  The reason I include the flags above (which can also
be set in LDFLAGS) is so that I don't have to fight the OS over the library
search path and have it try the 32-bit libraries first, which causes warnings.
Removing them doesn't fix the problem.
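
A minimal sketch of keeping the library search paths in LDFLAGS instead of in
the compiler variables. The helper name show_args is hypothetical and only
prints its arguments, so the quoting can be checked before running the real
configure; the paths and compiler names are the ones used in this thread.

```shell
# Library search paths kept in one variable; the quotes keep both -L options
# together as a single LDFLAGS value.
LIBPATHS="-L/usr/lib64 -L/lib64"

# Hypothetical helper: print each argument on its own line.
show_args() {
    printf '%s\n' "$@"
}

# Dry run: shows how the shell would split the configure arguments.
show_args ./configure LDFLAGS="$LIBPATHS" CC=icc CXX=icpc F77=ifort FC=ifort F90=ifort
```

If LDFLAGS appears on a single output line with both -L options, the quoting
is right; if -L/lib64 shows up on a line by itself, configure would see it as
a separate argument.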

The error I reported doesn't seem to exist in mvapich2-1.4rc2.

However, the problem that I really had is that I cannot debug with TotalView.
When I start the debugger, it tries to debug mpirun_rsh, not the application.

I built mvapich2-1.4rc2 as:

./configure LDFLAGS="-L/usr/lib64 -L/lib64" CC=icc CXX=icpc F77=ifort FC=ifort F90=ifort \
        --with-ib-libpath=/usr/lib64 \
        --with-ib-include=/usr/include \
        --prefix=/opt/hjet/mvapich2/1.4rc2-intel \
        --enable-romio=yes --with-file-system=lustre \
        --enable-g=dbg --enable-sharedlibs=gcc --enable-debuginfo

And I have the following environment variables set:

TVDSVRLAUNCHCMD=ssh
TOTALVIEW=/opt/toolworks/totalview.8.6.2-0/bin/totalview

But my problem is that TotalView is debugging mpirun_rsh, not the
MPI application.

I am launching the program with:

$MPICH/bin/mpirun_rsh -hostfile $MACHINE_FILE -tv -np 8 ./osu_alltoall

The program does run, but it just isn't debugged.
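
A small sketch for ruling out the environment before launching with -tv. The
function name check_tv_env is hypothetical; the two variable names are the
ones listed above. It only checks that they are set and that TOTALVIEW points
at an executable, and it does not explain why mpirun_rsh itself ends up under
the debugger.

```shell
# Hypothetical helper: report whether the TotalView-related variables look sane.
check_tv_env() {
    rc=0
    if [ -z "${TOTALVIEW:-}" ]; then
        echo "TOTALVIEW is not set"
        rc=1
    elif [ ! -x "$TOTALVIEW" ]; then
        echo "TOTALVIEW is not an executable file: $TOTALVIEW"
        rc=1
    fi
    if [ -z "${TVDSVRLAUNCHCMD:-}" ]; then
        echo "TVDSVRLAUNCHCMD is not set"
        rc=1
    fi
    return "$rc"
}

# Run the check in the same shell that will start mpirun_rsh.
check_tv_env || echo "environment needs attention before using -tv"
```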

Thanks,
Craig



> On Mon, Sep 14, 2009 at 01:46:46PM -0600, Craig Tierney wrote:
>> I am trying to build Mvapich2 1.2p1 so that I can use
>> Totalview.  The docs say that I am supposed to add the
>> following options:
>>
>> --enable-g=dbg --enable-sharedlibs=gcc --enable-debuginfo
>>
>> My complete configure line is:
>>
>> ./configure CC="icc -L/usr/lib64 -L/lib64" CXX="icpc -L/usr/lib64 -L/lib64" F77="ifort -L/usr/lib64 -L/lib64" FC="ifort -L/usr/lib64 -L/lib64" F90="ifort -L/usr/lib64 -L/lib64" \
>>         --with-ib-libpath=/usr/lib64 \
>>         --with-ib-include=/usr/include \
>>         --prefix=/opt/hjet/mvapich2/1.2p1-intel \
>>         --enable-romio=yes --with-file-system=lustre \
>>         --with-pm=remshell \
>>         --enable-g=dbg --enable-sharedlibs=gcc --enable-debuginfo \
>>         --enable-threads=multiple
>>
>> The problem is that when it tries to link the system tools, it fails.
>> For example, when it tries to link mpiexec, I get:
>>
>> icc -L/usr/lib64 -L/lib64 -g -L/usr/lib64  -static  -o mpiexec mpiexec.o  -L../util \
>>         -lmpiexec -L../../../lib -L/home/admin/hjet/software/opt/mvapich/mvapich2-1.2p1/lib -lmpich -lpthread  -lrdmacm -libverbs -libumad    -lrt
>> ../util/libmpiexec.a(pmiport.o): In function `MPIE_GetMyHostName':
>> /misc/whome/admin/hjet/software/opt/mvapich/mvapich2-1.2p1/src/pm/util/pmiport.c:200: warning: Using 'gethostbyname' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
>> /usr/lib64/libc.a(malloc.o): In function `__malloc_check_init':
>> (.text+0xb00): multiple definition of `__malloc_check_init'
>> ../../../lib/libmpich.a(mvapich_malloc.o):/misc/whome/admin/hjet/software/opt/mvapich/mvapich2-1.2p1/src/mpid/ch3/channels/mrail/src/memory/ptmalloc2/hooks.c:83: first defined here
>> ld: Warning: size of symbol `__malloc_check_init' changed from 122 in ../../../lib/libmpich.a(mvapich_malloc.o) to 105 in /usr/lib64/libc.a(malloc.o)
>> /usr/lib64/libc.a(malloc.o): In function `_int_free':
>> (.text+0x21f0): multiple definition of `_int_free'
>> ../../../lib/libmpich.a(mvapich_malloc.o):/misc/whome/admin/hjet/software/opt/mvapich/mvapich2-1.2p1/src/mpid/ch3/channels/mrail/src/memory/ptmalloc2/mvapich_malloc.c:4307: first defined here
>> ld: Warning: size of symbol `_int_free' changed from 778 in ../../../lib/libmpich.a(mvapich_malloc.o) to 2413 in /usr/lib64/libc.a(malloc.o)
>> /usr/lib64/libc.a(malloc.o): In function `_int_malloc':
>> (.text+0x2b60): multiple definition of `_int_malloc'
>>
>> ........
>>
>> And problems with IB:
>>
>> (.text+0xbb): warning: Using 'dlopen' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
>> /usr/lib64/libibverbs.a(src_libibverbs_la-verbs.o): In function `ibv_create_comp_channel':
>> (.text+0x9b6): undefined reference to `pthread_mutex_trylock'
>> /usr/lib64/libibumad.a(libibumad_la-umad.o): In function `umad_get_fd':
>> (.text+0xdc): undefined reference to `ibwarn'
>> /usr/lib64/libibumad.a(libibumad_la-umad.o): In function `umad_done':
>> (.text+0x10d): undefined reference to `ibwarn'
>> /usr/lib64/libibumad.a(libibumad_la-umad.o): In function `umad_addr_dump':
>>
>>
>> This doesn't happen when the binary is linked dynamically (remove -static).
>>
>> Am I missing an option needed to get everything built cleanly?
>>
>> Thanks,
>> Craig
>>
>> -- 
>> Craig Tierney (craig.tierney at noaa.gov)
>> _______________________________________________
>> mvapich-discuss mailing list
>> mvapich-discuss at cse.ohio-state.edu
>> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss


-- 
Craig Tierney (craig.tierney at noaa.gov)

