[mvapich-discuss] Re: [openfabrics-ewg] Announcing the release of MVAPICH2 0.9.8 with Checkpoint/Restart, iWARP, RDMA CM-based connection manageme

Shaun Rowland rowland at cse.ohio-state.edu
Wed Nov 15 19:18:50 EST 2006


Sundeep Narravula wrote:

>> Please see the following for the files in /usr/local/lib directory:
>>

<snip>

>> -rwxr-xr-x  1 root root    837 Nov  8 19:04 librdmacm.la
>> -rwxr-xr-x  1 root root  54472 Nov  8 19:04 librdmacm.so
>> [root at ammasso1 lib]#

In addition to my previous suggestions, another thing to try is looking
at what the runtime linker's cache says:

[rowland at j3-gen2 lib64]$ strings /etc/ld.so.cache |grep librdma
librdmacm.so
/usr/local/ofed/lib64/librdmacm.so
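
A minimal sketch of the same cache check using "ldconfig -p", which prints the cache in readable form rather than relying on strings. The helper name cache_paths_for is hypothetical, and the parsing assumes the usual "name (flags) => path" line layout printed by glibc's ldconfig:

```shell
# Hypothetical helper: print the path(s) the runtime linker cache records
# for an exact library name. Expects `ldconfig -p` output on stdin and
# assumes each entry looks like "name (flags) => /path/to/file".
cache_paths_for() {
    awk -v name="$1" '$1 == name && $(NF-1) == "=>" { print $NF }'
}

# Usage (run on the affected machine):
#   ldconfig -p | cache_paths_for librdmacm.so
```

If this prints nothing, the cache has no entry under that exact name.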

You can also try updating the links with something like (with
/usr/local/lib instead):

[rowland at j2-gen2 lib64]$ sudo ldconfig -v -n /usr/local/ofed/lib64
/usr/local/ofed/lib64:
         libibdm.so.1 -> libibdm.so.1.1.1
         librdmacm.so -> librdmacm.so.0.9.0
         libosmcomp.so.1 -> libosmcomp.so.1.0.1
         libibcm.so -> libibcm.so.0.9.0
         libosmvendor.so.2 -> libosmvendor_openib.so
         libibverbs.so.1 -> libibverbs.so.1.0.0
         libibmad.so.1 -> libibmad.so.1.0.0
         libdaplcma.so.1 -> libdaplcma.so.1.0.2
         libibumad.so.1 -> libibumad.so.1.0.0
         libdat.so.1 -> libdat.so.1.0.2
         libopensm.so.1 -> libopensm.so.1.1.0
         libibcommon.so.1 -> libibcommon.so.1.0.0
         libdaplscm.so.1 -> libdaplscm.so.1.0.2
         libibdmcom.so.1 -> libibdmcom.so.1.1.1

In the above case I deleted the librdmacm.so symlink and ldconfig relinked
it to librdmacm.so.0.9.0 as expected. Since your librdmacm.so is a regular
file and no file with the "full name" is actually there, I don't think that
will work for you. I am not sure how to determine the exact version number
directly from the shared library itself. When I test on our systems, I
don't find "0.9.0" anywhere. I also believe the library won't necessarily
be in the cache unless the path /usr/local/lib has already been added to
the default system search path. In our OFED installations, these libraries
are in /usr/local/ofed/lib64, which is automatically added to the
ld.so.conf configuration so that they are in the default search path. The
-n option in the command above implies -N, which does not rebuild the
cache; it only makes the links.
You can check the options in the ldconfig manual page. It can both make
links and update the cache file. However, without actually being on the
system myself, I don't know whether this will fix your problem. I mention
it because ldconfig makes such links correctly in most cases. In this
case, given some testing I've done, I don't think it will help _without_
renaming the librdmacm.so file first, _if_ that's the real problem.

In any case, as far as I am aware, the original library installation
should have set this up correctly. This may be useful supplementary
information once the output of "ldd" and "objdump -x" has been examined.
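
A minimal sketch of what to look for in that output: the SONAME entry of the library (the name ldconfig uses when it creates a link) and the NEEDED entries of the program that fails to load it. The helper names and the ./your_mpi_binary path are hypothetical; the parsing assumes the "Dynamic Section" layout printed by GNU objdump with -p (which -x also includes):

```shell
# Hypothetical helpers: both read `objdump -p` (or `objdump -x`) output
# from stdin and assume GNU objdump's "  TAG   value" dynamic-section lines.

# Print the SONAME recorded in a shared object, if it has one.
soname_from_dump() {
    awk '$1 == "SONAME" { print $2 }'
}

# Print the library names a binary asks the runtime linker to load.
needed_from_dump() {
    awk '$1 == "NEEDED" { print $2 }'
}

# Usage (./your_mpi_binary is a placeholder for the failing program):
#   objdump -p /usr/local/lib/librdmacm.so | soname_from_dump
#   objdump -p ./your_mpi_binary | needed_from_dump
#   ldd ./your_mpi_binary | grep librdmacm
```

Comparing the NEEDED name in the program with the file and link names actually present in /usr/local/lib should show whether a missing or misnamed link is the real problem.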
-- 
Shaun Rowland	rowland at cse.ohio-state.edu
http://www.cse.ohio-state.edu/~rowland/

