[mvapich-discuss] Building mvapich2

Jonathan Perkins perkinjo at cse.ohio-state.edu
Wed Feb 19 17:59:58 EST 2014


Hi all.  This issue was resolved by using the rpath approach to
building and linking with libpmi.  David used the following flags
LDFLAGS='-Wl,-rpath,/opt/slurm/default/lib
-Wl,-rpath,/home/drace/mvapich2-2.0b/lib'.
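
For anyone hitting the same problem, here is a minimal sketch of how those flags might be assembled before running configure. The helper name rpath_ldflags and the variables SLURM_PREFIX, MV2_PREFIX and LDFLAGS_VALUE are made up for illustration, and the commented configure line is an assumption pieced together from options mentioned in this thread (the --prefix value in particular is a guess), not the exact command David ran.

```shell
# Hypothetical helper: turn each directory argument into a -Wl,-rpath entry
# and join the entries with single spaces.
rpath_ldflags() {
    flags=""
    for dir in "$@"; do
        flags="${flags:+$flags }-Wl,-rpath,$dir"
    done
    printf '%s\n' "$flags"
}

# Paths taken from this thread; adjust them to your own SLURM and install
# locations (libpmi may live in lib64 rather than lib on some systems).
SLURM_PREFIX=/opt/slurm/default
MV2_PREFIX=/home/drace/mvapich2-2.0b

LDFLAGS_VALUE=$(rpath_ldflags "$SLURM_PREFIX/lib" "$MV2_PREFIX/lib")
echo "LDFLAGS='$LDFLAGS_VALUE'"

# Assumed configure invocation (not quoted verbatim from the thread):
# ./configure --prefix="$MV2_PREFIX" --with-device=ch3:nemesis:tcp \
#     --with-pm=no --with-pmi=slurm --with-slurm="$SLURM_PREFIX" \
#     LDFLAGS="$LDFLAGS_VALUE"
```

Once the build finishes, running readelf -d on one of the installed libraries or binaries and looking for an RPATH or RUNPATH entry is one way to confirm that the directories were actually embedded.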

On Wed, Feb 19, 2014 at 1:10 PM, David Race
<david.race at daviddenisefamily.com> wrote:
> Jonathan -
>
> Just to do the build, I added
>
> export LD_LIBRARY_PATH=/opt/slurm/default/lib
>
> Once the configure saw that, I was able to get the configure and make to work,
> but when I did the "make install", I get the following:
>
> export LD_LIBRARY_PATH=/opt/slurm/default/lib:/home/drace/mvapich2-2.0b/lib
>
> I have attached the output of the configure/make/make install.
>
> I have verified that libmpl.so exists in /home/drace/mvapich2-2.0b/lib (see
> below):
>
> [drace at apollo1 MPI]$ ls -alF /home/drace/mvapich2-2.0b/lib
> total 72
> drwxrwxr-x 3 drace drace  4096 Feb 19  2014 ./
> drwxrwxr-x 4 drace drace    30 Feb 19  2014 ../
> -rw-r--r-- 1 drace drace 23530 Feb 19  2014 libmpl.a
> -rwxr-xr-x 1 drace drace   908 Feb 19  2014 libmpl.la*
> lrwxrwxrwx 1 drace drace    15 Feb 19  2014 libmpl.so -> libmpl.so.1.0.0*
> lrwxrwxrwx 1 drace drace    15 Feb 19  2014 libmpl.so.1 -> libmpl.so.1.0.0*
> -rwxr-xr-x 1 drace drace 20663 Feb 19  2014 libmpl.so.1.0.0*
> -rw-r--r-- 1 drace drace  4052 Feb 19  2014 libopa.a
> -rwxr-xr-x 1 drace drace   918 Feb 19  2014 libopa.la*
> lrwxrwxrwx 1 drace drace    15 Feb 19  2014 libopa.so -> libopa.so.1.0.0*
> lrwxrwxrwx 1 drace drace    15 Feb 19  2014 libopa.so.1 -> libopa.so.1.0.0*
> -rwxr-xr-x 1 drace drace  7420 Feb 19  2014 libopa.so.1.0.0*
> drwxrwxr-x 2 drace drace    22 Feb 19  2014 pkgconfig/
>
>
> Let me know if there is something else to try; otherwise, I will start
> debugging the scripts.
>
> Best Regards
>
> David Race
>
>
> On Tuesday, February 18, 2014 08:53:22 PM Jonathan Perkins wrote:
>> ./conftest: error while loading shared libraries: libpmi.so.0: cannot
>> open shared object file: No such file or directory
>>
>> This looks like the real error causing other configure tests to give
>> false failures.  I think this is because ld.so isn't looking in the
>> slurm library directory to load libpmi.
>>
>> To get configure to complete successfully you can try one of the
>> following options.
>>
>> - tell ld.so to look in /opt/slurm/default/lib (or lib64 I don't know
>> where libpmi is found) by adding the directory to a file in
>> /etc/ld.so.conf.d and then running ldconfig (Example: echo
>> /opt/slurm/default/lib64 > /etc/ld.so.conf.d/slurm.conf && ldconfig)
>> - use rpath to tell ld.so where to find libpmi (Example: ./configure
>> ... LDFLAGS='-Wl,-rpath,/opt/slurm/default/lib64' ...)
>> - set LD_LIBRARY_PATH to include the location of libpmi (Example:
>> export LD_LIBRARY_PATH=/opt/slurm/default/lib64:$LD_LIBRARY_PATH)
>>
>> I've given the options in the order that I think they should be
>> tried.  The first option sets up the system so that the slurm
>> library directory is searched by default.  The second option is
>> preferred over the third since you only have to set it up at
>> compile time, whereas the third requires LD_LIBRARY_PATH to be set
>> whenever an application is run.
>>
>> On Tue, Feb 18, 2014 at 8:32 PM, David Race
>>
>> <david.race at daviddenisefamily.com> wrote:
>> > Attached is the config.log.
>> >
>> > Let me know if there is something I should try.
>> >
>> > Thanks
>> >
>> > David Race
>> >
>> > On Tuesday, February 18, 2014 02:35:47 PM you wrote:
>> >> Can you send us config.log?  We may be able to pinpoint the cause of
>> >> the issue.  After taking a look at this I may be able to help with the
>> >> slurm issue too.
>> >>
>> >> On Tue, Feb 18, 2014 at 1:13 PM, David Race
>> >>
>> >> <david.race at daviddenisefamily.com> wrote:
>> >> > Jonathan,
>> >> >
>> >> > Thanks for the information.  We are looking at using ethernet without IB,
>> >> > so I have added the --with-device=ch3:nemesis:tcp.  That part seems to be
>> >> > okay.
>> >> >
>> >> > The SLURM part seems to be causing a problem.  We have installed SLURM into
>> >> > /opt/slurm/default, so when I add
>> >> > "--with-pm=no  --with-pmi=slurm   --with-slurm=/opt/slurm/default" to the
>> >> > configure line I get the failure.
>> >> >
>> >> > My failure stays the same:
>> >> >
>> >> > checking for size of MPI_Status... configure: error: unable to compute
>> >> > status size, are you compiling on a non-2s-complement host?
>> >> >
>> >> > I will start digging through the configure script to find the issue.
>> >> >
>> >> > Thanks
>> >> >
>> >> > David Race
>> >> >
>> >> > On Tuesday, February 18, 2014 12:00:34 PM Jonathan Perkins wrote:
>> >> >> Hello.  Are you trying to build MVAPICH2 in shared memory only mode or
>> >> >> do you intend to use your ethernet fabric for inter-node
>> >> >> communication?
>> >> >>
>> >> >> If you intend to use ethernet you will need to select one of the
>> >> >> tcp/ip channels at configure time.
>> >> >> I suggest adding the following `--with-device=ch3:nemesis'
>> >> >>
>> >> >> If you intend to use shared memory only, you can build using our
>> >> >> default channel but you are required to have the infiniband devel
>> >> >> packages installed for the build to complete successfully.
>> >> >>
>> >> >> You can find more information in our userguide:
>> >> >> http://mvapich.cse.ohio-state.edu/support/user_guide_mvapich2-2.0b.html
>> >> >>
>> >> >> On Tue, Feb 18, 2014 at 11:22 AM, David Race
>> >> >>
>> >> >> <david.race at daviddenisefamily.com> wrote:
>> >> >> > Hello,
>> >> >> >
>> >> >> > I have started building mvapich2-2.0b on an ethernet-only system that uses
>> >> >> > SLURM as the resource manager.  I am using
>> >> >> >
>> >> >> > ./configure --disable-romio --with-slurm=/opt/slurm/default --with-pm=no --with-pmi=slurm
>> >> >> >
>> >> >> > I get the following error:
>> >> >> >
>> >> >> > checking for size of MPI_Status... configure: error: unable to compute
>> >> >> > status size, are you compiling on a non-2s-complement host?
>> >> >> >
>> >> >> > This is an Intel processor, so the error message doesn't seem correct.  I
>> >> >> > haven't started the detailed debugging yet.
>> >> >> >
>> >> >> > Has anyone seen this error?
>> >> >> >
>> >> >> > Best Regards
>> >> >> >
>> >> >> > David Race



-- 
Jonathan Perkins
http://www.cse.ohio-state.edu/~perkinjo