[mvapich-discuss] Runtime warning - Error in initializing MVAPICH2 ptmalloc library

Jonathan Perkins perkinjo at cse.ohio-state.edu
Wed Feb 24 17:10:33 EST 2016


Thank you for providing this.  I'll see if my build matches up once I get
my hands on a Fedora environment.

As far as the RPM goes, I wasn't actually asking for the RPM file itself; I
was asking for the name of the RPM.  We provide multiple RPMs (X, GDR, EA,
etc.) and I wanted to be sure we were debugging the correct code/build.
Your mpiname output is sufficient, so you do not need to send any more
info at this time.
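
For anyone who wants to run the same check on their own binary, here is a
minimal sketch (not part of MVAPICH2 or OMB) that parses ldd output and flags
libraries whose names suggest a replacement malloc.  The function names and
the ALLOCATOR_HINTS list are assumptions made for illustration, and the list
is not exhaustive.  The sample lines are taken from the ldd output quoted
below.

```python
import re

# Assumption: a small, non-exhaustive list of name fragments for libraries
# that are known to replace malloc/free. Extend it for your own site.
ALLOCATOR_HINTS = ("jemalloc", "tcmalloc", "tbbmalloc")

# Matches lines such as:
#   libm.so.6 => /usr/lib64/libm.so.6 (0x00007ffff7a66000)
LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(\S+)\s+\(0x[0-9a-fA-F]+\)\s*$")


def parse_ldd(text):
    """Return {soname: resolved_path} for every 'name => path (addr)' line."""
    libs = {}
    for line in text.splitlines():
        match = LDD_LINE.match(line)
        if match:
            libs[match.group(1)] = match.group(2)
    return libs


def suspicious_allocators(libs):
    """Return sorted sonames that contain one of the allocator name hints."""
    return sorted(
        name for name in libs if any(hint in name for hint in ALLOCATOR_HINTS)
    )


if __name__ == "__main__":
    sample = """\
        linux-vdso.so.1 (0x00007ffff7ffd000)
        libm.so.6 => /usr/lib64/libm.so.6 (0x00007ffff7a66000)
        libmpi.so.12 => /usr/local/mvapich-debug/lib/libmpi.so.12 (0x00007ffff72f7000)
        libc.so.6 => /usr/lib64/libc.so.6 (0x00007ffff6f36000)
    """
    libs = parse_ldd(sample)
    print("libmpi resolves to:", libs.get("libmpi.so.12"))
    print("possible replacement allocators:", suspicious_allocators(libs))
```

An empty result only means that no library with one of those names is linked
dynamically.  It does not rule out an allocator that is linked statically or
injected at run time through LD_PRELOAD.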

On Wed, Feb 24, 2016 at 4:40 PM Nenad Vukicevic <nenad at intrepid.com> wrote:

> I can provide you with the RPM, but it is pretty much the same as what you
> provided for RHEL6.  I just rebuilt it on Fedora.
>
>
>
> On the other hand, the latest runs are all done from the locally built
> tree.  Here is the output from mpiname:
>
>
>
> MVAPICH2 2.2b Mon Nov 12 20:00:00 EST 2015 ch3:mrail
>
> Compilation
> CC: gcc    -DNDEBUG -DNVALGRIND -g -O2
> CXX: g++   -DNDEBUG -DNVALGRIND -g -O2
> F77: gfortran -L/lib -L/lib   -g -O2
> FC: gfortran   -g -O2
>
> Configuration
> --prefix=/usr/local/mvapich-debug --enable-g=all --enable-error-messages=all
>
>
>
> And ldd:
>
>
>
> [nenad@dev one-sided]$ ldd osu_acc_latency
>         linux-vdso.so.1 (0x00007ffff7ffd000)
>         libpthread.so.0 => /usr/lib64/libpthread.so.0 (0x00007ffff7d69000)
>         libm.so.6 => /usr/lib64/libm.so.6 (0x00007ffff7a66000)
>         libmpi.so.12 => /usr/local/mvapich-debug/lib/libmpi.so.12 (0x00007ffff72f7000)
>         libc.so.6 => /usr/lib64/libc.so.6 (0x00007ffff6f36000)
>         /lib64/ld-linux-x86-64.so.2 (0x0000555555554000)
>         libnuma.so.1 => /usr/lib64/libnuma.so.1 (0x00007ffff6d2a000)
>         libudev.so.1 => /usr/lib64/libudev.so.1 (0x00007ffff6d09000)
>         libpciaccess.so.0 => /usr/lib64/libpciaccess.so.0 (0x00007ffff6aff000)
>         libxml2.so.2 => /usr/lib64/libxml2.so.2 (0x00007ffff6795000)
>         libibmad.so.5 => /usr/lib64/libibmad.so.5 (0x00007ffff657b000)
>         librdmacm.so.1 => /usr/lib64/librdmacm.so.1 (0x00007ffff6365000)
>         libibumad.so.3 => /usr/lib64/libibumad.so.3 (0x00007ffff615c000)
>         libibverbs.so.1 => /usr/lib64/libibverbs.so.1 (0x00007ffff5f49000)
>         libdl.so.2 => /usr/lib64/libdl.so.2 (0x00007ffff5d45000)
>         librt.so.1 => /usr/lib64/librt.so.1 (0x00007ffff5b3c000)
>         libgfortran.so.3 => /usr/lib64/libgfortran.so.3 (0x00007ffff5810000)
>         libgcc_s.so.1 => /usr/lib64/libgcc_s.so.1 (0x00007ffff55f9000)
>         libquadmath.so.0 => /usr/lib64/libquadmath.so.0 (0x00007ffff53b9000)
>         libselinux.so.1 => /usr/lib64/libselinux.so.1 (0x00007ffff5196000)
>         libresolv.so.2 => /usr/lib64/libresolv.so.2 (0x00007ffff4f7a000)
>         libdw.so.1 => /usr/lib64/libdw.so.1 (0x00007ffff4d30000)
>         libcap.so.2 => /usr/lib64/libcap.so.2 (0x00007ffff4b2b000)
>         libz.so.1 => /usr/lib64/libz.so.1 (0x00007ffff4915000)
>         liblzma.so.5 => /usr/lib64/liblzma.so.5 (0x00007ffff46ee000)
>         libnl-route-3.so.200 => /usr/lib64/libnl-route-3.so.200 (0x00007ffff4488000)
>         libnl-3.so.200 => /usr/lib64/libnl-3.so.200 (0x00007ffff4267000)
>         libpcre.so.1 => /usr/lib64/libpcre.so.1 (0x00007ffff3ff6000)
>         libelf.so.1 => /usr/lib64/libelf.so.1 (0x00007ffff3dde000)
>         libbz2.so.1 => /usr/lib64/libbz2.so.1 (0x00007ffff3bcd000)
>         libattr.so.1 => /usr/lib64/libattr.so.1 (0x00007ffff39c7000)
>
>
>
>
>
> *From:* Jonathan Perkins [mailto:perkinjo at cse.ohio-state.edu]
> *Sent:* Wednesday, February 24, 2016 12:28 PM
> *To:* Nenad Vukicevic <nenad at intrepid.com>
> *Cc:* mvapich-discuss at cse.ohio-state.edu
> *Subject:* Re:
>
>
>
> Thanks for trying this out.  I believe you said that this was with Fedora
> 23 while using one of our RPMs.  Can you share with us the RPM used and the
> output of mpiname -a?  Can you also send us the output of ldd
> osu_acc_latency?  We'll try to reproduce this issue and think of a
> workaround.
>
>
>
> On Wed, Feb 24, 2016 at 12:54 PM Nenad Vukicevic <nenad at intrepid.com>
> wrote:
>
> I got the same result.
>
>
>
> [nenad@dev one-sided]$ mpirun -n 2 osu_acc_latency
> WARNING: Error in initializing MVAPICH2 ptmalloc library.Continuing
> without InfiniBand registration cache support.
> # OSU MPI_Accumulate latency Test v5.2
> # Window creation: MPI_Win_allocate
> # Synchronization: MPI_Win_flush
> # Size          Latency (us)
> 0                       0.14
> 1                       3.50
> 2                       3.12
> 4                       3.01
> 8                       3.99
> 16                      4.02
> 32                      4.04
> 64                      4.14
> 128                     4.79
> 256                     5.08
> 512                     5.50
> 1024                    6.42
> 2048                    7.51
> 4096                    8.93
> 8192                   13.53
> 16384                  26.76
> 32768                  43.19
> 65536                 193.44
> 131072                261.60
> 262144                397.56
> 524288                670.97
> 1048576              1214.61
> 2097152              2692.48
> 4194304              5265.03
>
>
>
> On Wed, Feb 24, 2016 at 8:22 AM, Nenad Vukicevic <nenad at intrepid.com>
> wrote:
>
> We are not doing anything special; the warning appears on a simple hello
> program. I have MVAPICH2 built from 2.2b with and without debugging, and
> also with an RPM created from the RHEL 6 spec.  Note that this is a fairly
> new Fedora (FC23).
>
>
>
> I'll try OMB for warnings.
>
>
>
>
>
> On Wed, Feb 24, 2016 at 7:28 AM, Jonathan Perkins <
> perkinjo at cse.ohio-state.edu> wrote:
>
> Hello Nenad.  Can you tell us more about your application?  Specifically,
> are there any libraries or special handling of memory allocation that
> may be linked in?  This type of problem usually occurs if MVAPICH2 isn't
> able to properly intercept the malloc and free calls.
>
> If you don't believe that your application is doing anything like this,
> can you verify that you're able to run the OMB suite (such as osu_latency)
> without this warning being emitted?
>
>
>
> On Wed, Feb 24, 2016 at 3:11 AM Nenad Vukicevic <nenad at intrepid.com>
> wrote:
>
> We are running mvapich 2.2b on Fedora 23. We are getting the following
> warning when running the code:
>
> WARNING: Error in initializing MVAPICH2 ptmalloc library.Continuing without
> InfiniBand registration cache support.
>
> I saw some previous discussions on the subject but none of the suggested
> solutions worked (there was a patch on 2.1a(?), plus LD_PRELOAD of mpich
> library, etc.).
>
> This error shows up on a simple hello test but only if we run on multiple
> nodes.  I can run multiple threads on the same node without causing this
> warning.
>
> Any idea what we can try?  I understand that there will be a slight
> decrease in performance if we disable ptmalloc.
>
> --
> Nenad
>