[mvapich-discuss] Error in initializing MVAPICH2 ptmalloc library

Hari Subramoni subramoni.1 at osu.edu
Thu Jan 23 21:44:17 EST 2014


Hello Hajime,

With certain applications, we have seen this happen because libraries are
loaded in the wrong order. In that scenario, MVAPICH2 disables the
registration cache so that the run stays correct; however, performance may
be degraded.

However, we believe that in that case the message should appear every time
you run the application. It is very interesting that it sometimes appears
and sometimes does not. Could you please send us a reproducer that
exhibits the same behavior? This will help us debug the issue on our local
cluster.

Regards,
Hari.


On Thu, Jan 23, 2014 at 4:22 PM, Hajime Fujita <hfujita at uchicago.edu> wrote:

> Dear MVAPICH development team,
>
> Sometimes I see the following warning message at the beginning of the
> MPI program execution.
> ----
> WARNING: Error in initializing MVAPICH2 ptmalloc library.Continuing
> without InfiniBand registration cache support.
> ----
>
> It looks like this message is printed during MPI_Init (actually,
> MPI_Init_thread). As I wrote "sometimes", this does not necessarily occur.
>
> Do you have any idea about the cause of this message and its consequences
> for performance?
>
> Some things that may be specific to my program are:
> 1. it uses multi-threading (MPI_THREAD_MULTIPLE)
> 2. it calls MPI through a shared library, meaning that
>   program -> shared library (which calls MPI) -> MVAPICH2
>
> I'm using the Midway cluster at UChicago.
> http://rcc.uchicago.edu/resources/midway_specs.html
>
> Version information of MVAPICH2 is as follows.
> ----
> [hfujita at midway-login2 tests]$ mpichversion
> MVAPICH2 Version:       2.0b
> MVAPICH2 Release date:  Fri Nov  8 11:17:40 EST 2013
> MVAPICH2 Device:        ch3:mrail
> MVAPICH2 configure:     --prefix=/project/aachien/local/mvapich2-2.0b
> --enable-shared
> MVAPICH2 CC:    gcc    -DNDEBUG -DNVALGRIND -O2
> MVAPICH2 CXX:   g++   -DNDEBUG -DNVALGRIND -O2
> MVAPICH2 F77:   gfortran -L/lib -L/lib   -O2
> MVAPICH2 FC:    gfortran   -O2
> [hfujita at midway-login2 tests]$ mpiexec --version
> HYDRA build details:
>     Version:                                 3.1b1
>     Release Date:                            Fri Nov  8 11:17:40 EST 2013
>     CC:                              gcc
>     CXX:                             g++
>     F77:                             gfortran
>     F90:                             gfortran
>     Configure options:                       '--disable-option-checking'
> '--prefix=/project/aachien/local/mvapich2-2.0b' '--enable-shared'
> '--disable-checkerrors' '--cache-file=/dev/null' '--srcdir=.' 'CC=gcc'
> 'CFLAGS= -DNDEBUG -DNVALGRIND -O2' 'LDFLAGS=-L/lib -L/lib -L/lib
> -Wl,-rpath,/lib -L/lib -Wl,-rpath,/lib -L/lib -L/lib' 'LIBS=-libmad
> -libumad -libverbs -lrt -lhwloc -lpthread -lhwloc ' 'CPPFLAGS=
> -I/project/aachien/local/src/mvapich2-2.0b/src/mpid/ch3/channels/mrail/include
> -I/project/aachien/local/src/mvapich2-2.0b/src/mpid/ch3/channels/mrail/include
> -I/project/aachien/local/src/mvapich2-2.0b/src/mpid/ch3/channels/common/include
> -I/project/aachien/local/src/mvapich2-2.0b/src/mpid/ch3/channels/common/include
> -I/project/aachien/local/src/mvapich2-2.0b/src/mpid/ch3/channels/mrail/src/gen2
> -I/project/aachien/local/src/mvapich2-2.0b/src/mpid/ch3/channels/mrail/src/gen2
> -I/project/aachien/local/src/mvapich2-2.0b/src/mpid/common/locks
> -I/project/aachien/local/src/mvapich2-2.0b/src/mpid/common/locks
> -I/project/aachien/local/src/mvapich2-2.0b/src/util/wrappers
> -I/project/aachien/local/src/mvapich2-2.0b/src/util/wrappers
> -I/project/aachien/local/src/mvapich2-2.0b/src/mpl/include
> -I/project/aachien/local/src/mvapich2-2.0b/src/mpl/include
> -I/project/aachien/local/src/mvapich2-2.0b/src/openpa/src
> -I/project/aachien/local/src/mvapich2-2.0b/src/openpa/src
> -I/project/aachien/local/src/mvapich2-2.0b/src/mpi/romio/include
> -I/include -I/include -I/include -I/include'
>     Process Manager:                         pmi
>     Launchers available:                     ssh rsh fork slurm ll lsf
> sge manual persist
>     Topology libraries available:            hwloc
>     Resource management kernels available:   user slurm ll lsf sge pbs
> cobalt
>     Checkpointing libraries available:
>     Demux engines available:                 poll select
> ----
>
> If you need more information or logs, please let me know.
>
>
> Thank you in advance!
> Hajime
>
> --
> Hajime Fujita
> Postdoctoral Scholar, Large-Scale Systems Group
> Department of Computer Science, The University of Chicago
> http://www.cs.uchicago.edu/people/hfujita
>