[mvapich-discuss] Error in initializing MVAPICH2 ptmalloc library with mpi4py

Hari Subramoni subramoni.1 at osu.edu
Thu Apr 2 08:29:26 EDT 2015


Hello Mehmet,

Sorry to hear that you are hitting performance degradation when
registration cache is disabled.

Whether disabling registration cache will have a negative effect on
application performance depends entirely on the communication pattern of
the application. If the application mostly uses small to medium sized
messages (roughly 16 KB or smaller), then disabling registration cache
should have little to no impact on the performance of the application.
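
If you want to check where your own message sizes fall, a simple mpi4py
ping-pong timing loop can show at which sizes performance starts to depend
on the large-message path. This is only a minimal sketch, not part of the
original exchange; the size list and repetition count are arbitrary:

# ping_pong.py -- minimal sketch: time round trips at a few message sizes
# to see where the large-message (rendezvous) path starts to matter.
# Run with e.g.: mpirun -np 2 python ./ping_pong.py  (ranks on different nodes)
from mpi4py import MPI
import numpy, time

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
reps = 100

for nbytes in (1 << 10, 16 << 10, 128 << 10, 1 << 20):  # 1 KB .. 1 MB
    buf = numpy.zeros(nbytes, dtype='b')
    comm.Barrier()
    t0 = time.time()
    for i in range(reps):
        if rank == 0:
            comm.Send([buf, MPI.BYTE], dest=1)
            comm.Recv([buf, MPI.BYTE], source=1)
        elif rank == 1:
            comm.Recv([buf, MPI.BYTE], source=0)
            comm.Send([buf, MPI.BYTE], dest=0)
    comm.Barrier()
    if rank == 0:
        print "%8d bytes: %8.1f us per round trip" % \
            (nbytes, (time.time() - t0) / reps * 1e6)

The small sizes should be largely unaffected by disabling the registration
cache; the larger sizes are where the tuning described below may help.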

However, if the application uses larger messages, then there might be an
impact depending on the frequency of communication. If this is the case,
then it might be useful to increase the size of the internal communication
buffer used by MVAPICH2 (the “MV2_VBUF_TOTAL_SIZE” environment variable)
and the switch point between the eager and rendezvous protocols (the
“MV2_IBA_EAGER_THRESHOLD” environment variable). In this scenario, we
recommend setting both to the same value, ideally slightly greater than
the median message size used by your application. A value of 128 KB may be
a good starting point; an example invocation is shown after the link below.
Please refer to the following section of the userguide for more information.

http://mvapich.cse.ohio-state.edu/static/media/mvapich/mvapich2-2.1rc2-userguide.html#x1-1130009.1.2
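
For example, with a 128 KB threshold (131072 bytes), and assuming your
launcher propagates environment variables to the remote ranks (otherwise
pass them explicitly on the launcher command line), the run from your mail
would become something like:

# MV2_VBUF_TOTAL_SIZE=131072 MV2_IBA_EAGER_THRESHOLD=131072 \
      mpirun -np 32 python ./test.py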

We are in the process of developing a fix for this issue so that the
registration cache mechanism works with Python codes as well. We hope to
have this ready in the MVAPICH2-2.2 series and will keep you posted on our
progress. I hope this temporary workaround is acceptable until then.

Best Regards,
Hari.

On Wed, Apr 1, 2015 at 6:29 PM, Mehmet Belgin <mehmet.belgin at oit.gatech.edu>
wrote:

> Greetings!
>
> We recently installed mpi4py for python/2.7.9 using intel/15.0 and
> mvapich/2.1. It works well for a single-node run, but we are getting this
> warning when using multiple nodes:
>
> WARNING: Error in initializing MVAPICH2 ptmalloc library.Continuing
> without InfiniBand registration cache support.
>
> The code still runs, but displays a really bad performance. Mvapich2
> itself is fine when running a regular C/C++/Fortran code, so this issue
> seems to be triggered by python's mpi4py only.
>
> I am using this simple test code:
>
> from mpi4py import MPI
> size = MPI.COMM_WORLD.Get_size()
> rank = MPI.COMM_WORLD.Get_rank()
> print " I  am rank %d of %d"%(rank,size)
>
> ------------
>
> # mpirun -np 32 python ./test.py
> WARNING: Error in initializing MVAPICH2 ptmalloc library.Continuing
> without InfiniBand registration cache support.
>  I  am rank 14 of 32
>  I  am rank 3 of 32
>  I  am rank 15 of 32
>
> Does this warning look familiar to anyone? Any suggestions?
>
> Thanks a lot!
> -Mehmet
>
> --
> =========================================
> Mehmet Belgin, Ph.D. (mehmet.belgin at oit.gatech.edu)
> Scientific Computing Consultant | OIT - Academic and Research Technologies
> Georgia Institute of Technology
> 258 4th Street, Rich Building, Room 326
> Atlanta, GA  30332-0700
> Office: (404) 385-0665
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mailman.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>