[mvapich-discuss] Error in initializing MVAPICH2 ptmalloc library with mpi4py

Mehmet Belgin mehmet.belgin at oit.gatech.edu
Thu Apr 2 11:46:24 EDT 2015


Thank you for your detailed reply Hari! I know that the researchers were 
using large messages, so we are looking forward to trying 2.2 when it's 
out :)

In the meantime, I will try the workaround you suggested (or we can
temporarily use other stacks like plain mpich3). This is not blocking
any research at all.
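
A minimal sketch of the tuning suggested in the quoted reply below, not
a definitive recipe. It picks one value slightly above the median
message size (never below the suggested 128 KB starting point) and
assigns it to both MV2_VBUF_TOTAL_SIZE and MV2_IBA_EAGER_THRESHOLD.
Assumptions: the helper names and the sample message sizes are
hypothetical, the values are given in bytes, and mpirun is the Hydra
launcher, which accepts "-genv NAME VALUE".

```python
def suggest_threshold(message_sizes, floor=128 * 1024):
    """Return a byte count slightly above the median message size.

    Starts at `floor` (the 128 KB starting point from the quoted reply)
    and doubles until the value exceeds the median.
    """
    ordered = sorted(message_sizes)
    median = ordered[len(ordered) // 2]  # upper median; a rough guide only
    size = floor
    while size <= median:
        size *= 2
    return size


def mvapich2_env(threshold):
    """Give both variables the same value, as the quoted reply recommends."""
    value = str(threshold)
    return {"MV2_VBUF_TOTAL_SIZE": value, "MV2_IBA_EAGER_THRESHOLD": value}


def mpirun_command(env, nprocs, script):
    """Build an mpirun command line that passes the settings via -genv."""
    cmd = ["mpirun", "-np", str(nprocs)]
    for name in sorted(env):
        cmd += ["-genv", name, env[name]]
    return cmd + ["python", script]


# Hypothetical message sizes in bytes (e.g. gathered from application logs).
sizes = [4096, 65536, 200000, 262144, 1048576]
env = mvapich2_env(suggest_threshold(sizes))
print(" ".join(mpirun_command(env, 32, "./test.py")))
```

With the sample sizes the median is 200000 bytes, so both variables end
up at 262144 (256 KB). Doubling from the starting point is only one
simple way to land "slightly greater than the median"; any value a bit
above the median would follow the same advice.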

Best,
Mehmet


On 4/2/15 8:29 AM, Hari Subramoni wrote:
> Hello Mehmet,
>
> Sorry to hear that you are hitting performance degradation when
> the registration cache is disabled.
>
> Whether disabling the registration cache will have a negative effect
> on application performance depends entirely on the communication
> pattern of the application. If the application uses mostly small to
> medium-sized messages (less than approximately 16 KB), then disabling
> the registration cache will mostly have no impact on the performance
> of the application.
>
> However, if the application uses larger messages, then there
> might be an impact, depending on the frequency of communication. If
> this is the case, then it might be useful to increase both the size
> of the internal communication buffer used by MVAPICH2 (the
> “MV2_VBUF_TOTAL_SIZE” environment variable) and the switch point
> between the eager and rendezvous protocols (the
> “MV2_IBA_EAGER_THRESHOLD” environment variable). In this scenario, we
> recommend that you set both to the same value (possibly slightly
> greater than the median message size used by your application).
> Perhaps you can try a value of 128 KB as a starting point. Please
> refer to the following section of the user guide for more information.
>
> http://mvapich.cse.ohio-state.edu/static/media/mvapich/mvapich2-2.1rc2-userguide.html#x1-1130009.1.2
>
> We are in the process of coming up with a fix for this issue so that
> the registration cache mechanism works with Python codes as well. We
> hope to have this ready in the MVAPICH2-2.2 series. We will keep you
> posted on how this proceeds. I hope this temporary workaround will be
> acceptable until then.
>
> Best Regards,
> Hari.
>
> On Wed, Apr 1, 2015 at 6:29 PM, Mehmet Belgin
> <mehmet.belgin at oit.gatech.edu> wrote:
>
>     Greetings!
>
>     We recently installed mpi4py for python/2.7.9 using intel/15.0 and
>     mvapich/2.1. It works well for a single-node run, but we are
>     getting this warning when using multiple nodes:
>
>     WARNING: Error in initializing MVAPICH2 ptmalloc
>     library.Continuing without InfiniBand registration cache support.
>
>     The code still runs, but performance is really bad. MVAPICH2
>     itself is fine when running regular C/C++/Fortran code, so this
>     issue seems to be triggered only by Python's mpi4py.
>
>     I am using this simple test code:
>
>     from mpi4py import MPI
>     # Query the total number of ranks and this process's rank
>     size = MPI.COMM_WORLD.Get_size()
>     rank = MPI.COMM_WORLD.Get_rank()
>     print(" I  am rank %d of %d" % (rank, size))
>
>     ------------
>
>     # mpirun -np 32 python ./test.py
>     WARNING: Error in initializing MVAPICH2 ptmalloc
>     library.Continuing without InfiniBand registration cache support.
>      I  am rank 14 of 32
>      I  am rank 3 of 32
>      I  am rank 15 of 32
>
>     Does this warning look familiar to anyone? Any suggestions?
>
>     Thanks a lot!
>     -Mehmet
>
>     -- 
>     =========================================
>     Mehmet Belgin, Ph.D. (mehmet.belgin at oit.gatech.edu)
>     Scientific Computing Consultant | OIT - Academic and Research
>     Technologies
>     Georgia Institute of Technology
>     258 4th Street, Rich Building, Room 326
>     Atlanta, GA  30332-0700
>     Office: (404) 385-0665

-- 
=========================================
Mehmet Belgin, Ph.D. (mehmet.belgin at oit.gatech.edu)
Scientific Computing Consultant | OIT - Academic and Research Technologies
Georgia Institute of Technology
258 4th Street, Rich Building, Room 326
Atlanta, GA  30332-0700
Office: (404) 385-0665
