[mvapich-discuss] 2.3.5-1 performance
    Subramoni, Hari 
    subramoni.1 at osu.edu
       
    Mon Dec 21 13:23:10 EST 2020
    
    
  
Hi, Lana.
The default configuration will ensure that IB is selected. Can you rerun the app after setting MV2_SHOW_ENV_INFO=3 with MVAPICH2 2.3.4 and MVAPICH2 2.3.5-1? MVAPICH2 will print a lot of information at the beginning. That is what we will be looking for.
Can you let us know what MPI calls are taking more time and at what scale?
In the meantime, can you please try the following environment variable combinations to see if any of those help?
  1.  MV2_ENABLE_AFFINITY=0
  2.  MV2_HYBRID_ENABLE_THRESHOLD=<nprocs+1>
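A minimal sketch of how these runs could be scripted, assuming the job is launched with mpirun_rsh, which accepts NAME=VALUE settings on its command line before the executable. The hostfile name `hosts`, the binary `./app`, the process count, and both helper function names are hypothetical placeholders. It is also an assumption that the two settings are tried separately as well as together. The script only prints the command lines and does not launch anything.

```python
import shlex


def build_mpirun_rsh(nprocs, hostfile, app, extra_env=None):
    """Return an mpirun_rsh argument list with MV2_* settings as NAME=VALUE.

    MV2_SHOW_ENV_INFO=3 is always included so the startup report is
    printed for every run, as requested above.
    """
    env = {"MV2_SHOW_ENV_INFO": "3"}
    env.update(extra_env or {})
    args = ["mpirun_rsh", "-np", str(nprocs), "-hostfile", hostfile]
    args += [f"{name}={value}" for name, value in env.items()]
    args.append(app)
    return args


def suggested_variants(nprocs):
    """Settings to try; the threshold is nprocs + 1, per the list above."""
    affinity = {"MV2_ENABLE_AFFINITY": "0"}
    hybrid = {"MV2_HYBRID_ENABLE_THRESHOLD": str(nprocs + 1)}
    return {
        "baseline": {},
        "no_affinity": affinity,
        "hybrid_threshold": hybrid,
        "both": {**affinity, **hybrid},  # assumption: also try them together
    }


if __name__ == "__main__":
    nprocs = 64  # hypothetical job size
    for label, extra in suggested_variants(nprocs).items():
        cmd = build_mpirun_rsh(nprocs, "hosts", "./app", extra)
        print(f"# {label}")
        print(shlex.join(cmd))
```

Running the same set of commands once against the 2.3.4 library and once against 2.3.5-1 gives startup reports and timings that can be compared side by side.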
Best,
Hari.
From: mvapich-discuss-bounces at cse.ohio-state.edu <mvapich-discuss-bounces at mailman.cse.ohio-state.edu> On Behalf Of Lana Deere
Sent: Monday, December 21, 2020 1:16 PM
To: mvapich-discuss at cse.ohio-state.edu <mvapich-discuss at mailman.cse.ohio-state.edu>
Subject: [mvapich-discuss] 2.3.5-1 performance
The runs I have been doing with 2.3.5-1 to try to reproduce the SHMEM_Sync bus error have a new issue: they are running much slower than they did with 2.3.1 and 2.3.4.  As best I've managed to measure it, under 2.3.5-1 the program is spending 2x-4x as much time in the MPI calls as in previous versions.  The only difference between the slower and the faster runs is the copy of libmpi.so.12.1.1 which is available to the program -- I can swap in the 2.3.4 libmpi.so without rebuilding the program and I get the performance back.
There is one configure difference, namely --enable-fast=O2,ndebug on 2.3.5-1 vs. --enable-fast=O3,ndebug on the other versions.
The first thing I thought of was that maybe it had decided to select Ethernet rather than InfiniBand for the transport, but there seems to be a lot of InfiniBand traffic at the correct times when the program is running.  Is there some way to get MPI to output explicitly the transport it selects, just to double check?
Are there any changes in 2.3.5-1 which seem like they might cause the performance difference?  Any environment variables which might need to be set or set differently than before?
Thanks.
.. Lana (lana.deere at gmail.com)