[mvapich-discuss] 2.3.5-1 performance

Lana Deere lana.deere at gmail.com
Mon Dec 21 13:15:54 EST 2020


The runs I have been doing with 2.3.5-1 to try to reproduce the SHMEM_Sync
bus error have a new issue: they are running much slower than they did
with 2.3.1 and 2.3.4.  As best I've managed to measure it, under 2.3.5-1
the program spends 2x-4x as much time in the MPI calls as with the previous
versions.  The only difference between the slower and the faster runs is
the copy of libmpi.so.12.1.1 which is available to the program.  I can swap
in the 2.3.4 libmpi.so without rebuilding the program and I get the
performance back.

There is one configure difference, namely --enable-fast=O2,ndebug on
2.3.5-1 vs. --enable-fast=O3,ndebug on the other versions.
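
For anyone who wants to check two builds for other configure differences,
here is a minimal sketch of how the option lists could be compared.  It
assumes the two configure command lines have already been collected as
plain strings and that both spell the option --enable-fast; the function
names are made up for illustration and are not part of MVAPICH2.

```python
import shlex


def parse_configure_flags(line):
    """Map each --flag[=value] token to its value (None if no value)."""
    flags = {}
    for token in shlex.split(line):
        if not token.startswith("--"):
            continue  # skip the configure command itself and positional args
        name, sep, value = token.partition("=")
        flags[name] = value if sep else None
    return flags


def diff_configure_flags(old_line, new_line):
    """Return {flag: (old_value, new_value)} for every flag that differs."""
    old = parse_configure_flags(old_line)
    new = parse_configure_flags(new_line)
    changed = {}
    for name in sorted(set(old) | set(new)):
        before = old.get(name, "<absent>")
        after = new.get(name, "<absent>")
        if before != after:
            changed[name] = (before, after)
    return changed


if __name__ == "__main__":
    # Only the one option quoted in this message; real lines would be longer.
    older = "--enable-fast=O3,ndebug"
    newer = "--enable-fast=O2,ndebug"
    for flag, (before, after) in diff_configure_flags(older, newer).items():
        print(f"{flag}: {before} -> {after}")
```

With the two options above, the only entry reported is --enable-fast,
changing from O3,ndebug to O2,ndebug.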

The first thing I thought of was that maybe it had decided to select
Ethernet rather than InfiniBand for the transport, but there seems to be a
lot of InfiniBand traffic at the correct times when the program is
running.  Is there some way to get MPI to output explicitly the transport
it selects, just to double check?

Are there any changes in 2.3.5-1 which seem like they might cause the
performance difference?  Any environment variables which might need to be
set or set differently than before?

Thanks.

.. Lana (lana.deere at gmail.com)