[mvapich-discuss] MVAPICH2 2.1a: Code stalls on Sandy Bridge, works on Westmere
Thompson, Matt (GSFC-610.1)[SCIENCE SYSTEMS AND APPLICATIONS INC]
matthew.thompson at nasa.gov
Mon Jan 5 12:37:19 EST 2015
All,
I'm trying to diagnose an issue that is appearing in a model I work on:
GEOS-5. The problem seems to be architecture-dependent and, most likely,
due to MVAPICH2 (as the same code compiled with Intel MPI 5 and the same
Fortran compiler seems to have no problem).
I can try to go into more detail (for example if I start adding print
statements to find the stall, it can sometimes cure it!), but my first
question is:
Are there environment variables that control architecture-dependent
behaviour of MVAPICH2?
I ask because I saw in the recent MVAPICH2 2.1rc1 announcement:
(NEW) MVAPICH2 2.1rc1 (based on MPICH 3.1.3) with ...
*optimization and tuning for Haswell architecture*
(I tried searching the User's Guide for "Haswell", but no luck. Could
you point me to possible switches?)
Note also that the cause might not be Westmere/Sandy Bridge tuning at
all, but the underlying fabric. Here at NCCS, the Westmeres, I believe,
are on DDR interconnects, while the Sandy Bridges I was using are on FDR
(which, I think, is actually connected to a QDR main switch), and some
are on QDR.
If I turn on MV2_SHOW_ENV_INFO=2, I see these differences (left, Sandy;
right, Westmere):
>PROCESSOR ARCH NAME : MV2_ARCH_INTEL_XEON_E5_2670_16 | PROCESSOR ARCH NAME : MV2_ARCH_INTEL_XEON_X5650_12
>PROCESSOR MODEL NUMBER : 45 | PROCESSOR MODEL NUMBER : 44
>HCA NAME : MV2_HCA_MLX_CX_FDR | HCA NAME : MV2_HCA_MLX_CX_DDR
>MV2_RDMA_FAST_PATH_BUF_SIZE : 5120 | MV2_RDMA_FAST_PATH_BUF_SIZE : 9216
>MV2_EAGERSIZE_1SC : 8192 | MV2_EAGERSIZE_1SC : 4096
>MV2_SMP_EAGERSIZE : 32769 | MV2_SMP_EAGERSIZE : 65537
>MV2_SMPI_LENGTH_QUEUE : 131072 | MV2_SMPI_LENGTH_QUEUE : 262144
>MV2_SMP_NUM_SEND_BUFFER : 16 | MV2_SMP_NUM_SEND_BUFFER : 32
>MPISPAWN_MPIRUN_HOST : borg01y001 | MPISPAWN_MPIRUN_HOST : borgi117
>MPISPAWN_MPIRUN_ID : 21662 | MPISPAWN_MPIRUN_ID : 23359
>MPISPAWN_NNODES : 6 | MPISPAWN_NNODES : 8
>PMI_PORT : borg01y001:44036 | PMI_PORT : borgi117:37003
>MV2_DEFAULT_MTU : 4 | MV2_DEFAULT_MTU : 3
>MV2_DEFAULT_PKEY : 393216 | MV2_DEFAULT_PKEY : 524288
>MV2_NUM_NODES_IN_JOB : 6 | MV2_NUM_NODES_IN_JOB : 8
Some of these can be ignored (MPISPAWN, PROCESSOR, etc.), but the
differing MV2_ settings look like candidates to experiment with.
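Picking the MV2_ differences out of a side-by-side comparison like the one above can be scripted. The following is a minimal Python sketch, assuming the `>name : value | name : value` layout shown above; `parse_side_by_side` is a hypothetical helper written for illustration, not part of MVAPICH2.

```python
# Side-by-side lines copied from the MV2_SHOW_ENV_INFO=2 comparison above
# (left: Sandy Bridge, right: Westmere).
DIFF = """\
>PROCESSOR ARCH NAME : MV2_ARCH_INTEL_XEON_E5_2670_16 | PROCESSOR ARCH NAME : MV2_ARCH_INTEL_XEON_X5650_12
>PROCESSOR MODEL NUMBER : 45 | PROCESSOR MODEL NUMBER : 44
>HCA NAME : MV2_HCA_MLX_CX_FDR | HCA NAME : MV2_HCA_MLX_CX_DDR
>MV2_RDMA_FAST_PATH_BUF_SIZE : 5120 | MV2_RDMA_FAST_PATH_BUF_SIZE : 9216
>MV2_EAGERSIZE_1SC : 8192 | MV2_EAGERSIZE_1SC : 4096
>MV2_SMP_EAGERSIZE : 32769 | MV2_SMP_EAGERSIZE : 65537
>MV2_SMPI_LENGTH_QUEUE : 131072 | MV2_SMPI_LENGTH_QUEUE : 262144
>MV2_SMP_NUM_SEND_BUFFER : 16 | MV2_SMP_NUM_SEND_BUFFER : 32
>MPISPAWN_MPIRUN_HOST : borg01y001 | MPISPAWN_MPIRUN_HOST : borgi117
>MPISPAWN_MPIRUN_ID : 21662 | MPISPAWN_MPIRUN_ID : 23359
>MPISPAWN_NNODES : 6 | MPISPAWN_NNODES : 8
>PMI_PORT : borg01y001:44036 | PMI_PORT : borgi117:37003
>MV2_DEFAULT_MTU : 4 | MV2_DEFAULT_MTU : 3
>MV2_DEFAULT_PKEY : 393216 | MV2_DEFAULT_PKEY : 524288
>MV2_NUM_NODES_IN_JOB : 6 | MV2_NUM_NODES_IN_JOB : 8
"""


def parse_side_by_side(text):
    """Return {name: (left_value, right_value)} for each 'a | b' line."""
    result = {}
    for raw in text.splitlines():
        line = raw.lstrip(">").strip()
        if "|" not in line:
            continue  # skip anything that is not a two-column line
        left, right = (side.strip() for side in line.split("|", 1))
        # Split on the first " : " only, so values such as "host:port" survive.
        name, _, left_value = left.partition(" : ")
        _, _, right_value = right.partition(" : ")
        result[name.strip()] = (left_value.strip(), right_value.strip())
    return result


parsed = parse_side_by_side(DIFF)
# Keep only entries whose *name* starts with MV2_ (drops PROCESSOR, HCA,
# MPISPAWN and PMI entries).
mv2_only = {k: v for k, v in parsed.items() if k.startswith("MV2_")}

for name, (sandy, westmere) in mv2_only.items():
    print(f"{name}: Sandy Bridge={sandy}, Westmere={westmere}")
```

The filter keys on the parameter name rather than the value, so `HCA NAME` and `PROCESSOR ARCH NAME` are dropped even though their values begin with `MV2_`. Entries such as `MV2_NUM_NODES_IN_JOB` still pass the filter and would need to be set aside by hand, since they reflect the job size rather than tuning.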
Some testing showed that if we set:
MV2_SMP_NUM_SEND_BUFFER=32
on the Sandy Bridge, the issue was avoided. Huzzah, right? Well, when an
end-user tried it...it hung for him at some point. So...yeah. Should I
perhaps use all 5 settings from the DDR (Westmere) run?
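As a sketch of what "all 5 settings" could look like in practice, the Python below builds an environment that carries the Westmere values over to a Sandy Bridge run. It assumes the five settings meant are the buffer and queue parameters from the comparison (not MTU or PKEY, which follow the fabric). Whether each one is honored as a user-settable variable should be checked against the MVAPICH2 User Guide. `launch_env` and `as_assignments` are hypothetical helper names.

```python
import os

# ASSUMPTION: the "5 settings" are these tuning parameters, with the values
# reported for the Westmere/DDR run in the MV2_SHOW_ENV_INFO=2 output.
WESTMERE_VALUES = {
    "MV2_RDMA_FAST_PATH_BUF_SIZE": "9216",
    "MV2_EAGERSIZE_1SC": "4096",
    "MV2_SMP_EAGERSIZE": "65537",
    "MV2_SMPI_LENGTH_QUEUE": "262144",
    "MV2_SMP_NUM_SEND_BUFFER": "32",
}


def launch_env(overrides, base=None):
    """Return a copy of the base environment with the overrides applied."""
    env = dict(os.environ if base is None else base)
    env.update(overrides)
    return env


def as_assignments(overrides):
    """Render overrides as sorted NAME=VALUE strings for a job script."""
    return [f"{name}={value}" for name, value in sorted(overrides.items())]


if __name__ == "__main__":
    for assignment in as_assignments(WESTMERE_VALUES):
        print(assignment)
```

The dictionary returned by `launch_env` can be passed as the `env` argument of `subprocess.run` when starting the MPI launcher from Python, or the printed `NAME=VALUE` lines can be exported in the batch script instead. Applying the settings one at a time rather than all together would also show which of them, if any, actually changes the stall.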
Any ideas from the experts on why IMPI 5 would not be affected in the
same situation?
Matt
--
Matt Thompson SSAI, Sr Software Test Engr
NASA GSFC, Global Modeling and Assimilation Office
Code 610.1, 8800 Greenbelt Rd, Greenbelt, MD 20771
Phone: 301-614-6712 Fax: 301-614-6246