[mvapich-discuss] GROMACS: Memory Allocation (mv2)/Segmentation Fault (mv2-gdr)

Le, Viet Duc vdle at moasys.com
Wed Sep 9 22:40:12 EDT 2020


Dear Kawthar,

Thanks for taking the time to test and confirm the issue with GROMACS.
We tested with 2019.6 (gcc/4.8.5), but the same error was observed with the
latest version, 2020.3 (gcc/8.3.0), as you stated.
Regressing further back to version 2016.4 did not help either.

Unfortunately, tuning MV2_CUDA_BLOCK_SIZE did not circumvent the issue on
either the Ivy Bridge or the Skylake CPUs at our disposal.
The failure rate is highest when the CPU:GPU ratio is lowered; for instance,
8:1 is the minimum ratio for benchRIB.tpr (see the sketch below).
I will reach out to the GROMACS forum regarding their memory allocation
routine.
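
For reference, this is roughly what we swept; the block-size values are
illustrative of our tests, and the launch mirrors the job script quoted below:
>>> begin of tuning sketch
# none of these block sizes avoided the crash on our nodes
for bs in 262144 2097152 8388608; do
  MV2_CUDA_BLOCK_SIZE=${bs} srun --ntasks=8 gmx_mpi mdrun -s ./benchRIB.tpr \
    -nsteps 2000 -notunepme -noconfout -pin on -v
done
<<< end of tuning sketch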

$ srun --mpi=list
srun: MPI types are...
srun: none
srun: openmpi
srun: pmi2

Using 'pmi2' explicitly, the following error was observed in addition to the
segmentation fault:
srun: error: eio_message_socket_accept: slurm_receive_msg[10.151.0.7]: Zero
Bytes were transmitted or received
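By 'explicitly' we mean selecting pmi2 on the srun command line, roughly as
follows (the mdrun arguments mirror the job script quoted below):
$ srun --mpi=pmi2 gmx_mpi mdrun -s ./benchRIB.tpr -nsteps 2000 -notunepme -noconfout -pin on -v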
From gdb:
#0  0x0000000001d13b68 in debug ()
#1  0x00002ba0bbf37161 in ?? ()
#2  0x00002ba0bbfecd3c in ?? ()
#3  0x0000000000000000 in ?? ()
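For reference, the trace above was obtained from the core dump along these
lines (the core file name is illustrative):
$ gdb ./bin/gmx_mpi core.12345
(gdb) bt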
So the backtrace is not particularly helpful. For now, we have settled on the
without-slurm rpm.

Regards.
Viet-Duc

On Tue, Sep 8, 2020 at 5:39 AM Shafie Khorassani, Kawthar <
shafiekhorassani.1 at buckeyemail.osu.edu> wrote:

> Hi Viet-Duc,
>
> We were able to reproduce the "Not enough memory" issue you were seeing
> with MVAPICH2, GROMACS 2020.3, and GCC 8.3.0. We were only able to
> reproduce it on x86-based systems with Skylake. Can you set
> MV2_CUDA_BLOCK_SIZE=8388608 at run time and let us know whether it resolves
> the memory issue? However, we were unable to reproduce the segfault you
> were seeing at startup with MVAPICH2-GDR + srun. Can you let us know which
> version of PMI you are using for the MVAPICH2-GDR run (i.e. PMIv1 or PMIv2)?
>
>
> Thank you,
>
>
> Kawthar Shafie Khorassani
>
>
> ________________________________________
> From: mvapich-discuss-bounces at cse.ohio-state.edu <
> mvapich-discuss-bounces at mailman.cse.ohio-state.edu> on behalf of Le, Viet
> Duc <vdle at moasys.com>
> Sent: Friday, August 28, 2020 2:18 AM
> To: mvapich-discuss at cse.ohio-state.edu
> Subject: [mvapich-discuss] GROMACS: Memory Allocation (mv2)/Segmentation
> Fault (mv2-gdr)
>
> Hello,
>
> While testing the latest version of mvapich2/mvapich2-gdr (2.3.4) with
> GROMACS (2019.6), we encountered two peculiar issues.
> Our setup and build environment are described below; we hope this helps
> with reproducing the issues.
>
> [hardware]
> - Xeon Gold 6230 (Skylake)
> - 2 x Tesla V100 (PIX connection)
>
> [software]
> - CentOS Linux release 7.4.1708
> - slurm 18.08.6
> - gcc/4.8.5, cuda/10.1
> - MLNX_OFED_LINUX-4.4-2.0.7.0
> - mvapich2: ./configure --with-pm=slurm --with-pmi=pmi2
> --with-slurm=/usr/local --enable-cuda  --with-cuda=/apps/cuda/10.1
> - mvapich2-gdr:
> mvapich2-gdr-mcast.cuda10.1.mofed4.4.gnu4.8.5.slurm-2.3.4-1.el7.x86_64.rpm
> (from the mvapich2 homepage)
> - reference: openmpi (3.1.5)
>
> [gromacs] 2019.6 is the last version that can be built with gcc/4.8.5
> $ tar xzvf gromacs-2019.6.tar.gz
> $ cd gromacs-2019.6
> $ mkdir build
> $ cd build
> $ cmake ..  -DCMAKE_C_COMPILER=mpicc -DCMAKE_CXX_COMPILER=mpicxx
> -DGMX_SIMD=AVX_512 -DGMX_MPI=on -DGMX_CUDA_TARGET_SM=70
> -DGMX_BUILD_OWN_FFTW=ON
> $ make
> The resulting binary, gmx_mpi, is located in the ./bin directory under the
> build directory.
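> As an optional sanity check, the build configuration can be confirmed with:
> $ ./bin/gmx_mpi --version   # reports the MPI library and GPU (CUDA) support compiled in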
>
> [input files]
> Inputs are taken from MPIBPC:
> https://www.mpibpc.mpg.de/grubmueller/bench
> (benchRIB, 2 M atoms, ribosome in water)
>
> [job scripts]
> Important MV2_* variables such as MV2_USE_CUDA/MV2_USE_GDRCOPY are
> properly set via environment modules.
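> For reference, the modules roughly amount to the following (illustrative;
> not the full module contents):
> export MV2_USE_CUDA=1
> export MV2_USE_GDRCOPY=1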
> >>> begin of slurm script
> #!/usr/bin/env bash
> #SBATCH --partition=skl_v100_2
> #SBATCH --nodes=1
> #SBATCH --ntasks-per-node=8
> #SBATCH --gres=gpu:2
> #SBATCH --job-name=test-mv2
> #SBATCH --error=%j.stderr
> #SBATCH --output=%j.stdout
> #SBATCH --time=24:00:00
> #SBATCH --comment=gromacs
>
> # use gromacs internal affinity setting.
> export MV2_ENABLE_AFFINITY=0
>
> module load gcc/4.8.5 cuda/10.1
> module load cudampi/mvapich2-2.3.4 # or cudampi/mvapich2-gdr-2.3.4,
> respectively.
>
> srun gmx_mpi mdrun -s ./benchRIB.tpr -nsteps 2000 -notunepme -noconfout
> -pin on -v
> <<< end of slurm script
>
> [mvapich2-2.3.4 error: failure to allocate small memory]
> >>> begin of error message
> Source file: src/gromacs/utility/smalloc.cpp (line 226)
> MPI rank:    3 (out of 8)
>
> Fatal error:
> Not enough memory. Failed to realloc 308080 bytes for nbs->cell,
> nbs->cell=5206b8d0
> (called from file [...]/nbnxn_grid.cpp, line 1502)
> <<< end of error message
> Description of the error:
> - The job crashes randomly when an arbitrary MPI rank fails to allocate
> memory. Jobs do sometimes run to completion, which makes the error
> unpredictable.
> - The input benchRIB.tpr is rather small, using only about 10 GB of host
> memory on the Skylake node and about 1.5 GB per GPU, as shown in the
> attached file.
> - If memory were truly insufficient, GROMACS would return the above message
> with a very large negative value, for example: 'Failed to reallocate
> -12415232232 bytes...'
> - OpenMPI works reliably without issue. We therefore suspect a memory
> allocation issue related to mvapich2.
>
> [mvapich2-gdr-2.3.4 error: segmentation fault with srun]
> >>> begin of error message
> [gpu31:mpi_rank_2][error_sighandler] Caught error: Segmentation fault
> (signal 11)
> srun: error: gpu31: tasks 0-7: Segmentation fault (core dumped)
> <<< end of error message
> Description of the error:
> - The Slurm job crashes immediately at startup; srun does not appear to
> play well with mvapich2-gdr.
>
> The two issues above were also observed with the latest version of GROMACS
> (2020.3) and gcc/8.3.0.
> We would appreciate your insights into this matter.
>
> Regards.
> Viet-Duc
>