[mvapich-discuss] MVAPICH2 2.1a, PMI2 interface, SLURM srun parallel launcher

Jonathan Perkins perkinjo at cse.ohio-state.edu
Wed Oct 15 14:09:07 EDT 2014


On Wed, Oct 15, 2014 at 06:30:16PM +0100, Filippo Spiga wrote:
> Dear MVAPICH people,
> 
> I am experimenting with the new PMI2 interface recently added to MVAPICH2 2.1a. My use case is quite simple: 
> 1. I build MV2 with PMI2 support (as the documentation says)
> 2. I compile an application using mpiXXX wrappers enabling both MPI and OpenMP
> 3. I want to run the parallel application using srun and its "--cpu_bind" option
> 
> Here what I did:
> 
> 1)
> ./configure CC=icc CXX=icpc F77=ifort FC=ifort --prefix=/usr/local/Cluster-Users/fs395/mvapich2/2.1a/intel --with-device=ch3:mrail --with-rdma=gen2 --enable-rdma-cm --disable-blcr  --enable-threads=default --enable-shared --enable-sharedlibs=gcc --enable-cxx --enable-fc --enable-f77 --enable-g=none --enable-fast --with-limic2=$LIMIC2_ROOT --with-limic2-include=$LIMIC2_ROOT/include --with-limic2-libpath=$LIMIC2_ROOT/lib --enable-romio --with-hwloc --without-cuda --with-slurm=/usr/local/Cluster-Apps/slurm-test --with-slurm-include=/usr/local/Cluster-Apps/slurm-test/include --with-slurm-lib=/usr/local/Cluster-Apps/slurm-test/lib64 --with-pmi=pmi2 --with-pm=slurm
> 
> 2)
> 
> [fs395 at login-sand3 PW-AUSURF112-K_MV2]$ ldd pw-omp-mv2.x
>         linux-vdso.so.1 =>  (0x00007fff81fff000)
>         libmkl_intel_lp64.so => /usr/local/Cluster-Apps/intel/mkl/10.3.10.319/composer_xe_2011_sp1.10.319/mkl/lib/intel64/libmkl_intel_lp64.so (0x00007f6649da2000)
>         libmkl_intel_thread.so => /usr/local/Cluster-Apps/intel/mkl/10.3.10.319/composer_xe_2011_sp1.10.319/mkl/lib/intel64/libmkl_intel_thread.so (0x00007f6648d23000)
>         libmkl_core.so => /usr/local/Cluster-Apps/intel/mkl/10.3.10.319/composer_xe_2011_sp1.10.319/mkl/lib/intel64/libmkl_core.so (0x00007f6647cac000)
>         libmpifort.so.12 => /usr/local/Cluster-Users/fs395/mvapich2/2.1a/intel/lib/libmpifort.so.12 (0x00007f6647a75000)
>         libmpi.so.12 => /usr/local/Cluster-Users/fs395/mvapich2/2.1a/intel/lib/libmpi.so.12 (0x00007f6647108000)
>         libpmi2.so.0 => /usr/local/Cluster-Apps/slurm-test/lib/libpmi2.so.0 (0x00007f6646ef0000)
>         libm.so.6 => /lib64/libm.so.6 (0x00007f6646c41000)
>         libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f6646a24000)
>         libc.so.6 => /lib64/libc.so.6 (0x00007f6646690000)
>         libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f6646479000)
>         libdl.so.2 => /lib64/libdl.so.2 (0x00007f6646275000)
>         libnuma.so.1 => /usr/lib64/libnuma.so.1 (0x00007f664606c000)
>         libpciaccess.so.0 => /usr/lib64/libpciaccess.so.0 (0x00007f6645e63000)
>         libxml2.so.2 => /usr/lib64/libxml2.so.2 (0x00007f6645b10000)
>         libibmad.so.5 => /usr/lib64/libibmad.so.5 (0x00007f66458f4000)
>         librdmacm.so.1 => /usr/lib64/librdmacm.so.1 (0x00007f66456e0000)
>         libibumad.so.3 => /usr/lib64/libibumad.so.3 (0x00007f66454d9000)
>         libibverbs.so.1 => /usr/lib64/libibverbs.so.1 (0x00007f66452c6000)
>         librt.so.1 => /lib64/librt.so.1 (0x00007f66450bd000)
>         liblimic2.so.0 => /usr/local/Cluster-Apps/limic2/0.5.6/lib/liblimic2.so.0 (0x00007f6644ebc000)
>         libifport.so.5 => /usr/local/Cluster-Apps/intel/fce/12.1.10.319/lib/intel64/libifport.so.5 (0x00007f6644d87000)
>         libifcore.so.5 => /usr/local/Cluster-Apps/intel/fce/12.1.10.319/lib/intel64/libifcore.so.5 (0x00007f6644b42000)
>         libimf.so => /usr/local/Cluster-Apps/intel/fce/12.1.10.319/lib/intel64/libimf.so (0x00007f6644777000)
>         libintlc.so.5 => /usr/local/Cluster-Apps/intel/fce/12.1.10.319/lib/intel64/libintlc.so.5 (0x00007f6644628000)
>         libsvml.so => /usr/local/Cluster-Apps/intel/fce/12.1.10.319/lib/intel64/libsvml.so (0x00007f6643ead000)
>         /lib64/ld-linux-x86-64.so.2 (0x00007f664a58a000)
>         libz.so.1 => /lib64/libz.so.1 (0x00007f6643c97000)
>         libnl.so.1 => /lib64/libnl.so.1 (0x00007f6643a44000)
> 
> 3) 
> 
> #!/bin/bash
> #SBATCH -J TESTING
> #SBATCH -A SUPPORT-GPU
> #SBATCH --qos=support-gpu
> #SBATCH --nodes=4
> #SBATCH --ntasks=8
> #SBATCH --time=04:00:00
> #SBATCH --no-requeue
> #SBATCH -p tesla
> 
> numnodes=$SLURM_JOB_NUM_NODES
> numtasks=$SLURM_NTASKS
> mpi_tasks_per_node=$(echo "$SLURM_TASKS_PER_NODE" | sed -e  's/^\([0-9][0-9]*\).*$/\1/')
> 
> . /etc/profile.d/modules.sh                # Leave this line (enables the module command)
> module purge
> module load default-wilkes
> module load slurm-test
> module unload intel/impi cuda intel/mkl intel/cce intel/fce intel/impi
> module load intel/fce/14.0.3.174
> module load intel/cce/14.0.3.174
> module load intel/mkl/11.1.3.174
> module load fs395/mvapich2/2.1a/intel
> 
> workdir="$SLURM_SUBMIT_DIR"
> cd $workdir
> 
> export OMP_NUM_THREADS=6
> 
> srun --cpu_bind=v,rank_ldom --mpi=pmi2 ./<my-app>
> 
> 
> What I expect is that each MPI rank is bound to a socket and its OpenMP threads (6 per rank) are confined to that socket. Instead, it looks like the binding is "by core", despite "--cpu_bind=v,rank_ldom" being specified. I noticed this because hwloc on a compute node tells me so...
> 
> [fs395 at tesla121 ~]$ hwloc-ps
> 4054    L2Cache:0               /scratch/fs395/QE-TESTS/20141015_SRUN/PW-AUSURF112-K_MV2/././pw
> 4055    L2Cache:7               /scratch/fs395/QE-TESTS/20141015_SRUN/PW-AUSURF112-K_MV2/././pw
> 
> What I would like to see is something like this:
> 
> [fs395 at tesla128 ~]$ hwloc-ps
> 40419   NUMANode:0              /scratch/fs395/QE-TESTS/20141015_SRUN/PW-AUSURF112-K_HPCX/././p
> 40420   NUMANode:1              /scratch/fs395/QE-TESTS/20141015_SRUN/PW-AUSURF112-K_HPCX/././p
> 
> 
> which is exactly what both Open MPI and Intel MPI do (and I am always using srun). Am I doing something wrong? Is something missing?

Hi Filippo.  Can you try adding the following to your batch script?

    export MV2_ENABLE_AFFINITY=0

MVAPICH2 enables CPU affinity by default and I think there is an interaction
with srun's --cpu_bind that is preventing it from doing what you're expecting.
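
Putting it together, the relevant lines of the batch script would then look
something like this (a minimal sketch that simply reuses the srun options and
OMP_NUM_THREADS setting from the script above; ./<my-app> is a placeholder as
in the original):

    # Disable MVAPICH2's internal affinity so srun's --cpu_bind controls placement
    export MV2_ENABLE_AFFINITY=0

    # 6 OpenMP threads per MPI task, as in the original script
    export OMP_NUM_THREADS=6

    # Bind each task to a NUMA locality domain (socket), with verbose reporting
    srun --cpu_bind=v,rank_ldom --mpi=pmi2 ./<my-app>

With MVAPICH2's own binding disabled, each rank should inherit the
locality-domain binding chosen by srun, and hwloc-ps should report NUMANode:N
entries as in the Open MPI / Intel MPI runs shown above.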

-- 
Jonathan Perkins
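
To check what binding each task actually receives, independent of the
application, one option is to run a trivial command under the same srun
binding options (an illustrative sketch, assuming taskset is available on the
compute nodes):

    # Print the CPU affinity list assigned to each task by srun
    srun --cpu_bind=v,rank_ldom bash -c \
        'echo "task ${SLURM_PROCID} on $(hostname): $(taskset -cp $$)"'

If the reported masks still cover only a single core per task after setting
MV2_ENABLE_AFFINITY=0, that would point at the srun/SLURM binding
configuration rather than at MVAPICH2.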

