[mvapich-discuss] Problems installing mvapich2/2.3 with Slurm
Peter Kjellström
cap at nsc.liu.se
Wed Mar 13 09:21:57 EDT 2019
On Tue, 12 Mar 2019 16:52:14 -0400
Raghu Reddy <raghu.reddy at noaa.gov> wrote:
...
> mpicc -o hello_c
> /tds_scratch3/SYSADMIN/nesccmgmt/Raghu.Reddy/Testsuite3/hello/hello_mpi_c.c
>
> mpiexec -np 24 ./hello_c
>
> s0014.110678hfi_wait_for_device: The /dev/hfi1_0 device failed to
> appear after 15.0 seconds: Connection timed out
The above message suggests that it is looking for an OPA (Omni-Path) device.
On a system with truescale (PSM) and the following relevant psm
packages installed:
$ rpm -qa | grep psm
infinipath-psm-devel-3.0.1-115.1015_open.2_nsc1.el6.x86_64
psmisc-22.6-24.el6.x86_64
infinipath-psm-3.0.1-115.1015_open.2_nsc1.el6.x86_64
NOTE: not psm2
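A quick way to tell which PSM flavour a node has is to classify the package list. This is only a minimal sketch: the function name classify_psm_pkgs is made up for illustration, and treating any package name containing "psm2" (for example libpsm2) as the PSM2/OPA stack is an assumption, not something taken from the listing above.

```shell
# Classify PSM-related package names read from stdin (one name per line).
# Prints one of: psm, psm2, both, none.
# classify_psm_pkgs is a hypothetical helper, not part of any package.
classify_psm_pkgs() {
    has_psm=no
    has_psm2=no
    while IFS= read -r pkg; do
        case "$pkg" in
            *psm2*)          has_psm2=yes ;;  # assumed PSM2 (OPA) naming
            infinipath-psm*) has_psm=yes ;;   # PSM naming as in the listing above
            *)               ;;               # unrelated, e.g. psmisc
        esac
    done
    if [ "$has_psm" = yes ] && [ "$has_psm2" = yes ]; then
        echo both
    elif [ "$has_psm2" = yes ]; then
        echo psm2
    elif [ "$has_psm" = yes ]; then
        echo psm
    else
        echo none
    fi
}

# Example use on an RPM-based node:
#   rpm -qa | grep psm | classify_psm_pkgs
```

On the system above this would report psm, since psmisc is ignored and no psm2 package is present.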
I did:
module load buildenv-intel/2018u1
wget http://.../mvapich2-2.3.1.tar.gz
tar xf mvapich2-2.3.1.tar.gz
cd mvapich2-2.3.1/
./configure --prefix=/home/cap/mpiinst/mvapich2-2.3.1_psm \
    --with-device=ch3:psm CC=icc CXX=icpc FC=ifort
make -j 8
make install
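To double-check that an install really was configured with the PSM device, one can look for the --with-device option in the recorded configure line. A rough sketch follows: the helper name uses_psm_device is hypothetical, and the commented usage assumes that the mpiname utility from the new install is first in PATH and that its -a output includes the configure options.

```shell
# Return success if the given configure line selects the ch3:psm device.
# uses_psm_device is a hypothetical helper for illustration only.
uses_psm_device() {
    case "$1" in
        *--with-device=ch3:psm*) return 0 ;;
        *)                       return 1 ;;
    esac
}

# Example use (assumes mpiname -a prints the configure options):
#   uses_psm_device "$(mpiname -a)" && echo "built with ch3:psm"
```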
It works fine with both mpiexec and mpirun in a Slurm job, using my
choice of hello world:
$ export PATH=/home/cap/mpiinst/mvapich2-2.3.1_psm/bin:$PATH
$ mpicc -o mdrbench_mvp.x mdrbench.c
# in slurm -N2 -n32 job shell
$ unset PSM_RANKS_PER_CONTEXT
$ mpirun ./mdrbench_mvp.x
CPU timing results: iter/us (rank0/mean): 161/161
Setting load to: 0%
1D dim geometry is: 32
...
/Peter K