[mvapich-discuss] mvapich2+slurm+blcr Oh, My!
David Brown
dmlb2000 at gmail.com
Wed Jun 23 15:45:18 EDT 2010
I'm not sure who to ask or which part of my setup is wrong, but
something isn't working right.
I built mvapich2 1.5rc2 this way:
./configure \
--prefix=%{mpidir} \
--mandir=%{mpidir}/man \
--enable-error-checking=runtime \
--enable-timing=none \
--enable-g=mem,dbg,meminit \
--enable-sharedlibs=gcc \
--with-rdma=gen2 \
--enable-romio \
--with-file-system=lustre+nfs \
--with-slurm=/usr \
--with-pmi=slurm \
--with-pm=no \
--enable-threads=multiple \
--with-thread-package=pthreads \
--disable-mpe \
--without-mpe \
--disable-nmpi-as-mpi \
--enable-f77 \
--enable-f90 \
--enable-cxx \
--enable-blcr
I built slurm this way:
rpmbuild -ta --with blcr --with postgresql slurm-2.1.9.tar.bz2
I configured slurm with:
CheckpointType=checkpoint/blcr
And I built IOR and launched it this way:
$ srun --checkpoint-dir=/lustre -n 8 -N 4 ./IOR -i 4 -b 32g -T 600 -E \
    -k -e -t 1m -o /lustre/testFile
[Rank 1][cr.c: line 781]MV2_CKPT_MPD_BASE_PORT is not set
[Rank 3][cr.c: line 781]MV2_CKPT_MPD_BASE_PORT is not set
[Rank 0][cr.c: line 781]MV2_CKPT_MPD_BASE_PORT is not set
[Rank 2][cr.c: line 781]MV2_CKPT_MPD_BASE_PORT is not set
[Rank 5][cr.c: line 781]MV2_CKPT_MPD_BASE_PORT is not set
[Rank 4][cr.c: line 781]MV2_CKPT_MPD_BASE_PORT is not set
[Rank 7][cr.c: line 781]MV2_CKPT_MPD_BASE_PORT is not set
[Rank 6][cr.c: line 781]MV2_CKPT_MPD_BASE_PORT is not set
srun: error: x11: tasks 2-3: Exited with exit code 255
srun: error: x10: tasks 0-1: Exited with exit code 255
srun: error: x12: tasks 4-5: Exited with exit code 255
srun: error: x13: tasks 6-7: Exited with exit code 255
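A quick way to confirm whether the variable named in the error actually
reaches the task environment is a small shell check. This is only a sketch:
report_var is a hypothetical helper, not part of MVAPICH2 or SLURM, and it
says nothing about what value the variable is supposed to have.

```shell
# Hypothetical helper (not from MVAPICH2 or SLURM): report whether the
# named variable is present in the process environment.
report_var() {
    # printenv exits non-zero when the variable is not in the environment
    if value=$(printenv "$1"); then
        printf '%s is set to %s\n' "$1" "$value"
    else
        printf '%s is not set\n' "$1"
    fi
}

# The variable name is taken from the cr.c error message above.
report_var MV2_CKPT_MPD_BASE_PORT
```

Saved as a script (say check_env.sh, a made-up name) and launched with the
same srun -n 8 -N 4 options in place of ./IOR, it would print one line per
task showing what each rank sees.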
So I'm confused; this is obviously not working. Do I need to use some
other mechanism to launch? Am I missing configuration somewhere?
Thanks,
- David Brown