[mvapich-discuss] [BUG] Wrong error message with BLCR
Maksym Planeta
mplaneta at os.inf.tu-dresden.de
Tue Aug 18 12:34:29 EDT 2015
Hello,
I found another bug related to BLCR. When checkpoint support is not compiled
in but the user still passes checkpoint-related arguments, hydra
prints the following error message when checkpoint creation is attempted:
$ mpiexec -ckpoint-interval 10 -ckpoint-prefix /home/planeta/opt/chkpt -np 4 -hosts 172.31.128.50,172.31.128.51 ./bin/lu.B.4
...
[proxy:0:1 at os-dhcp017] requesting checkpoint
[proxy:0:1 at os-dhcp017] checkpoint completed
[proxy:0:0 at planeta-ib1] requesting checkpoint
[proxy:0:0 at planeta-ib1] checkpoint completed
[proxy:0:1 at os-dhcp017] HYDT_ckpoint_blcr_checkpoint (tools/ckpoint/blcr/ckpoint_blcr.c:241): Checkpointing failed. Make sure BLCR kernel module is loaded. Unknown error 2356
[proxy:0:1 at os-dhcp017] ckpoint_thread (tools/ckpoint/ckpoint.c:76): blcr checkpoint returned error
[proxy:0:0 at planeta-ib1] HYDT_ckpoint_blcr_checkpoint (tools/ckpoint/blcr/ckpoint_blcr.c:241): Checkpointing failed. Make sure BLCR kernel module is loaded. Unknown error 2356
[proxy:0:0 at planeta-ib1] ckpoint_thread (tools/ckpoint/ckpoint.c:76): blcr checkpoint returned error
Hydra complains that the BLCR kernel module is not loaded, but it
should instead complain either about the unsupported arguments or about
the absence of checkpoint support.
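To illustrate the kind of early check meant here, below is a minimal sketch in C. It is not Hydra's actual code: the function names `is_ckpoint_option` and `validate_ckpoint_args` and the `have_ckpoint` flag are hypothetical, and the only details taken from this report are the two option names from the command line above. The assumption is that the launcher knows at build time whether checkpoint support was compiled in and can therefore reject these options while parsing arguments, before any checkpoint is attempted.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if arg is one of the two checkpoint
 * options used in the command line of this report, 0 otherwise. */
int is_ckpoint_option(const char *arg)
{
    return strcmp(arg, "-ckpoint-interval") == 0 ||
           strcmp(arg, "-ckpoint-prefix") == 0;
}

/* Hypothetical validator: returns 0 if the command line is acceptable,
 * -1 if a checkpoint option is given although checkpoint support is
 * missing. have_ckpoint stands in for a build-time setting (assumed).
 * For brevity this scans every argument; a real launcher would stop
 * at the name of the executable. */
int validate_ckpoint_args(int argc, char **argv, int have_ckpoint)
{
    for (int i = 1; i < argc; i++) {
        if (is_ckpoint_option(argv[i]) && !have_ckpoint) {
            fprintf(stderr,
                    "%s: option %s requires checkpoint support, "
                    "which was not compiled in\n",
                    argv[0], argv[i]);
            return -1;
        }
    }
    return 0;
}
```

Calling `validate_ckpoint_args(argc, argv, 0)` from `main` with the command line above would return -1 and name the offending option right away, instead of failing later with a message about the BLCR kernel module.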
$ mpiname -a
MVAPICH2 2.1 Fri Apr 03 20:00:00 EDT 2015 ch3:nemesis
Compilation
CC: gcc -DNDEBUG -DNVALGRIND -O2
CXX: g++ -DNDEBUG -DNVALGRIND -O2
F77: gfortran -O2
FC: gfortran -O2
Configuration
--prefix=/home/planeta/opt/apps/mvapich/2.1 --enable-fortran=all
--with-device=ch3:nemesis:ib
--
Regards,
Maksym Planeta