[mvapich-discuss] change hosts to restart checkpoint by use of mvapich2 and blcr
Dhabaleswar Panda
panda at cse.ohio-state.edu
Thu Mar 4 09:46:35 EST 2010
> I want to use mvapich2 and blcr for checkpointing, and I can now
> checkpoint successfully. However, I want to restart from the checkpoint
> on other hosts. For example, I run an MPI program using mvapich2 on
> host1 and host2, and I save the checkpoint file to an NFS shared path.
> Then I want to restart the job (cr_restart context.pid) on host3 and
> host4. The 4 hosts have the same hardware and software. I cannot solve
> this problem. If I use openmpi to checkpoint, the problem can be solved
> easily (ompi_restart -machinefile ma context.pid).
Thanks for your note. We investigated this issue, and it looks like support
for restarting on a different set of hosts is broken. A fix will be available
in the next mvapich2 release. Thanks for reporting this.
DK