[mvapich-discuss] change hosts to restart checkpoint by use of mvapich2 and blcr

Dhabaleswar Panda panda at cse.ohio-state.edu
Thu Mar 4 09:46:35 EST 2010


> I want to use mvapich2 and blcr for checkpointing, and I can now
> checkpoint successfully. However, I want to restart the checkpoint
> on other hosts. For example, I run an MPI program using mvapich2 on
> host1 and host2, and I save the checkpoint file to an NFS shared path.
> Then I want to restart the job (cr_restart context.pid) on host3 and
> host4. The 4 hosts have the same hardware and software. I cannot solve
> this problem. With openmpi, the problem is solved easily
> (ompi_restart -machinefile ma context.pid).

Thanks for your note. We investigated this issue. It looks like support
for restarting a checkpoint on a different set of hosts is broken. A fix
will be available in the next mvapich2 release. Thanks for reporting this.

DK


