[mvapich-discuss] Migration for MPI processes

wei huang huanwei at cse.ohio-state.edu
Sun Apr 6 10:19:22 EDT 2008


Hi Maya,

This is doable; this functionality is currently supported in mvapich2.
Naturally, for it to work your checkpoint images must be readable from
the new nodes. Please let us know if you run into any issues.
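
As a rough sanity check before restarting (a sketch only, not part of
mvapich2 or BLCR), something like the following Python snippet could be
run on each target node to confirm that the checkpoint files are visible
and readable there. The function name and the example path are
hypothetical; substitute the directory where your checkpoint files are
actually written.

```python
import os
from pathlib import Path


def unreadable_files(directory, pattern="*"):
    """Return files matching `pattern` in `directory` that this node cannot read.

    `directory` is assumed to be wherever the checkpoint images were
    written, e.g. a path on a shared file system such as GPFS.
    """
    root = Path(directory)
    if not root.is_dir():
        # The directory itself is not visible from this node.
        raise FileNotFoundError(f"{directory} is not a directory on this node")
    return [
        path
        for path in sorted(root.glob(pattern))
        if path.is_file() and not os.access(path, os.R_OK)
    ]


# Hypothetical usage on a target node:
#   problems = unreadable_files("/path/to/checkpoints")
#   if problems: print("cannot read:", problems)
```

An empty result only means the files can be read by the current user on
that node; it does not by itself guarantee that the restart will succeed.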

Thanks.

Regards,
Wei Huang

774 Dreese Lab, 2015 Neil Ave,
Dept. of Computer Science and Engineering
Ohio State University
OH 43210
Tel: (614)292-8501


On Sun, 6 Apr 2008, Maya Khaliullina wrote:

> Hi all,
>
> We have an infiniband cluster:
> Node: 2xQuad Core Intel Xeon 2.33 GHz
> O/S: RHEL4.5
> File System: GPFS
> We are using MVAPICH2-1.0.2p1 with BLCR-0.6.5.
> At the moment we have no problems with C/R (everything works fine).
>
> I wonder whether an MPI job can be restarted from a checkpoint on a
> different subset of nodes,
> i.e. can MPI processes be migrated from one node to another?
> If not, will you support this capability in the future? Thanks.
>
> Maya
>


