[mvapich-discuss] Mvapich fault tolerance features

Rui Wang wangraying at gmail.com
Mon Feb 6 03:02:58 EST 2012


Hi all,

I saw on the website that the latest version, MVAPICH2 1.8, incorporates
MPICH2 1.4.1p1. As far as I know, MPICH2 1.4 supports some fault tolerance
features: if I kill one process of a task, the whole task does not abort
and none of the communication operations hang. I have run some experiments
that verify these features. However, when I ran the same experiments with
MVAPICH2 1.8, these features did not appear to be supported.

 

I'm writing to ask whether there are plans for MVAPICH2 to support these
features and, if so, how soon. It seems that user-driven fault tolerance is
becoming an alternative approach to addressing fault tolerance in the future.

 

Thanks,

Rui
