[mvapich-discuss] BLCR checkpoint support
Sourav Chakraborty
chakraborty.52 at buckeyemail.osu.edu
Mon Aug 10 12:45:57 EDT 2015
Hi Maksym,
Thanks for providing the mpiname output.
Regarding your query: by default, MVAPICH2 uses high-performance
shared-memory channels for intra-node communication instead of TCP/IP
sockets, and it automatically re-establishes these intra-node channels
after a restart. We are investigating the issue and will get back to you
soon.
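
If you would like to confirm which intra-node channel is actually in
use, one option (assuming your build exposes these parameters; both are
described in the MVAPICH2 user guide) is to have the library print its
runtime parameters, e.g.:

$ MV2_SHOW_ENV_INFO=1 ~/opt/bin/mpiexec -np 4 ./bin/lu.C.4

Shared memory for intra-node transfers is controlled by
MV2_USE_SHARED_MEM, which is enabled by default.
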
Thanks,
Sourav
On Mon, Aug 10, 2015 at 8:53 AM, Maksym Planeta
<mplaneta at os.inf.tu-dresden.de> wrote:
> Thank you for your reply; the output of mpiname is as follows:
> $ ~/opt/bin/mpiname -a
> MVAPICH2 2.1 Fri Apr 03 20:00:00 EDT 2015 ch3:mrail
>
> Compilation
> CC: gcc -DNDEBUG -DNVALGRIND -O2
> CXX: g++ -DNDEBUG -DNVALGRIND -O2
> F77: gfortran -L/lib -L/lib -O2
> FC: gfortran -O2
>
> Configuration
> --prefix=/home/planeta/opt --enable-fortran=all --enable-ckpt
>
> I was thinking about this issue and have the following hypothesis. I run
> this benchmark on a single machine whose only network interface is an
> Ethernet card, so the ranks most likely use sockets to communicate with
> each other. The BLCR manual says that it does not record socket state,
> and that the application itself must restore socket-based connections. I
> strongly doubt that the NAS benchmarks attempt this (they appear to be
> unaware of BLCR). This would mean that when a rank process is restored,
> its socket connections are broken, and hence the computation cannot
> continue. Could this be the case?
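>
> As I read the BLCR documentation, an application that wants to survive
> this would have to use the libcr callback interface to close its
> sockets before a checkpoint and re-establish them afterwards. A minimal
> sketch of what I mean (the socket handling itself is hypothetical;
> compile and link with -lcr):
>
> #include <libcr.h>
> #include <stdio.h>
>
> /* Callback that BLCR invokes around every checkpoint request. */
> static int socket_cb(void *arg)
> {
>     /* ...close socket connections here... */
>     int rc = cr_checkpoint(CR_CHECKPOINT_READY);
>     if (rc > 0) {
>         /* Restarted from the checkpoint file: reconnect sockets. */
>     } else if (rc == 0) {
>         /* Checkpoint written; the original process continues. */
>     }
>     return 0;
> }
>
> int main(void)
> {
>     if (cr_init() < 0) {
>         fprintf(stderr, "cr_init failed\n");
>         return 1;
>     }
>     cr_register_callback(socket_cb, NULL, CR_THREAD_CONTEXT);
>     /* ...application work; checkpoints may arrive at any time... */
>     return 0;
> }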
>
> And if so, could you suggest an application or, preferably, a benchmark
> that is known to work with BLCR or another checkpoint/restart or
> migration framework?
>
>
> On 08/10/2015 05:20 PM, Sourav Chakraborty wrote:
>
>> Hi Maksym,
>>
>> Thanks for the note. We are investigating the issue. Can you please let
>> us know the version and configuration of MVAPICH2 you are using? The
>> output from mpiname -a would be helpful.
>>
>> Thanks,
>> Sourav Chakraborty
>>
>>
>> On Mon, Aug 10, 2015 at 5:11 AM, Maksym Planeta
>> <mplaneta at os.inf.tu-dresden.de> wrote:
>>
>> Hello,
>>
>> I'm trying to find out whether BLCR still works with MVAPICH2. I
>> have installed Debian on my 4-core machine. To test how checkpointing
>> works, I compiled the LU benchmark from the NAS Parallel Benchmarks
>> suite. Unfortunately, the application always fails while taking
>> checkpoints: I see only the first checkpoint created, and I did not
>> manage to restart the application from that checkpoint.
>>
>> Could you please confirm that the checkpoint/restart mechanism is
>> still supposed to work, given that the latest BLCR release dates
>> from 2013?
>>
>> And if so, could you please tell me what I am doing wrong?
>>
>> The details:
>>
>> # uname -a
>> Linux planeta7 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt11-1+deb8u2
>> (2015-07-17) x86_64 GNU/Linux
>>
>> BLCR version: blcr-0.8.6~b3 (the version available in the Debian
>> experimental repository)
>>
>> NPB application: lu class C nprocs 4
>>
>> MVAPICH2 is compiled from source and installed in the ~/opt
>> directory.
>>
>> How I start the application:
>>
>> $ MV2_CKPT_NO_SYNC=1 ~/opt/bin/mpiexec -np 4 -verbose
>> -ckpoint-interval 120 -ckpoint-prefix /tmp/chkpt/ ./bin/lu.C.4
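>>
>> My restart attempts looked roughly like this, following the MPICH
>> hydra documentation (I am assuming MVAPICH2's mpiexec accepts the
>> same -ckpoint-num option to select which checkpoint to resume from;
>> the 1 is just an example):
>>
>> $ ~/opt/bin/mpiexec -np 4 -ckpoint-prefix /tmp/chkpt/
>> -ckpoint-num 1 ./bin/lu.C.4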
>>
>> How it fails (note that after the first checkpoint completes, the
>> time steps no longer advance):
>>
>> Time step 80
>> [proxy:0:0 at planeta7] requesting checkpoint
>> [proxy:0:0 at planeta7] checkpoint completed
>> [proxy:0:0 at planeta7] requesting checkpoint
>> [proxy:0:0 at planeta7] checkpoint completed
>> [proxy:0:0 at planeta7] requesting checkpoint
>> [proxy:0:0 at planeta7] HYDT_ckpoint_checkpoint
>> (tools/ckpoint/ckpoint.c:115): Previous checkpoint has not
>> completed.[proxy:0:0 at planeta7] HYD_pmcd_pmip_control_cmd_cb
>> (pm/pmiserv/pmip_cb.c:931): checkpoint suspend failed
>> [proxy:0:0 at planeta7] HYDT_dmxu_poll_wait_for_event
>> (tools/demux/demux_poll.c:76): callback returned error status
>> [proxy:0:0 at planeta7] main (pm/pmiserv/pmip.c:206): demux engine
>> error waiting for event
>> [mpiexec at planeta7] control_cb (pm/pmiserv/pmiserv_cb.c:200): assert
>> (!closed) failed
>> [mpiexec at planeta7] HYDT_dmxu_poll_wait_for_event
>> (tools/demux/demux_poll.c:76): callback returned error status
>> [mpiexec at planeta7] HYD_pmci_wait_for_completion
>> (pm/pmiserv/pmiserv_pmci.c:198): error waiting for event
>> [mpiexec at planeta7] main (ui/mpich/mpiexec.c:344): process manager
>> error waiting for completion
>>
>> CPU model: Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz, 4 cores
>>
>> dmesg output:
>>
>> # dmesg -c
>> [425658.981022] blcr: warning: skipped a socket.
>> [425658.981031] blcr: warning: skipped a socket.
>> [425658.981036] blcr: warning: skipped a socket.
>> [425658.981063] blcr: warning: skipped a socket.
>> [425660.135281] blcr: warning: skipped a socket.
>> [425661.269592] blcr: warning: skipped a socket.
>> [425662.698799] blcr: warning: skipped a socket.
>> [425896.865836] blcr: chkpt_watchdog: 'lu.C.4' (tgid/pid 2589/2589)
>> exited with signal 9 during checkpoint
>> [425896.865839] blcr: chkpt_watchdog: 'lu.C.4' (tgid/pid 2589/2599)
>> exited with signal 9 during checkpoint
>> [425896.865841] blcr: chkpt_watchdog: 'lu.C.4' (tgid/pid 2591/2591)
>> exited with signal 9 during checkpoint
>> [425896.865842] blcr: chkpt_watchdog: 'lu.C.4' (tgid/pid 2591/2598)
>> exited with signal 9 during checkpoint
>> [425896.881038] blcr: warning: skipped a socket.
>> [425896.881042] blcr: warning: skipped a socket.
>> [425896.881043] blcr: warning: skipped a socket.
>> [425896.881051] blcr: warning: skipped a socket.
>> [425896.881149] blcr: cr_freeze_threads failed (-4)
>> [425898.414107] blcr: warning: skipped a socket.
>>
>> Complete log of mpiexec:
>>
>> http://paste.debian.net/290916/
>>
>> I also tried running the application with mpiexec.mpirun_rsh, but
>> the behavior was very similar. Please let me know if it would be
>> helpful to share those results as well.
>>
>> --
>> With best regards,
>> Maksym Planeta.