[mvapich-discuss] Weird "BAD TERMINATION" error when running with BLCR

Raghu rajachan at cse.ohio-state.edu
Wed Feb 5 12:55:33 EST 2014


Arjun,

I just tried this, and things seem fine:
http://pastebin.com/raw.php?i=DmHML9Ja

Are you able to successfully run an MPI job on the two nodes without BLCR
enabled?
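
It may also help to decode the "EXIT CODE: 11" in the output below. The
following is a minimal sketch in Python, assuming (this is an assumption, not
something the log confirms) that the launcher is printing a raw POSIX wait
status rather than a plain exit code. Under that assumption, 11 would mean
the process was killed by signal 11, which is SIGSEGV on Linux. The helper
name describe_wait_status is hypothetical and only for illustration.

```python
import os
import signal


def describe_wait_status(status: int) -> str:
    """Describe a raw POSIX wait status, as returned by os.wait()/os.waitpid().

    Assumption: the number being decoded really is a raw wait status.
    """
    if os.WIFSIGNALED(status):
        sig = os.WTERMSIG(status)
        try:
            name = signal.Signals(sig).name
        except ValueError:
            # Signal number has no symbolic name on this platform.
            name = "unknown signal"
        return f"killed by signal {sig} ({name})"
    if os.WIFEXITED(status):
        return f"exited normally with code {os.WEXITSTATUS(status)}"
    return f"unrecognized status {status}"


if __name__ == "__main__":
    # 11 is the value shown in the BAD TERMINATION banner.
    print(describe_wait_status(11))
```

If the value does turn out to be a segmentation fault, a core dump or a
debugger backtrace from the failing rank would be the next thing to look at.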


Raghu


On Wed, Feb 5, 2014 at 7:57 AM, Arjun J Rao <rectangle.king at gmail.com> wrote:

> I have two 12-core machines in my small cluster. I installed MVAPICH on
> both with the --enable-ckpt option. Both machines can log in to each other
> without a password. I also loaded the BLCR kernel module, and lsmod shows
> blcr as loaded.
> After compiling my "Hello World from many processes" MPI program and
> running it on one machine, I get the expected output. But when I run it on
> both machines, I get the following error:
>
>
>
> ===================================================================================
> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> =   EXIT CODE: 11
> =   CLEANING UP REMAINING PROCESSES
> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
> ===================================================================================
> [proxy:0:0 at abc3] HYD_pmcd_pmip_control_cmd_cb (pm/pmiserv/pmip_cb.c:902): assert (!closed) failed
> [proxy:0:0 at abc3] HYDT_dmxu_poll_wait_for_event (tools/demux/demux_poll.c:76): callback returned error status
> [proxy:0:0 at abc3] main (pm/pmiserv/pmip.c:206): demux engine error waiting for event
> [mpiexec at abc3] HYDT_bscu_wait_for_completion (tools/bootstrap/utils/bscu_wait.c:76): one of the processes terminated badly; aborting
> [mpiexec at abc3] HYDT_bsci_wait_for_completion (tools/bootstrap/src/bsci_wait.c:23): launcher returned error waiting for completion
> [mpiexec at abc3] HYD_pmci_wait_for_completion (pm/pmiserv/pmiserv_pmci.c:218): launcher returned error waiting for completion
> [mpiexec at abc3] main (ui/mpich/mpiexec.c:331): process manager error waiting for completion
>
>
> Frustrating.