[mvapich-discuss] hydra errors

Jonathan Perkins perkinjo at cse.ohio-state.edu
Sun Jun 3 19:05:45 EDT 2012


Thanks for the report, Walid.  Can you tell us which version of MVAPICH2
is being used and whether this is reproducible with the
OSU Micro-Benchmarks?  The configuration options used to build MVAPICH2
and the architecture of the machines it is being run on would be
helpful as well.
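
In case it saves a round trip, here is a rough sketch of one way to
gather that information on a compute node.  It assumes the mpiname
utility that ships with MVAPICH2 is on the PATH (it prints a note if it
is not); the report() function name and the section labels are only
illustrative.

```shell
#!/bin/sh
# Sketch: print the details requested above as one report.
# Assumption: mpiname (installed with MVAPICH2) is on PATH.
report() {
    echo "== architecture =="
    uname -m
    echo "== kernel =="
    uname -sr
    echo "== MVAPICH2 version and configure options =="
    if command -v mpiname >/dev/null 2>&1; then
        # -a prints the version, release date and configure options
        mpiname -a || echo "mpiname -a failed"
    else
        echo "mpiname not found on PATH"
    fi
}

report
```

To check reproducibility, the user's own mpirun line could then be
rerun with one of the OSU benchmarks (for example osu_latency) in place
of his program, wherever those are installed on your systems.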

On Sun, Jun 03, 2012 at 01:56:55PM +0300, Walid wrote:
> Dear all,
> 
> One of the users has reported that almost all of his jobs die when he runs
> them using mvapich2. The error messages are below. He is using a simple call:
> 
>     mpirun program program options  > output file
> 
> I have asked him to use --stdout=output file and mpiexec.hydra. He has not
> come back to me yet on whether that was successful, but I wanted to
> see whether these errors have been seen before.
> 
> 
> [mpiexec at plci340] stdoe_cb (./ui/utils/uiu.c:309): assert (!closed) failed
> 
> porgram.err:[mpiexec at plci340] control_cb (./pm/pmiserv/pmiserv_cb.c:306):
> error in the UI defined callback
> 
> H.err:[mpiexec at plci340] HYDT_dmxu_poll_wait_for_event
> (./tools/demux/demux_poll.c:77): callback returned error status
> SW.err:[mpiexec at plci340] HYD_pmci_wait_for_completion
> (./pm/pmiserv/pmiserv_pmci.c:181): error waiting for event
> SW.err:[mpiexec at plci340] main (./ui/mpich/mpiexec.c:405): process manager
> error waiting for completion
> 
> Test.err:[mpiexec at ulca103] control_cb (./pm/pmiserv/pmiserv_cb.c:215):
> assert (!closed) failed
> 
> SW_Test.err:[mpiexec at ulca103] HYDT_dmxu_poll_wait_for_event
> (./tools/demux/demux_poll.c:77): callback returned error status
> 
> SW_Test.err:[mpiexec at ulca103] HYD_pmci_wait_for_completion
> (./pm/pmiserv/pmiserv_pmci.c:181): error waiting for event
> 
> _SW_Test.err:[mpiexec at ulca103] main (./ui/mpich/mpiexec.c:405): process
> manager error waiting for completion
> 
> _2.err:[mpiexec at plch419] stdoe_cb (./ui/utils/uiu.c:309): assert (!closed)
> failed
> 
> VG_2.err:[mpiexec at plch419] control_cb (./pm/pmiserv/pmiserv_cb.c:306):
> error in the UI defined callback
> 
> _2.err:[mpiexec at plch419] HYDT_dmxu_poll_wait_for_event
> (./tools/demux/demux_poll.c:77): callback returned error status
> 
> VG_2.err:[mpiexec at plch419] HYD_pmci_wait_for_completion
> (./pm/pmiserv/pmiserv_pmci.c:181): error waiting for event
> 
> VG_2.err:[mpiexec at plch419] main (./ui/mpich/mpiexec.c:405): process manager
> error waiting for completion
> 
> VG_CO_2.err:[mpiexec at plch374] stdoe_cb (./ui/utils/uiu.c:309): assert
> (!closed) failed
> 
> CO_2.err:[mpiexec at plch374] control_cb (./pm/pmiserv/pmiserv_cb.c:306):
> error in the UI defined callback
> 
>  ion available (required by /red/ct2/GP/GP_plch.exe)
> 
> [mpiexec at plch416] stdoe_cb (./ui/utils/uiu.c:309): assert (!closed) failed
> [mpiexec at plch416] control_cb (./pm/pmiserv/pmiserv_cb.c:306): error in the
> UI defined callback
> [mpiexec at plch416] HYDT_dmxu_poll_wait_for_event
> (./tools/demux/demux_poll.c:77): callback returned error status
> [mpiexec at plch416] HYD_pmci_wait_for_completion
> (./pm/pmiserv/pmiserv_pmci.c:181): error waiting for event
> [mpiexec at plch416] main (./ui/mpich/mpiexec.c:405): process manager error
> waiting for completion
> 
> 
> thank you,
> 
> Walid



-- 
Jonathan Perkins
http://www.cse.ohio-state.edu/~perkinjo

