[mvapich-discuss] hydra errors

Walid walid.shaari at gmail.com
Mon Jun 4 08:08:26 EDT 2012


Dear Jonathan,

It is MVAPICH2 1.7, compiled with the Intel 10 compilers for application
compatibility and system reasons. The configuration is as below:

 /usr/local/mpi/mvapich2/intel10/1.7/bin/mpiname -o
Configuration
--prefix=/usr/local/mpi/mvapich2/intel10/1.7 --with-device=ch3:psm
--enable-g=dbg --enable-romio --enable-debuginfo
-with-file-system=panfs+nfs+ufs --with-psm-include=/usr/include
--with-psm=/usr/lib64
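
For triage, the launcher messages quoted below could be tallied by host and
source location. The following is only a minimal sketch in Python, assuming the
.err files have been read into a single string; the function name
tally_hydra_errors and the regular expression are hypothetical helpers for
illustration and are not part of MVAPICH2 or hydra.

```python
import re
from collections import Counter

# Matches hydra launcher diagnostics such as:
#   [mpiexec@plci340] stdoe_cb (./ui/utils/uiu.c:309): assert (!closed) failed
# The list archive renders "@" as " at ", so both forms are accepted.
HYDRA_LINE = re.compile(
    r"\[mpiexec(?: at |@)(?P<host>[\w.-]+)\]\s+"
    r"(?P<func>\w+)\s+"
    r"\((?P<file>[^():]+):(?P<line>\d+)\)"
)


def tally_hydra_errors(text):
    """Count (host, function, file, line) tuples found in launcher output."""
    # Collapse whitespace so entries wrapped across lines still match.
    flat = " ".join(text.split())
    counts = Counter()
    for m in HYDRA_LINE.finditer(flat):
        counts[(m["host"], m["func"], m["file"], int(m["line"]))] += 1
    return counts
```

Sorting the resulting counter by host would show whether the failures cluster
on particular nodes or always start at the same source location.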

On 4 June 2012 02:05, Jonathan Perkins <perkinjo at cse.ohio-state.edu> wrote:

> Thanks for the report Walid.  Can you tell us the version of MVAPICH2
> being used and whether or not this is reproducible with the
> OSU-Micro-Benchmarks?  Providing the configuration options used to build
> MVAPICH2 as well as architecture information of the machines this is
> being run on may be helpful as well.
>
> On Sun, Jun 03, 2012 at 01:56:55PM +0300, Walid wrote:
> > Dear all,
> >
> > One of the users has reported that almost all of his jobs die when he
> > runs using mvapich2. Below are the error messages. He is using a simple call:
> >
> >     mpirun program program options > output file
> >
> > I have asked him to use --stdout=output file and mpiexec.hydra. He has not
> > come back to me yet on whether it was successful; however, I wanted to
> > see whether these errors have been seen before.
> >
> >
> > [mpiexec at plci340] stdoe_cb (./ui/utils/uiu.c:309): assert (!closed) failed
> > porgram.err:[mpiexec at plci340] control_cb (./pm/pmiserv/pmiserv_cb.c:306): error in the UI defined callback
> > H.err:[mpiexec at plci340] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
> > SW.err:[mpiexec at plci340] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:181): error waiting for event
> > SW.err:[mpiexec at plci340] main (./ui/mpich/mpiexec.c:405): process manager error waiting for completion
> >
> > Test.err:[mpiexec at ulca103] control_cb (./pm/pmiserv/pmiserv_cb.c:215): assert (!closed) failed
> > SW_Test.err:[mpiexec at ulca103] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
> > SW_Test.err:[mpiexec at ulca103] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:181): error waiting for event
> > _SW_Test.err:[mpiexec at ulca103] main (./ui/mpich/mpiexec.c:405): process manager error waiting for completion
> >
> > _2.err:[mpiexec at plch419] stdoe_cb (./ui/utils/uiu.c:309): assert (!closed) failed
> > VG_2.err:[mpiexec at plch419] control_cb (./pm/pmiserv/pmiserv_cb.c:306): error in the UI defined callback
> > _2.err:[mpiexec at plch419] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
> > VG_2.err:[mpiexec at plch419] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:181): error waiting for event
> > VG_2.err:[mpiexec at plch419] main (./ui/mpich/mpiexec.c:405): process manager error waiting for completion
> >
> > VG_CO_2.err:[mpiexec at plch374] stdoe_cb (./ui/utils/uiu.c:309): assert (!closed) failed
> > CO_2.err:[mpiexec at plch374] control_cb (./pm/pmiserv/pmiserv_cb.c:306): error in the UI defined callback
> >
> >  ion available (required by /red/ct2/GP/GP_plch.exe)
> >
> > [mpiexec at plch416] stdoe_cb (./ui/utils/uiu.c:309): assert (!closed) failed
> > [mpiexec at plch416] control_cb (./pm/pmiserv/pmiserv_cb.c:306): error in the UI defined callback
> > [mpiexec at plch416] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
> > [mpiexec at plch416] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:181): error waiting for event
> > [mpiexec at plch416] main (./ui/mpich/mpiexec.c:405): process manager error waiting for completion
> >
> >
> > thank you,
> >
> > Walid
>
> > _______________________________________________
> > mvapich-discuss mailing list
> > mvapich-discuss at cse.ohio-state.edu
> > http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>
>
> --
> Jonathan Perkins
> http://www.cse.ohio-state.edu/~perkinjo
>