[mvapich-discuss] Help with polled desc error

Matthew Koop koop at cse.ohio-state.edu
Thu Feb 21 15:02:16 EST 2008


Sylvain,

Thanks. The error is a race condition due to the freeing of resources
in the on-demand path (turning that off will also avoid the issue).
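
For reference, the on-demand connection path can be turned off at run time
by raising its threshold above the job size. Assuming your build exposes the
MV2_ON_DEMAND_THRESHOLD parameter, something like this should do it (the
process count is only an example):

mpiexec -n 256 -env MV2_ON_DEMAND_THRESHOLD 1024 ./a.out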

I've now checked a fix into the trunk and 1.0 branches.

Matt

On Thu, 21 Feb 2008, Sylvain Jeaugey wrote:

> Hi All,
>
> I worked a bit on this issue and found out that the crash was located in
> IB_ring_based_alltoall. Deactivating ring startup will definitely avoid
> the crash.
>
> I made the following patch and it seems to solve the bug:
>
> --- rdma_iba_priv.c.orig	2008-02-21 11:33:22.000000000 +0100
> +++ rdma_iba_priv.c	2008-02-19 16:00:46.000000000 +0100
> @@ -1258,6 +1258,7 @@
>           }
>
>           /*Now all send and recv finished*/
> +        PMI_Barrier();
>       }
>   }
>
> I increased MV2_DEFAULT_TIME_OUT to 24 to give myself some time before the
> error appears (when the timeout expires). I saw that the alltoall
> operation is leaving some processes behind, blocked in the function (10
> out of 256, say).
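>
> For reference, I set the timeout on the command line, along these lines
> (the process count is only an example):
>
> mpiexec -n 256 -env MV2_DEFAULT_TIME_OUT 24 ./a.out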
>
> So I suspect that the other processes destroyed something in their
> subsequent operations and prevented the blocked ones from completing the
> alltoall.
>
> Hope this helps,
>
> Sylvain
>
> On Tue, 12 Feb 2008, wei huang wrote:
>
> > Hi,
> >
> > We do not see anything abnormal in our local testing. To help us
> > locate the problem, could you please try the following:
> >
> > 1) Check that you have enough space in the /tmp directory.
> >
> > 2) Disable ring-based startup using:
> >
> > mpiexec -n N -env MV2_USE_RING_STARTUP 0 ./a.out
> >
> > 3) If this still fails, disable shared memory support using the runtime
> > variable MV2_USE_SHARED_MEM=0:
> >
> > mpiexec -n N -env MV2_USE_SHARED_MEM 0 ./a.out
> >
> > Thanks.
> >
> > Regards,
> > Wei Huang
> >
> > 774 Dreese Lab, 2015 Neil Ave,
> > Dept. of Computer Science and Engineering
> > Ohio State University
> > OH 43210
> > Tel: (614)292-8501
> >
> >
> > On Tue, 12 Feb 2008, Le Yan wrote:
> >
> >> Hi,
> >>
> >> We have the same problem here with MVAPICH2 1.0.1 on a Dell InfiniBand
> >> cluster. It has 8 cores per node and is running RHEL 4.5 (kernel
> >> 2.6.9-55). The OFED library version is 1.2.
> >>
> >> At first it seemed that any code compiled with MVAPICH2 1.0.1 failed at
> >> the MPI_Init stage when running with more than 128 procs. But later on
> >> we found that a code would only run if it didn't use all 8 processors
> >> on the same node (which explains why mpiGraph never fails: it
> >> uses only 1 processor per node). For example, a job running with 16
> >> nodes and 8 procs per node will fail, but one with 32 nodes and 4 procs
> >> per node will not.
> >>
> >> In addition, if the MALLOC_CHECK_ environment variable is set to 1, a
> >> number of errors like the following appear on standard error:
> >>
> >> 61: malloc: using debugging hooks
> >> 61: free(): invalid pointer 0x707000!
> >> 61: Fatal error in MPI_Init:
> >> 61: Other MPI error, error stack:
> >> 61: MPIR_Init_thread(259)..: Initialization failed
> >> 61: MPID_Init(102).........: channel initialization failed
> >> 61: MPIDI_CH3_Init(178)....:
> >> 61: MPIDI_CH3I_CM_Init(855): Error initializing MVAPICH2 malloc library
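> >>
> >> For reference, we enabled the malloc debugging hooks roughly like this
> >> (the process count is only an example):
> >>
> >> mpiexec -n 128 -env MALLOC_CHECK_ 1 ./a.out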
> >>
> >> I'm not quite sure what these messages mean, but it certainly looks
> >> like a memory issue?
> >>
> >> Both MVAPICH2 0.9.8 and MVAPICH 1.0beta work fine on the same system.
> >>
> >> Cheers,
> >> Le
> >>
> >>
> >> On Fri, 2008-02-08 at 22:02 -0800, Shao-Ching Huang wrote:
> >>> Hi
> >>>
> >>> No failures were found in these mpiGraph runs. It's just that there is
> >>> significant variation among the entries of the matrices, compared to
> >>> another IB cluster of ours.
> >>>
> >>> http://reynolds.turb.ucla.edu/~schuang/mpiGraph/
> >>>
> >>> Thanks.
> >>>
> >>> Shao-Ching
> >>>
> >>>
> >>> On Fri, Feb 01, 2008 at 08:43:19PM -0500, wei huang wrote:
> >>>> Hi,
> >>>>
> >>>> How often do you observe the failures when running the mpiGraph test? Do
> >>>> all the failures happen at startup, as with your simple program?
> >>>>
> >>>> Thanks.
> >>>>
> >>>> Regards,
> >>>> Wei Huang
> >>>>
> >>>> 774 Dreese Lab, 2015 Neil Ave,
> >>>> Dept. of Computer Science and Engineering
> >>>> Ohio State University
> >>>> OH 43210
> >>>> Tel: (614)292-8501
> >>>>
> >>>>
> >>>> On Fri, 1 Feb 2008, Shao-Ching Huang wrote:
> >>>>
> >>>>>
> >>>>> Hi Wei,
> >>>>>
> >>>>> We cleaned up a few things and re-ran the mpiGraph tests. The updated
> >>>>> results are posted here:
> >>>>>
> >>>>> http://reynolds.turb.ucla.edu/~schuang/mpiGraph/mpiGraph-8a.out_html/index.html
> >>>>> http://reynolds.turb.ucla.edu/~schuang/mpiGraph/mpiGraph-9a.out_html/index.html
> >>>>>
> >>>>> Please ignore results in my previous email. Thank you.
> >>>>>
> >>>>> Regards,
> >>>>> Shao-Ching
> >>>>>
> >>>>>
> >>>>> On Thu, Jan 31, 2008 at 08:35:41PM -0800, Shao-Ching Huang wrote:
> >>>>>>
> >>>>>> Hi Wei,
> >>>>>>
> >>>>>> We did two runs of the mpiGraph test you suggested on 48 nodes, with
> >>>>>> one (1) MPI process per node:
> >>>>>>
> >>>>>> mpiexec -np 48 ./mpiGraph 4096 10 10 >& mpiGraph.out
> >>>>>>
> >>>>>> The results from the two runs are posted here:
> >>>>>>
> >>>>>> http://reynolds.turb.ucla.edu/~schuang/mpiGraph/mpiGraph-1.out_html/
> >>>>>> http://reynolds.turb.ucla.edu/~schuang/mpiGraph/mpiGraph-2.out_html/
> >>>>>>
> >>>>>> During the tests, some other users were also running jobs on some of
> >>>>>> these 48 nodes.
> >>>>>>
> >>>>>> Could you please help us interpret these results, if possible?
> >>>>>>
> >>>>>> Thank you.
> >>>>>>
> >>>>>> Shao-Ching Huang
> >>>>>>
> >>>>>>
> >>>>>> On Thu, Jan 31, 2008 at 01:05:06PM -0500, wei huang wrote:
> >>>>>>> Hi Scott,
> >>>>>>>
> >>>>>>> We went up to 256 processes (32 nodes) and did not see the problem in a
> >>>>>>> few hundred runs (cpi). Thus, to narrow down the problem, we want to make
> >>>>>>> sure the fabric and system setup are OK. To diagnose this, we suggest
> >>>>>>> running the mpiGraph program from http://sourceforge.net/projects/mpigraph.
> >>>>>>> This test stresses the interconnect. It should fail at a much higher
> >>>>>>> frequency than the simple cpi program if there is a problem with your
> >>>>>>> system setup.
> >>>>>>>
> >>>>>>> Thanks.
> >>>>>>>
> >>>>>>> Regards,
> >>>>>>> Wei Huang
> >>>>>>>
> >>>>>>> 774 Dreese Lab, 2015 Neil Ave,
> >>>>>>> Dept. of Computer Science and Engineering
> >>>>>>> Ohio State University
> >>>>>>> OH 43210
> >>>>>>> Tel: (614)292-8501
> >>>>>>>
> >>>>>>>
> >>>>>>> On Wed, 30 Jan 2008, Scott A. Friedman wrote:
> >>>>>>>
> >>>>>>>> My co-worker passed this along...
> >>>>>>>>
> >>>>>>>> Yes, the error happens with the cpi.c program too. It happened in 2 of
> >>>>>>>> the 9 cases I ran.
> >>>>>>>>
> >>>>>>>> I was using 128 processes (on 32 4-core nodes).
> >>>>>>>>
> >>>>>>>> ---
> >>>>>>>>
> >>>>>>>> and another...
> >>>>>>>>
> >>>>>>>> It happens for a simple MPI program which just does MPI_Init and
> >>>>>>>> MPI_Finalize and prints out the number of processors. It happened for
> >>>>>>>> anything from 4 nodes (16 processors) and up.
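> >>>>>>>>
> >>>>>>>> For reference, the test was essentially the following sketch
> >>>>>>>> (reconstructed here, not the exact source):
> >>>>>>>>
> >>>>>>>> /* minimal reproducer sketch (reconstruction, not the exact source) */
> >>>>>>>> #include <stdio.h>
> >>>>>>>> #include <mpi.h>
> >>>>>>>>
> >>>>>>>> int main(int argc, char **argv)
> >>>>>>>> {
> >>>>>>>>     int size;
> >>>>>>>>     MPI_Init(&argc, &argv);               /* failure occurs here */
> >>>>>>>>     MPI_Comm_size(MPI_COMM_WORLD, &size);
> >>>>>>>>     printf("number of processors: %d\n", size);
> >>>>>>>>     MPI_Finalize();
> >>>>>>>>     return 0;
> >>>>>>>> }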
> >>>>>>>>
> >>>>>>>> What environment variables should we look for?
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Scott
> >>>>>>>>
> >>>>>>>> wei huang wrote:
> >>>>>>>>> Hi Scott,
> >>>>>>>>>
> >>>>>>>>> On how many processes (and how many nodes) did you run your program? Do
> >>>>>>>>> you have any environment variables set when running the program? Does
> >>>>>>>>> the error happen on a simple test like cpi?
> >>>>>>>>>
> >>>>>>>>> Thanks.
> >>>>>>>>>
> >>>>>>>>> Regards,
> >>>>>>>>> Wei Huang
> >>>>>>>>>
> >>>>>>>>> 774 Dreese Lab, 2015 Neil Ave,
> >>>>>>>>> Dept. of Computer Science and Engineering
> >>>>>>>>> Ohio State University
> >>>>>>>>> OH 43210
> >>>>>>>>> Tel: (614)292-8501
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Wed, 30 Jan 2008, Scott A. Friedman wrote:
> >>>>>>>>>
> >>>>>>>>>> The low level ibv tests work fine.
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>
> >>>
> >> --
> >> Le Yan
> >> User support
> >> Louisiana Optical Network Initiative (LONI)
> >> Office: 225-578-7524
> >> Fax: 225-578-6400
> >>
> >>
> >
> >
> >
>



