[mvapich-discuss] Odd behavior with memory registration / dreg / MVAPICH and MVAPICH2

Dhabaleswar Panda panda at cse.ohio-state.edu
Fri Jan 4 13:23:20 EST 2008


Hi Eric,

Thanks for your suggestions. We will make these changes to the vapi and
vapi_multirail devices and add the information to the user guides too.

Thanks,

DK

> Lei,
>
> Thanks for the information. I would suggest that, if this can't be
> fixed in the vapi version, then the LAZY_MEM_UNREGISTER define should
> be removed from the default compile options for the versions where it
> is (apparently) not fully supported.
>
> This is a very nasty bug. The MPI layer reports back no errors, but
> the data isn't actually transferred successfully. In addition, it
> presents as a timing / waiting error to the user, as all of the local
> (shared mem) peers transfer data successfully, so significant time can
> be spent chasing down a suspected user oversight for what is actually
> an error within the MPI layer.
>
> This would apply to both MVAPICH and MVAPICH2, in the vapi and
> vapi_multirail makefiles.
>
> In addition, it should be documented that the LAZY_MEM_UNREGISTER
> switch is NOT compatible with vapi-based channels.
>
> Thanks,
>  Eric
>
> On Dec 21, 2007 5:29 PM, LEI CHAI <chai.15 at osu.edu> wrote:
> > Hi Eric,
> >
> > Thanks for using mvapich/mvapich2. The problem you reported can be solved by using the PTMALLOC feature, which is supported by the gen2 device but not by vapi/vapi_multirail. Not many features have been added to the vapi/vapi_multirail devices over the last few releases because few people use them. Since you cannot move to gen2, we would suggest that you disable LAZY_MEM_UNREGISTER for your tests.
> >
> > Thanks,
> > Lei
> >
> >
> >
> > ----- Original Message -----
> > From: "Eric A. Borisch" <eborisch at ieee.org>
> > Date: Friday, December 21, 2007 10:23 am
> > Subject: [mvapich-discuss] Odd behavior with memory registration / dreg / MVAPICH and MVAPICH2
> >
> > > I seem to be running into a memory registration issue.
> > >
> > > Observations:
> > >
> > > 1) During some transfers (MPI_Isend / MPI_Irecv / MPI_Waitall)
> > > into a local buffer on the root rank, I receive all of the data from any
> > > ranks that are running on the same machine, but only part (or none at
> > > all) of the data from ranks running on external machines. The transfer
> > > length is above the eager/rendezvous threshold.
> > > 2) Once the problem occurs, it is persistent. However, if I force
> > > MVAPICH to re-register by calling "while(dreg_evict())" at this point
> > > and then re-transfer, the correct data is received. (Same memory being
> > > transferred from / to.)
> > > 3) I've only witnessed problems occurring above the 4G (as
> > > returned by malloc()) memory range.
> > > 4) When I receive partial data from ranks, the data ends on a (4k)
> > > page bound. Data past this bound (which should have been updated) is
> > > unchanged during the transfer, yet both the sender and receiver report
> > > no errors. (This is very bad!)
> > > 5) Stepping through the code on both ends of the transfer shows the
> > > software agreeing on the (correct) length and location as far down as
> > > I can follow it.
> > > 6) Running against a compilation with no -DLAZY_MEM_UNREGISTER shows
> > > no issues. (Other than the expected performance hit.)
> > > 7) Occurs on both MVAPICH-1.0-beta (vapi_multirail) and
> > > mvapich2-1.0 (vapi)
> > > 8) The user code is also sending data out (from a different buffer)
> > > over ethernet to a remote gui from the root node.
> > >
> > > I can't move to gen2 at this point -- we are using a vendor library
> > > for interfacing to another system, and this library uses VAPI.
> > >
> > > uname -a output:
> > > Linux rt2.mayo.edu 2.6.9-42.0.2.ELsmp #1 SMP Wed Aug 23 13:38:27 BST
> > > 2006 x86_64 x86_64 x86_64 GNU/Linux
> > >
> > > Intel SE7520JR2 motherboards. 4G physical ram on each node.
> > >
> > > It appears (perhaps this is obvious) that the assumption that memory
> > > registered (by the dreg.c code) remains registered until explicitly
> > > unregistered (again, by the dreg.c code) is being violated in some
> > > way. This, however, is wading into uncharted (for me, at least) Linux
> > > memory management waters. The user code is doing nothing to fiddle
> > > with registration in any explicit way. (With the exception of what is
> > > mentioned in (2).)
> > >
> > > Please let me know what other information I can provide to resolve
> > > this. I'm still trying to put together a small test program to cause
> > > the problem, but have been unsuccessful so far.
> > >
> > > Thanks,
> > > Eric
> > > --
> > > Eric A. Borisch
> > > eborisch at ieee.org
> > > _______________________________________________
> > > mvapich-discuss mailing list
> > > mvapich-discuss at cse.ohio-state.edu
> > > http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
> > >
> >
> >
>
>
>
> --
> Eric A. Borisch
> eborisch at ieee.org
>
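
Observation (4) in the quoted report (partial data that stops on a 4k page
bound while neither side reports an error) can be checked mechanically in a
test program. The following is a minimal sketch in plain C, not code from
MVAPICH: the receive buffer is pre-filled with a marker byte before the
receive is posted, and afterwards the trailing run of untouched marker bytes
is measured. The helper names, the 0xA5 marker value, and the fixed 4096-byte
page size are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed values, chosen for illustration; neither comes from MVAPICH. */
#define SENTINEL_BYTE     0xA5u
#define ASSUMED_PAGE_SIZE 4096u

/* Pre-fill the receive buffer so untouched regions are recognizable later. */
static void fill_sentinel(void *buf, size_t len)
{
    assert(buf != NULL);
    memset(buf, SENTINEL_BYTE, len);
}

/*
 * Return the offset at which the trailing run of sentinel bytes begins.
 * A result equal to len means the last byte was overwritten; a smaller
 * result means the tail of the buffer still holds the marker.
 */
static size_t stale_tail_offset(const void *buf, size_t len)
{
    const unsigned char *p = (const unsigned char *)buf;
    size_t i = len;

    while (i > 0 && p[i - 1] == SENTINEL_BYTE)
        i--;
    return i;
}

/* True if the address buf + off falls on an (assumed) 4k page boundary. */
static int ends_on_page_bound(const void *buf, size_t off)
{
    return (((uintptr_t)buf + off) % ASSUMED_PAGE_SIZE) == 0;
}

/* True if the pointer value lies at or above the 4G mark. */
static int above_4g(const void *p)
{
    return (uint64_t)(uintptr_t)p >= (UINT64_C(1) << 32);
}
```

In a reproducer one might call fill_sentinel() on the receive buffer before
posting MPI_Irecv, then compare stale_tail_offset() with the expected length
after MPI_Waitall returns. A shorter value for which ends_on_page_bound() is
true would match the symptom described in (4), and above_4g() on the buffer
address would show whether (3) also holds. A payload that legitimately ends
in the marker byte would give a false positive, so the marker should be
chosen with the test data in mind.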


