[mvapich-discuss] uninitialized struct member leading to MVAPICH 1.0 segfault?

David_Kewley at Dell.com David_Kewley at Dell.com
Fri Jul 18 17:32:56 EDT 2008


Rahul,

Thanks for your detailed analysis.  It looks like we are looking at
different versions of free_2level_comm().  Mine (mvapich-1.0) has no
mention of shmem_coll_ok whatsoever, but I see that mvapich-1.0.1 has
the check on shmem_coll_ok, which I agree should avoid this  issue.

It appears that 1.0.1 fixed this particular problem that exists in 1.0.

Looks like it's time for us to figure out how to organize an upgrade.
We're generally a bit slow in order to minimize churn for our users.

How does one best avoid or detect this class of problems in general
(using uninitialized variables)?  Seems too much to ask the programmer
to be careful about each and every possible instance of this problem in
detail.  Is memset(x,0,sizeof(x)) the general answer?

Thanks!
David

> -----Original Message-----
> From: rahul kumar [mailto:kumarra at cse.ohio-state.edu]
> Sent: Friday, July 18, 2008 10:49 AM
> To: Kewley, David
> Cc: mvapich-discuss at cse.ohio-state.edu
> Subject: Re: [mvapich-discuss] uninitialized struct member leading to
> MVAPICH 1.0 segfault?
> 
> Hi David,
> Thanks for reporting this and for your analysis.
> You are correct that the variable shmem_comm_rank is not being
initialized
> in else part of the below statements
>     if (shmem_comm_count < shmem_coll_blocks){
>         shmem_ptr->shmem_comm_rank = shmem_comm_count;
>         input_flag = 1;
>     }
>     else{
>         input_flag = 0;
>     }
> However, that actually might not be required. Although shmem_comm_rank
is
> not initialized but input_flag is set 0. If you follow the variable
> input_flag in the following statements:
>     MPI_Allreduce(&input_flag, &output_flag, 1, MPI_INT, MPI_LAND,
> comm_ptr->self);
> 
> The above statement would set output_flag as 0 if any one of the
processes
> has input_flag as 0. Based on that the variable shmem_coll_ok is set
to 0
> in the following part of the code in the same create_2level_comm()
> function.
>     if (output_flag == 1){
>         comm_ptr->shmem_coll_ok = 1;
>     }
>     else{
>         comm_ptr->shmem_coll_ok = 0;
>     }
> 
> If you see in the free_2level_comm() function at the place where the
> variable shmem_comm_rank is dereferenced. The dereferencing happens
only
> when shmem_coll_ok is 1 which would not be in our case.
>         if ((my_local_id == 0)&&(comm_ptr->shmem_coll_ok == 1)){
>             pthread_spin_lock(&shmem_coll->shmem_coll_lock);
>             shmem_coll_obj.shmem_avail[shmem_ptr->shmem_comm_rank] =
1;
>             pthread_spin_unlock(&shmem_coll->shmem_coll_lock);
>         }
> 
> So therefore, not initializing the variable shmem_comm_rank should not
> cause a problem.
> 
> If you could send us a reproducer and/or backtrace of the segfault. We
> will be happy to help you.
> 
> Regards,
> rahul.
> 
> 
> On Thu, 17 Jul 2008 David_Kewley at Dell.com wrote:
> 
> > I have an MVAPICH 1.0 program segfaulting, and I think I may have
traced
> > it
> > back to MVAPICH's failure to initialize a struct member before using
it.
> > We
> > are testing a speculative fix right now.  The full story  follows;
let
> > me
> > know what you think.
> >
> > struct MPI_COMMUNICATOR member shmem_comm_rank is only set in one
place
> > as far
> > as I can see:
> >
> > src/context/create_2level_comm.c:
> >
> >     100 void create_2level_comm (struct MPIR_COMMUNICATOR* comm_ptr,
int
> > size, int my_rank){
> >     ...
> >     208     if (shmem_comm_count < shmem_coll_blocks){
> >     209         shmem_ptr->shmem_comm_rank = shmem_comm_count;
> >     210         input_flag = 1;
> >     211     }
> >     212     else{
> >     213         input_flag = 0;
> >     214     }
> >     ...
> >     277 }
> >
> > Note that shmem_comm_rank is set only if the condition holds; if the
> > condition
> > does not hold, then the value of shmem_comm_rank is whatever
happened to
> > be
> > in memory at that point.  So, what might that value be?
> >
> > Best I can figure out, memory for a struct MPIR_COMMUNICATOR is
always
> > allocated using malloc().  My manpage for malloc says that malloc()
does
> > not
> > clear the memory it allocates, which I take to mean it does not set
the
> > memory contents to zero, but simply leaves it as it was.  So if
malloc()
> >
> > chooses to allocate memory which was previously free()'d, then the
> > memory
> > handed to the requester may have inappropriate, nonzero data in it.
I
> > do not
> > know for sure what happens if the memory happens to be freshly
granted
> > by the
> > kernel, but I suspect in this case it is guaranteed to be zeroed by
the
> > kernel.
> >
> > So...  If the condition (shmem_comm_count < shmem_coll_blocks) does
not
> > hold,
> > then shmem_comm_rank is not initialized.  If it is later referenced,
its
> >
> > value is meaningless and may lead to an error.
> >
> > I believe that is what is happening to us; the major unknown at this
> > point is
> > whether we are in fact hitting the "else" part of the above clause.
I'd
> > love
> > your  comments about what is likely the case, and how we can tell
> > without
> > doing a printf() or similar. :)
> >
> > Eventually we see a segfault in free_2level_comm():
> >
> > src/context/create_2level_comm.c:
> >
> >      62 void free_2level_comm (struct MPIR_COMMUNICATOR* comm_ptr)
> >      63 {
> >     ...
> >      87     if (comm_ptr->shmem_comm != MPI_COMM_NULL)  {
> >      88             struct MPIR_COMMUNICATOR* shmem_ptr;
> >      89             shmem_ptr=
MPIR_GET_COMM_PTR(comm_ptr->shmem_comm);
> >      90             pthread_spin_lock(&shmem_coll->shmem_coll_lock);
> >      91
> > shmem_coll_obj.shmem_avail[shmem_ptr->shmem_comm_rank] = 1;
> >      92
pthread_spin_unlock(&shmem_coll->shmem_coll_lock);
> >      93             MPI_Comm_free(&(comm_ptr->shmem_comm));
> >      94     }
> >     ...
> >      98 }
> >
> > The segfault happens at line 91, because it appears that
> > shmem_ptr->shmem_comm_rank is a large negative number.  I suspect in
> > fact
> > shmem_comm_rank was never initialized (see above), which means the
> > negative
> > number is an "accidental" value [1].
> >
> > We only see this segfault in around 1 out of 20 runs of a particular
> > application.  I suspect the ~1/20 hit rate is simply accidents of
how
> > memory
> > gets allocated in each run.  Sometimes shmem_ptr->shmem_comm_rank
> > probably
> > happens to sit in a memory location that contains 0, so the above
code
> > does
> > not cause a segfault.
> >
> > I suspect the fact that we've only noticed this in one code may be
an
> > accident; I do not assume it is significant.  We may not have
visibility
> > to
> > whether other codes are hitting this segfault mechanism.
> >
> > Do you agree that this failure to initialize shmem_comm_rank is a
bug?
> > If so,
> > probably the right fix is to add "shmem_ptr->shmem_comm_rank = 0;"
to
> > the "else" clause in the first code snippet above.  Would you agree?
> > That is
> > the fix we are testing right now.  Or should it be done in a
> > structure-initialization operation somehow?
> >
> > Mind you, I don't know whether it is *semantically* correct to set
> > shmem_comm_rank to 0 by default.  I am doing it simply because it
> > replicates
> > the likely common case (~19 out of 20 runs) where the contents of
that
> > memory
> > location often just happen to be cleared to zero.
> >
> > Finding this bug raises a question: How do we guarantee that there
are
> > not
> > other unrecognized problems like this one?  How to we check for use
of
> > uninitialized variables (e.g. structure members) allocated by
malloc()?
> > Is
> > it best practice to do a memset(x, 0, sizeof(x))?  This is a
C-coding
> > best-practices question, and also a question about how MPICH and
MVAPICH
> > are
> > coded.
> >
> > Thanks,
> > David
> >
> >
> > [1] On x86_64 an int is 4 bytes and a pointer is 8 bytes.  Looking
at
> > the
> > contents of the 8 bytes starting at &(shmem_ptr->shmem_comm_rank),
they
> > appear to be a valid pointer value similar to other pointer values I
see
> > in
> > this core dump.  I do not know what this pointer points to (or
pointed
> > to in
> > the past).  We get shmem_comm_rank interpreted as a large negative
> > number
> > simply because the MSbit of the first four bytes happens to be set.
> >
> > I think it is incontrovertible that these eight bytes hold a pointer
> > value
> > that was at some point valid.  This value could have been written to
> > memory
> > before the *MPIR_COMMUNICATOR was allocated (presumably part of an
> > object
> > that was free()'d).  This is the hypothesis I explore above.
> >
> > It's also possible that this pointer was written to those eight
bytes
> > *after*
> > the *MPIR_COMMUNICATOR was created.  That is, someone is stomping on
our
> >
> > structure.  If that is the case, we should still see segfaults after
> > fixing
> > the failure to initialize shmem_comm_rank.  We're doing runs right
now
> > in
> > which shmem_comm_rank is also initialized (to 0) in the "else"
clause,
> > to
> > check this possibility.
> >
> > The final possibility is that a legitimate user of this structure is
> > writing
> > this pointer value inappropriately.  I think this is very unlikely,
> > assuming
> > this problem is not caused by a compiler bug, because the source
code
> > only
> > writes to shmem_coll_rank in one place that I can see, and the code
> > logically
> > can only write an integer value.
> >
> > Regardless of the outcome of those tests, however, it is definitely
a
> > bug not
> > to initialize shmem_comm_rank before it is used, unless I'm missing
> > something.
> >
> >
> > David Kewley
> > Dell Infrastructure Consulting Services
> > Onsite Engineer at the Maui HPC Center
> > Cell: 602-460-7617
> > David_Kewley at Dell.com
> >
> > Dell Services: http://www.dell.com/services/
> > How am I doing? Email my manager Russell_Kelly at Dell.com with any
> > feedback.
> >
> >
> > _______________________________________________
> > mvapich-discuss mailing list
> > mvapich-discuss at cse.ohio-state.edu
> > http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
> >




More information about the mvapich-discuss mailing list