[mvapich-discuss] uninitialized struct member leading to MVAPICH 1.0 segfault?

rahul kumar kumarra at cse.ohio-state.edu
Fri Jul 18 16:49:16 EDT 2008


Hi David,
Thanks for reporting this and for your analysis.
You are correct that the variable shmem_comm_rank is not being initialized
in else part of the below statements
    if (shmem_comm_count < shmem_coll_blocks){
        shmem_ptr->shmem_comm_rank = shmem_comm_count;
        input_flag = 1;
    }
    else{
        input_flag = 0;
    }
However, that actually might not be required. Although shmem_comm_rank is
not initialized but input_flag is set 0. If you follow the variable
input_flag in the following statements:
    MPI_Allreduce(&input_flag, &output_flag, 1, MPI_INT, MPI_LAND,
comm_ptr->self);

The above statement would set output_flag as 0 if any one of the processes
has input_flag as 0. Based on that the variable shmem_coll_ok is set to 0
in the following part of the code in the same create_2level_comm()
function.
    if (output_flag == 1){
        comm_ptr->shmem_coll_ok = 1;
    }
    else{
        comm_ptr->shmem_coll_ok = 0;
    }

If you see in the free_2level_comm() function at the place where the
variable shmem_comm_rank is dereferenced. The dereferencing happens only
when shmem_coll_ok is 1 which would not be in our case.
        if ((my_local_id == 0)&&(comm_ptr->shmem_coll_ok == 1)){
            pthread_spin_lock(&shmem_coll->shmem_coll_lock);
            shmem_coll_obj.shmem_avail[shmem_ptr->shmem_comm_rank] = 1;
            pthread_spin_unlock(&shmem_coll->shmem_coll_lock);
        }

So therefore, not initializing the variable shmem_comm_rank should not
cause a problem.

If you could send us a reproducer and/or backtrace of the segfault. We
will be happy to help you.

Regards,
rahul.


On Thu, 17 Jul 2008 David_Kewley at Dell.com wrote:

> I have an MVAPICH 1.0 program segfaulting, and I think I may have traced
> it
> back to MVAPICH's failure to initialize a struct member before using it.
> We
> are testing a speculative fix right now.  The full story  follows; let
> me
> know what you think.
>
> struct MPI_COMMUNICATOR member shmem_comm_rank is only set in one place
> as far
> as I can see:
>
> src/context/create_2level_comm.c:
>
>     100 void create_2level_comm (struct MPIR_COMMUNICATOR* comm_ptr, int
> size, int my_rank){
>     ...
>     208     if (shmem_comm_count < shmem_coll_blocks){
>     209         shmem_ptr->shmem_comm_rank = shmem_comm_count;
>     210         input_flag = 1;
>     211     }
>     212     else{
>     213         input_flag = 0;
>     214     }
>     ...
>     277 }
>
> Note that shmem_comm_rank is set only if the condition holds; if the
> condition
> does not hold, then the value of shmem_comm_rank is whatever happened to
> be
> in memory at that point.  So, what might that value be?
>
> Best I can figure out, memory for a struct MPIR_COMMUNICATOR is always
> allocated using malloc().  My manpage for malloc says that malloc() does
> not
> clear the memory it allocates, which I take to mean it does not set the
> memory contents to zero, but simply leaves it as it was.  So if malloc()
>
> chooses to allocate memory which was previously free()'d, then the
> memory
> handed to the requester may have inappropriate, nonzero data in it.  I
> do not
> know for sure what happens if the memory happens to be freshly granted
> by the
> kernel, but I suspect in this case it is guaranteed to be zeroed by the
> kernel.
>
> So...  If the condition (shmem_comm_count < shmem_coll_blocks) does not
> hold,
> then shmem_comm_rank is not initialized.  If it is later referenced, its
>
> value is meaningless and may lead to an error.
>
> I believe that is what is happening to us; the major unknown at this
> point is
> whether we are in fact hitting the "else" part of the above clause.  I'd
> love
> your  comments about what is likely the case, and how we can tell
> without
> doing a printf() or similar. :)
>
> Eventually we see a segfault in free_2level_comm():
>
> src/context/create_2level_comm.c:
>
>      62 void free_2level_comm (struct MPIR_COMMUNICATOR* comm_ptr)
>      63 {
>     ...
>      87     if (comm_ptr->shmem_comm != MPI_COMM_NULL)  {
>      88             struct MPIR_COMMUNICATOR* shmem_ptr;
>      89             shmem_ptr= MPIR_GET_COMM_PTR(comm_ptr->shmem_comm);
>      90             pthread_spin_lock(&shmem_coll->shmem_coll_lock);
>      91
> shmem_coll_obj.shmem_avail[shmem_ptr->shmem_comm_rank] = 1;
>      92             pthread_spin_unlock(&shmem_coll->shmem_coll_lock);
>      93             MPI_Comm_free(&(comm_ptr->shmem_comm));
>      94     }
>     ...
>      98 }
>
> The segfault happens at line 91, because it appears that
> shmem_ptr->shmem_comm_rank is a large negative number.  I suspect in
> fact
> shmem_comm_rank was never initialized (see above), which means the
> negative
> number is an "accidental" value [1].
>
> We only see this segfault in around 1 out of 20 runs of a particular
> application.  I suspect the ~1/20 hit rate is simply accidents of how
> memory
> gets allocated in each run.  Sometimes shmem_ptr->shmem_comm_rank
> probably
> happens to sit in a memory location that contains 0, so the above code
> does
> not cause a segfault.
>
> I suspect the fact that we've only noticed this in one code may be an
> accident; I do not assume it is significant.  We may not have visibility
> to
> whether other codes are hitting this segfault mechanism.
>
> Do you agree that this failure to initialize shmem_comm_rank is a bug?
> If so,
> probably the right fix is to add "shmem_ptr->shmem_comm_rank = 0;" to
> the "else" clause in the first code snippet above.  Would you agree?
> That is
> the fix we are testing right now.  Or should it be done in a
> structure-initialization operation somehow?
>
> Mind you, I don't know whether it is *semantically* correct to set
> shmem_comm_rank to 0 by default.  I am doing it simply because it
> replicates
> the likely common case (~19 out of 20 runs) where the contents of that
> memory
> location often just happen to be cleared to zero.
>
> Finding this bug raises a question: How do we guarantee that there are
> not
> other unrecognized problems like this one?  How to we check for use of
> uninitialized variables (e.g. structure members) allocated by malloc()?
> Is
> it best practice to do a memset(x, 0, sizeof(x))?  This is a C-coding
> best-practices question, and also a question about how MPICH and MVAPICH
> are
> coded.
>
> Thanks,
> David
>
>
> [1] On x86_64 an int is 4 bytes and a pointer is 8 bytes.  Looking at
> the
> contents of the 8 bytes starting at &(shmem_ptr->shmem_comm_rank), they
> appear to be a valid pointer value similar to other pointer values I see
> in
> this core dump.  I do not know what this pointer points to (or pointed
> to in
> the past).  We get shmem_comm_rank interpreted as a large negative
> number
> simply because the MSbit of the first four bytes happens to be set.
>
> I think it is incontrovertible that these eight bytes hold a pointer
> value
> that was at some point valid.  This value could have been written to
> memory
> before the *MPIR_COMMUNICATOR was allocated (presumably part of an
> object
> that was free()'d).  This is the hypothesis I explore above.
>
> It's also possible that this pointer was written to those eight bytes
> *after*
> the *MPIR_COMMUNICATOR was created.  That is, someone is stomping on our
>
> structure.  If that is the case, we should still see segfaults after
> fixing
> the failure to initialize shmem_comm_rank.  We're doing runs right now
> in
> which shmem_comm_rank is also initialized (to 0) in the "else" clause,
> to
> check this possibility.
>
> The final possibility is that a legitimate user of this structure is
> writing
> this pointer value inappropriately.  I think this is very unlikely,
> assuming
> this problem is not caused by a compiler bug, because the source code
> only
> writes to shmem_coll_rank in one place that I can see, and the code
> logically
> can only write an integer value.
>
> Regardless of the outcome of those tests, however, it is definitely a
> bug not
> to initialize shmem_comm_rank before it is used, unless I'm missing
> something.
>
>
> David Kewley
> Dell Infrastructure Consulting Services
> Onsite Engineer at the Maui HPC Center
> Cell: 602-460-7617
> David_Kewley at Dell.com
>
> Dell Services: http://www.dell.com/services/
> How am I doing? Email my manager Russell_Kelly at Dell.com with any
> feedback.
>
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>



More information about the mvapich-discuss mailing list