[mvapich-discuss] uninitialized struct member leading to MVAPICH
1.0 segfault?
rahul kumar
kumarra at cse.ohio-state.edu
Fri Jul 18 16:49:16 EDT 2008
Hi David,
Thanks for reporting this and for your analysis.
You are correct that the variable shmem_comm_rank is not being initialized
in else part of the below statements
if (shmem_comm_count < shmem_coll_blocks){
shmem_ptr->shmem_comm_rank = shmem_comm_count;
input_flag = 1;
}
else{
input_flag = 0;
}
However, that actually might not be required. Although shmem_comm_rank is
not initialized but input_flag is set 0. If you follow the variable
input_flag in the following statements:
MPI_Allreduce(&input_flag, &output_flag, 1, MPI_INT, MPI_LAND,
comm_ptr->self);
The above statement would set output_flag as 0 if any one of the processes
has input_flag as 0. Based on that the variable shmem_coll_ok is set to 0
in the following part of the code in the same create_2level_comm()
function.
if (output_flag == 1){
comm_ptr->shmem_coll_ok = 1;
}
else{
comm_ptr->shmem_coll_ok = 0;
}
If you see in the free_2level_comm() function at the place where the
variable shmem_comm_rank is dereferenced. The dereferencing happens only
when shmem_coll_ok is 1 which would not be in our case.
if ((my_local_id == 0)&&(comm_ptr->shmem_coll_ok == 1)){
pthread_spin_lock(&shmem_coll->shmem_coll_lock);
shmem_coll_obj.shmem_avail[shmem_ptr->shmem_comm_rank] = 1;
pthread_spin_unlock(&shmem_coll->shmem_coll_lock);
}
So therefore, not initializing the variable shmem_comm_rank should not
cause a problem.
If you could send us a reproducer and/or backtrace of the segfault. We
will be happy to help you.
Regards,
rahul.
On Thu, 17 Jul 2008 David_Kewley at Dell.com wrote:
> I have an MVAPICH 1.0 program segfaulting, and I think I may have traced
> it
> back to MVAPICH's failure to initialize a struct member before using it.
> We
> are testing a speculative fix right now. The full story follows; let
> me
> know what you think.
>
> struct MPI_COMMUNICATOR member shmem_comm_rank is only set in one place
> as far
> as I can see:
>
> src/context/create_2level_comm.c:
>
> 100 void create_2level_comm (struct MPIR_COMMUNICATOR* comm_ptr, int
> size, int my_rank){
> ...
> 208 if (shmem_comm_count < shmem_coll_blocks){
> 209 shmem_ptr->shmem_comm_rank = shmem_comm_count;
> 210 input_flag = 1;
> 211 }
> 212 else{
> 213 input_flag = 0;
> 214 }
> ...
> 277 }
>
> Note that shmem_comm_rank is set only if the condition holds; if the
> condition
> does not hold, then the value of shmem_comm_rank is whatever happened to
> be
> in memory at that point. So, what might that value be?
>
> Best I can figure out, memory for a struct MPIR_COMMUNICATOR is always
> allocated using malloc(). My manpage for malloc says that malloc() does
> not
> clear the memory it allocates, which I take to mean it does not set the
> memory contents to zero, but simply leaves it as it was. So if malloc()
>
> chooses to allocate memory which was previously free()'d, then the
> memory
> handed to the requester may have inappropriate, nonzero data in it. I
> do not
> know for sure what happens if the memory happens to be freshly granted
> by the
> kernel, but I suspect in this case it is guaranteed to be zeroed by the
> kernel.
>
> So... If the condition (shmem_comm_count < shmem_coll_blocks) does not
> hold,
> then shmem_comm_rank is not initialized. If it is later referenced, its
>
> value is meaningless and may lead to an error.
>
> I believe that is what is happening to us; the major unknown at this
> point is
> whether we are in fact hitting the "else" part of the above clause. I'd
> love
> your comments about what is likely the case, and how we can tell
> without
> doing a printf() or similar. :)
>
> Eventually we see a segfault in free_2level_comm():
>
> src/context/create_2level_comm.c:
>
> 62 void free_2level_comm (struct MPIR_COMMUNICATOR* comm_ptr)
> 63 {
> ...
> 87 if (comm_ptr->shmem_comm != MPI_COMM_NULL) {
> 88 struct MPIR_COMMUNICATOR* shmem_ptr;
> 89 shmem_ptr= MPIR_GET_COMM_PTR(comm_ptr->shmem_comm);
> 90 pthread_spin_lock(&shmem_coll->shmem_coll_lock);
> 91
> shmem_coll_obj.shmem_avail[shmem_ptr->shmem_comm_rank] = 1;
> 92 pthread_spin_unlock(&shmem_coll->shmem_coll_lock);
> 93 MPI_Comm_free(&(comm_ptr->shmem_comm));
> 94 }
> ...
> 98 }
>
> The segfault happens at line 91, because it appears that
> shmem_ptr->shmem_comm_rank is a large negative number. I suspect in
> fact
> shmem_comm_rank was never initialized (see above), which means the
> negative
> number is an "accidental" value [1].
>
> We only see this segfault in around 1 out of 20 runs of a particular
> application. I suspect the ~1/20 hit rate is simply accidents of how
> memory
> gets allocated in each run. Sometimes shmem_ptr->shmem_comm_rank
> probably
> happens to sit in a memory location that contains 0, so the above code
> does
> not cause a segfault.
>
> I suspect the fact that we've only noticed this in one code may be an
> accident; I do not assume it is significant. We may not have visibility
> to
> whether other codes are hitting this segfault mechanism.
>
> Do you agree that this failure to initialize shmem_comm_rank is a bug?
> If so,
> probably the right fix is to add "shmem_ptr->shmem_comm_rank = 0;" to
> the "else" clause in the first code snippet above. Would you agree?
> That is
> the fix we are testing right now. Or should it be done in a
> structure-initialization operation somehow?
>
> Mind you, I don't know whether it is *semantically* correct to set
> shmem_comm_rank to 0 by default. I am doing it simply because it
> replicates
> the likely common case (~19 out of 20 runs) where the contents of that
> memory
> location often just happen to be cleared to zero.
>
> Finding this bug raises a question: How do we guarantee that there are
> not
> other unrecognized problems like this one? How to we check for use of
> uninitialized variables (e.g. structure members) allocated by malloc()?
> Is
> it best practice to do a memset(x, 0, sizeof(x))? This is a C-coding
> best-practices question, and also a question about how MPICH and MVAPICH
> are
> coded.
>
> Thanks,
> David
>
>
> [1] On x86_64 an int is 4 bytes and a pointer is 8 bytes. Looking at
> the
> contents of the 8 bytes starting at &(shmem_ptr->shmem_comm_rank), they
> appear to be a valid pointer value similar to other pointer values I see
> in
> this core dump. I do not know what this pointer points to (or pointed
> to in
> the past). We get shmem_comm_rank interpreted as a large negative
> number
> simply because the MSbit of the first four bytes happens to be set.
>
> I think it is incontrovertible that these eight bytes hold a pointer
> value
> that was at some point valid. This value could have been written to
> memory
> before the *MPIR_COMMUNICATOR was allocated (presumably part of an
> object
> that was free()'d). This is the hypothesis I explore above.
>
> It's also possible that this pointer was written to those eight bytes
> *after*
> the *MPIR_COMMUNICATOR was created. That is, someone is stomping on our
>
> structure. If that is the case, we should still see segfaults after
> fixing
> the failure to initialize shmem_comm_rank. We're doing runs right now
> in
> which shmem_comm_rank is also initialized (to 0) in the "else" clause,
> to
> check this possibility.
>
> The final possibility is that a legitimate user of this structure is
> writing
> this pointer value inappropriately. I think this is very unlikely,
> assuming
> this problem is not caused by a compiler bug, because the source code
> only
> writes to shmem_coll_rank in one place that I can see, and the code
> logically
> can only write an integer value.
>
> Regardless of the outcome of those tests, however, it is definitely a
> bug not
> to initialize shmem_comm_rank before it is used, unless I'm missing
> something.
>
>
> David Kewley
> Dell Infrastructure Consulting Services
> Onsite Engineer at the Maui HPC Center
> Cell: 602-460-7617
> David_Kewley at Dell.com
>
> Dell Services: http://www.dell.com/services/
> How am I doing? Email my manager Russell_Kelly at Dell.com with any
> feedback.
>
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>
More information about the mvapich-discuss
mailing list