[mvapich-discuss] Memory error detected by TotalView and Valgrind in MV2-2.1

Adam T. Moody moody20 at llnl.gov
Mon Feb 8 21:13:48 EST 2016


I should add that our custom PMI library bails out with an error if you 
call PMI_KVS_Get_value_length_max() before calling PMI_Init().  In that 
case PMI_KVS_Get_value_length_max() returns PMI_ERR_INIT and does not 
modify the output parameter val.  Since val is never initialized, it can 
hold an arbitrary value (including a negative one), and the malloc is 
then called with a garbage size.

I see that the MV2 code does not check the PMI return code here.
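
To illustrate the kind of check I mean, here is a minimal standalone 
sketch.  It does not use the real PMI or MV2 code: the sketch_* names, 
the return-code values, and the 1024 length are all made up for the 
example, and the stub only mimics the behavior of our library described 
above (error before init, output parameter left untouched).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for PMI return codes (illustrative values only). */
#define SKETCH_PMI_SUCCESS  0
#define SKETCH_PMI_ERR_INIT 1

/* Hypothetical flag standing in for "PMI_Init() has been called". */
static int sketch_pmi_initialized = 0;

/* Hypothetical stub mimicking our library's behavior: before init it
 * returns an error and leaves *length untouched. */
int sketch_kvs_get_value_length_max(int *length)
{
    assert(length != NULL);
    if (!sketch_pmi_initialized)
        return SKETCH_PMI_ERR_INIT;
    *length = 1024;  /* arbitrary illustrative maximum */
    return SKETCH_PMI_SUCCESS;
}

/* Allocate the buffer only if the length query succeeded and the
 * reported length is non-negative.  Returns NULL otherwise. */
char *sketch_alloc_failed_procs_string(void)
{
    int val = 0;  /* initialized, so a failed query cannot leave garbage */
    int rc = sketch_kvs_get_value_length_max(&val);

    if (rc != SKETCH_PMI_SUCCESS || val < 0) {
        fprintf(stderr, "length query failed (rc=%d, val=%d)\n", rc, val);
        return NULL;
    }
    return malloc(sizeof(char) * ((size_t)val + 1));
}
```

With a check along these lines, calling the allocation helper before 
init yields NULL instead of a malloc with an arbitrary size.  How MV2 
should actually handle the error is of course up to you.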
-Adam


Adam T. Moody wrote:

> Hello MVAPICH team,
> We have two different memory debugging tools pointing to an error 
> around line 299 in src/mpid/ch3/src/mpid_init.c:
>
>    /* Create the string that will cache the last group of failed processes
>     * we received from PMI */
>    UPMI_KVS_GET_VALUE_LENGTH_MAX(&val);
>    MPIDI_failed_procs_string = MPIU_Malloc(sizeof(char) * (val+1));
>
> Both tools are reporting that malloc is being called with a large 
> negative value, implying that val is negative here.
>
> We have a custom PMI library, and I tracked this down to an issue 
> where PMI_KVS_Get_value_length_max() is being called before PMI_Init().
>
> Do you know if that is valid in PMI?
> -Adam



