[mvapich-discuss] Request to detect need for --with-ch3-rank-bits=32

Adam T. Moody moody20 at llnl.gov
Tue Sep 6 20:28:10 EDT 2016


Hi Jonathan,
I didn't have a chance to look at the code.  Does MV2-2.2rc2 now include 
a catch for this problem?
Thanks,
-Adam


On 06/16/2016 12:20 PM, Jonathan Perkins wrote:
> Hi Adam, that sounds like a good idea.  We'll look into it and check
> with MPICH as well.
>
> On Thu, Jun 16, 2016 at 2:17 PM Adam T. Moody <moody20 at llnl.gov> wrote:
>
>> Hi MVAPICH team,
>> We've got a new system that we're bringing online with 80k+ cores.  I
>> hit a hang in the first call to MPI_Gather in mpiBench above a certain
>> node count.  After a binary search, I found that things ran fine at
>> 32768 procs but hung at 32769 procs or larger.  This suggested we were
>> overflowing some bit field, and that led me to CH3_RANK_BITS, which
>> apparently defaults to 16 bits unless you pass the
>> --with-ch3-rank-bits=32 flag to configure.
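>>
>> If the 16-bit rank field is effectively signed (which would explain the
>> numbers above), the largest rank it can hold is 2^15 - 1 = 32767, so
>> 32768 processes (ranks 0..32767) just fit while 32769 do not.  A tiny
>> standalone demo of that wraparound, purely for illustration:
>>
>> #include <stdio.h>
>> #include <stdint.h>
>>
>> int main(void)
>> {
>>     int16_t rank = 32767;        /* largest value a signed 16-bit field can hold */
>>     rank = (int16_t)(rank + 1);  /* 32768 wraps to -32768 on typical systems */
>>     printf("%d\n", rank);        /* prints -32768 */
>>     return 0;
>> }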
>>
>> I think it's fine to default to 16 bits here, since most users will not
>> need the larger rank count, and I'm guessing there could be some
>> performance penalty when using 32 bits (if not, perhaps just bump the
>> default to 32).
>>
>> It would be helpful to detect this problem in MPI and throw a fatal
>> error pointing users to the option.  Would you please add a patch like
>> the following:
>>
>> #include "mpichconf.h"
>>
>> #if CH3_RANK_BITS == 16
>> if (numprocs > 32768) {
>>    // inform user about --with-ch3-rank-bits=32 configure option
>>    // bail out with fatal error (it's not going to work anyway)
>> }
>> #endif
>>
>> This could go in MPI_Init, although to handle dynamic process support it
>> really belongs in communicator creation.
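>>
>> As a slightly more concrete sketch (the helper name check_rank_bits and
>> the way numprocs is obtained are placeholders, not actual MVAPICH
>> internals):
>>
>> #include <stdio.h>
>> #include <stdlib.h>
>> #include "mpichconf.h"   /* defines CH3_RANK_BITS at configure time */
>>
>> /* hypothetical helper, called from MPI_Init or communicator creation */
>> static void check_rank_bits(int numprocs)
>> {
>> #if CH3_RANK_BITS == 16
>>     /* a 16-bit rank field cannot represent ranks beyond 32767 */
>>     if (numprocs > 32768) {
>>         fprintf(stderr,
>>                 "Fatal: %d processes exceed the 16-bit rank limit; "
>>                 "rebuild with --with-ch3-rank-bits=32\n", numprocs);
>>         abort();
>>     }
>> #endif
>> }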
>>
>> If upstream MPICH does not already have something like this, let's
>> also elevate this request up the chain.
>> Thanks!
>> -Adam
>> _______________________________________________
>> mvapich-discuss mailing list
>> mvapich-discuss at cse.ohio-state.edu
>> http://mailman.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>>


