Why do I need MV2_CPU_MAPPING? (was: Re: [mvapich-discuss] MV2_CPU_MAPPING doesn't work on all the nodes)

Craig Tierney craig.tierney at noaa.gov
Mon Jun 25 15:13:57 EDT 2012


On 6/25/12 12:55 PM, Devendar Bureddy wrote:
> Hi Craig
>
> We also do not recommend using MV2_CPU_MAPPING explicitly unless the
> user wants a specific layout of CPU affinity across all the nodes and
> does not want to modify the hostfile.  MVAPICH2 automatically decides
> the affinity internally, so users should not need to worry much about
> it.  There are also higher-level run-time options
> (MV2_CPU_BINDING_POLICY, MV2_CPU_BINDING_LEVEL) to change the default
> affinity settings if required.  There is certainly no need to
> explicitly specify affinity with MV2_CPU_MAPPING.
>
> -Devendar
>

Devendar,

Thanks for the quick response.  I am currently building MVAPICH2 1.8 for a new
Sandy Bridge cluster and will be testing all of this again.  We don't have many
hybrid codes, but I could never leave affinity up to MVAPICH2 (in the 1.4 to
1.6 era) and get good performance.  I had to write my own wrapper script to get
the processor and memory affinity correct.
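
The general idea was something like the following (a minimal sketch, not the
actual script; it assumes the launcher exports MV2_COMM_WORLD_LOCAL_RANK, that
the node is Linux so Python's os.sched_setaffinity is available, and a simple
contiguous-block layout):

    #!/usr/bin/env python3
    # Per-rank CPU pinning wrapper (illustrative sketch only).
    # Intended usage (hypothetical):
    #   mpirun_rsh ... ./pin_wrapper.py ./my_mpi_app args...
    import os
    import sys

    # Local rank on this node, as exported by the MVAPICH2 launcher.
    local_rank = int(os.environ.get("MV2_COMM_WORLD_LOCAL_RANK", "0"))
    threads = int(os.environ.get("OMP_NUM_THREADS", "1"))

    # Give each local rank a contiguous block of 'threads' cores.
    first = local_rank * threads
    os.sched_setaffinity(0, set(range(first, first + threads)))

    # Memory affinity is a separate step (e.g. prefixing numactl), which
    # this sketch leaves out.
    os.execvp(sys.argv[1], sys.argv[1:])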

When was MV2_CPU_BINDING_LEVEL added?  I am now starting to build 1.8 for the
new cluster, and we will backport the build to our older systems.  It seems
that the solution to my issue is to set MV2_CPU_BINDING_LEVEL to socket or
numanode.
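
To check what the library actually does on the new system, I will probably
just have each rank report its affinity mask right after MPI_Init; a minimal
sketch, assuming mpi4py built against this MVAPICH2 and a Linux node with
Python 3.3+ for os.sched_getaffinity:

    #!/usr/bin/env python3
    # Report each rank's CPU binding.  MVAPICH2 applies its affinity
    # settings during MPI_Init, which importing mpi4py triggers.
    import os
    import socket
    from mpi4py import MPI

    rank = MPI.COMM_WORLD.Get_rank()
    cores = sorted(os.sched_getaffinity(0))
    print("host=%s rank=%d cores=%s" % (socket.gethostname(), rank, cores))

Running that under different MV2_CPU_BINDING_LEVEL and MV2_CPU_BINDING_POLICY
settings should make it obvious what each rank ends up bound to.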

So it seems you fixed my problem before I even asked!  MV2_CPU_BINDING_LEVEL
looks like the answer.
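
On the scatter-versus-compact point from my original note (quoted below), the
difference is really just a small piece of layout arithmetic.  A minimal
sketch for a hypothetical node with 2 sockets of 8 cores each (cores 0-7 on
socket 0, 8-15 on socket 1); this is not MVAPICH2's code, just the idea:

    #!/usr/bin/env python3
    # Compare a compact layout (fill socket 0 first) with a scatter
    # layout (round-robin ranks across sockets) for a 2 x 8 core node.
    SOCKETS = 2
    CORES_PER_SOCKET = 8

    def compact(local_rank):
        # Consecutive ranks get consecutive cores on the same socket.
        return local_rank

    def scatter(local_rank):
        # Consecutive ranks alternate sockets, then fill each socket.
        socket_id = local_rank % SOCKETS
        slot = local_rank // SOCKETS
        return socket_id * CORES_PER_SOCKET + slot

    if __name__ == "__main__":
        for r in range(8):
            print("rank %d: compact -> core %2d, scatter -> core %2d"
                  % (r, compact(r), scatter(r)))

Scatter spreads ranks across sockets, which gives each rank more memory
bandwidth; compact keeps neighboring ranks on the same socket, which helps
codes that share cache between neighbors.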

Thanks,
Craig


> On Mon, Jun 25, 2012 at 1:45 PM, Craig Tierney <craig.tierney at noaa.gov> wrote:
>> Mvapich2 development team,
>>
>> This thread got me thinking again about one issue I have always had with
>> process affinity and MVAPICH2.  Why do we even need to use MV2_CPU_MAPPING?
>> For 99.9% of the cases I can think of for MPI applications, there is one
>> correct way to lay out processes.  If I have the machine file, I would know
>> what to do.  If I have OMP_NUM_THREADS, I would know what to do.  If for
>> some reason I need a different mapping on different hosts, I would be able
>> to deduce the correct layout from the machine file (which we do need to do,
>> and which MV2_CPU_MAPPING cannot support).
>>
>> The only rule I can think of is: distribute processes evenly across the
>> sockets, and keep memory affinity local.  The only other choice is whether
>> to use a scatter or a compact distribution, and that could be specified by
>> a variable.
>>
>> Of course there are many things that could complicate my "simple" view of
>> the world.  Besides launching multiple jobs on a node, what are they and how
>> often do they really happen?  For multiple jobs, you would probably throw
>> out affinity, because there would be no consistent way for a user to ensure
>> that two jobs are not stepping on each other (you would need a batch system
>> to do this).
>>
>> Other MPI stacks do not require the use of something like MV2_CPU_MAPPING
>> (Intel MPI and SGI MPI have methods built in).  Honestly, most of my users
>> wouldn't get the use of MV2_CPU_MAPPING right.  This is something the
>> system should be doing for them, not something they have to figure out for
>> themselves.
>>
>> Thanks,
>> Craig
>> _______________________________________________
>> mvapich-discuss mailing list
>> mvapich-discuss at cse.ohio-state.edu
>> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss

