Why do I need MV2_CPU_MAPPING? (was: Re: [mvapich-discuss]
MV2_CPU_MAPPING doesn't work on all the nodes)
Craig Tierney
craig.tierney at noaa.gov
Mon Jun 25 13:45:30 EDT 2012
MVAPICH2 development team,
This thread got me thinking again about an issue I have always had with process affinity and MVAPICH2: why do we even need MV2_CPU_MAPPING at all? For 99.9% of the MPI application
cases I can think of, there is one correct way to lay out processes. If I have the machine file, I know what to do. If I have OMP_NUM_THREADS, I know what to do. If for some reason
I need a different mapping on different hosts, I could deduce the correct layout from the machine file (which is something we do need to do, and which MV2_CPU_MAPPING cannot
support).
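To make the "I would know what to do" point concrete, here is a sketch (not MVAPICH2 code) of deriving a per-node mapping string in the MV2_CPU_MAPPING style, where colon-separated entries bind local rank i to the i-th core set, from just the local rank count and OMP_NUM_THREADS. The function name and the core-numbering assumptions are mine, for illustration only:

```python
def derive_mapping(local_ranks, threads_per_rank, cores_per_node):
    """Return a colon-separated core mapping, one entry per local rank.

    Assumes cores are numbered 0..cores_per_node-1 contiguously; a single
    core is written "c", a block of cores "first-last".
    """
    assert local_ranks * threads_per_rank <= cores_per_node
    slots = []
    for r in range(local_ranks):
        first = r * threads_per_rank
        last = first + threads_per_rank - 1
        slots.append(str(first) if first == last else f"{first}-{last}")
    return ":".join(slots)

print(derive_mapping(4, 1, 8))  # pure MPI, 4 ranks: 0:1:2:3
print(derive_mapping(2, 4, 8))  # hybrid, 4 threads/rank: 0-3:4-7
```

The point being: everything the user currently types by hand is a deterministic function of information the launcher already has.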
The only rule I can think of is: distribute processes evenly across sockets, and memory affinity should try to stay local. The only other choice is whether to use a scatter or a
compact distribution, and that could be specified by a single variable.
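A sketch of the two distributions just mentioned, assuming (purely for illustration) a node with two sockets of four cores each, cores 0-3 on socket 0 and 4-7 on socket 1:

```python
def assign_cores(nranks, sockets=2, cores_per_socket=4, policy="scatter"):
    """Return the core chosen for each local rank under the given policy."""
    cores = []
    for r in range(nranks):
        if policy == "compact":
            cores.append(r)  # fill socket 0 completely before socket 1
        else:  # "scatter": round-robin ranks across sockets
            socket = r % sockets
            slot = r // sockets
            cores.append(socket * cores_per_socket + slot)
    return cores

print(assign_cores(4, policy="compact"))  # [0, 1, 2, 3] - all on socket 0
print(assign_cores(4, policy="scatter"))  # [0, 4, 1, 5] - two per socket
```

Scatter spreads memory bandwidth across sockets; compact keeps communicating ranks close. One environment variable selecting between these two policies would cover the choice.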
Of course, many things could complicate my "simple" view of the world. But besides launching multiple jobs on a node, what are they, and how often do they really happen? For
multiple jobs you would probably throw out affinity anyway, because there is no consistent way for a user to ensure that two jobs are not stepping on each other (you would need the
batch system to do that).
Other MPI stacks do not require the use of something like MV2_CPU_MAPPING (Intel MPI and SGI MPI have built-in methods). Honestly, most of my users would not get MV2_CPU_MAPPING
right. This is something the system should be doing for them, not something they should have to figure out for themselves.
Thanks,
Craig