[mvapich-discuss] MV2_ENABLE_AFFINITY? MVAPICH2 2.1

Novosielski, Ryan novosirj at ca.rutgers.edu
Tue Dec 15 13:25:23 EST 2015


Hi all,

I'm using MVAPICH2 with SLURM's PMI2 interface, so I'm not using mpirun/mpiexec at all. One of my users is running GPU jobs that need very few CPUs, so he often doesn't use the whole node and frequently runs more than one job per node. MVAPICH2's affinity stubbornly binds these jobs to the same processors. The workaround is to turn affinity off (MV2_ENABLE_AFFINITY=0).
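
For concreteness, here is a minimal sketch of that workaround as it might appear in a job script. The binary name ./gpu_app and the task count are placeholders, not anything from my actual setup, and the srun guard is only there so the snippet is harmless when pasted on a machine without SLURM.

```shell
#!/bin/sh
# Disable MVAPICH2's built-in CPU binding for every process started from
# this shell; the variable must be exported so child processes inherit it.
export MV2_ENABLE_AFFINITY=0

# Launch through SLURM's PMI2 interface (no mpirun/mpiexec involved).
# ./gpu_app and "-n 2" are hypothetical placeholders.
if command -v srun >/dev/null 2>&1; then
    srun --mpi=pmi2 -n 2 ./gpu_app
else
    echo "srun not found; would run: srun --mpi=pmi2 -n 2 ./gpu_app"
fi
```

Setting the variable in the job script (rather than per command) means every MPI launch in that job sees the same setting.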

I have some questions about this: 

1) Is there any imaginable scenario where, running under SLURM, I would want this feature enabled? Should I look at disabling it system-wide, or when compiling MVAPICH2?
2) If MVAPICH2 can't tell that a processor is already being used at 100%, how can this feature ever work correctly? I'm just curious about the use case in a different setting. Is it simply not meant to handle two jobs co-existing on the same node?
3) I'd like this to be easy for the users. Should I just turn it off in the module that is loaded for MVAPICH2 to prevent this from being an issue?
4) Any thoughts on whether integrating cgroups with SLURM might solve the problem (e.g., SLURM won't even let MVAPICH2 see the other CPUs, so affinity is a non-issue)?
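
One way to observe the binding behind questions 2 and 4 is to ask the kernel which CPUs a process is allowed to run on. The helper below is only a sketch: the function name show_allowed_cpus is made up for illustration, and it assumes a Linux node where /proc/<pid>/status exposes the Cpus_allowed_list field.

```shell
#!/bin/sh
# Print the list of CPUs the kernel allows a process to run on.
# Argument: a PID (defaults to the current shell).
# show_allowed_cpus is a hypothetical helper name, not part of any tool.
show_allowed_cpus() {
    pid="${1:-$$}"
    # The status line looks like "Cpus_allowed_list:<TAB>0-7";
    # awk splits on whitespace, so field 2 is the CPU list.
    awk '/^Cpus_allowed_list:/ {print $2}' "/proc/$pid/status"
}

# Show the mask for the current shell; pass the PID of an MPI rank instead
# to compare two jobs that share a node.
show_allowed_cpus
```

If two ranks from different jobs report the same single CPU, they are pinned on top of each other; if the process is confined to a cpuset, the list should show only the CPUs in that set.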

I'd welcome any other advice from other sites about this.

--
____ *Note: UMDNJ is now Rutgers-Biomedical and Health Sciences*
 || \\UTGERS      |---------------------*O*---------------------
 ||_// Biomedical | Ryan Novosielski - Senior Technologist
 || \\ and Health | novosirj at rutgers.edu - 973/972.0922 (2x0922)
 ||  \\  Sciences | OIRT/High Perf & Res Comp - MSB C630, Newark
      `'


