[mvapich-discuss] Disabling CPU affinity in PSM

Adam T. Moody moody20 at llnl.gov
Mon Mar 2 12:51:10 EST 2015


Great.  Thanks, Hari.
-Adam

Hari Subramoni wrote:

>Hi Adam,
>
>Thanks for identifying the issue and providing the patch. We have taken it
>into the code base. It should be available with our upcoming RC2 release.
>
>Regards,
>Hari.
>
>On Fri, Feb 27, 2015 at 7:55 PM, Adam T. Moody <moody20 at llnl.gov> wrote:
>
>>Hello MVAPICH team,
>>I've attached a patch with two modifications for the PSM channel:
>>
>>   - disable PSM from setting CPU affinity
>>   - install PSM error handler to print more verbose error messages
>>
>>By default during psm_ep_open(), PSM sets CPU affinity on a process if
>>it's not already set.  However, the affinity assigned by PSM causes
>>problems, especially for singleton MPI jobs, i.e., those run without mpirun
>>or srun.  PSM binds each process based on its rank, and a singleton job's
>>only process is rank 0, so PSM binds every singleton job to core 0.  When
>>multiple singleton jobs run on the same node, they all end up bound to
>>that one core.
>>
>>Typically, people rely on a process launcher such as mpirun or srun to
>>set CPU affinity for each MPI process.  Otherwise, they are most likely
>>running singleton MPI jobs, in which case they probably don't want to bind
>>all such jobs to the same core.  Anyone who does want to bind a singleton
>>job can use a command like taskset or numactl, which gives full control
>>over which CPU the process is bound to.
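For readers who would rather pin a singleton job from inside the program than wrap it in taskset or numactl, here is a minimal sketch using the Linux sched_getaffinity()/sched_setaffinity() calls. It is Linux/glibc specific, and the helper names (first_allowed_cpu, bind_self_to_cpu, allowed_cpu_count) are made up for this illustration; they are not part of MVAPICH or PSM.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* needed for cpu_set_t and the CPU_* macros */
#endif
#include <sched.h>

/* Lowest-numbered CPU the caller is currently allowed to run on, or -1. */
int first_allowed_cpu(void)
{
    cpu_set_t mask;
    CPU_ZERO(&mask);
    if (sched_getaffinity(0, sizeof(mask), &mask) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &mask))
            return cpu;
    return -1;
}

/*
 * Restrict the calling thread to exactly one CPU. If this is called
 * before any other threads exist, it covers the whole process.
 * Returns 0 on success, -1 on failure.
 */
int bind_self_to_cpu(int cpu)
{
    cpu_set_t mask;
    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return -1;
    CPU_ZERO(&mask);
    CPU_SET(cpu, &mask);
    return sched_setaffinity(0, sizeof(mask), &mask);
}

/* Number of CPUs in the caller's affinity mask, or -1 on error. */
int allowed_cpu_count(void)
{
    cpu_set_t mask;
    CPU_ZERO(&mask);
    if (sched_getaffinity(0, sizeof(mask), &mask) != 0)
        return -1;
    return CPU_COUNT(&mask);
}
```

Calling bind_self_to_cpu() with a chosen CPU early in main(), before MPI_Init(), should have roughly the same effect as launching the job under taskset -c with that CPU, and allowed_cpu_count() gives a quick way to check whether something has already narrowed the mask.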
>>
>>The attached patch disables PSM affinity by specifying
>>PSM_EP_OPEN_AFFINITY_SKIP as an option during psm_ep_open().
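The effect of skipping affinity can be illustrated with a small stand-alone model. This is not PSM code: the enum values and the rule "core = local rank modulo core count" are assumptions chosen only to mirror the behavior described in this thread (bind by rank unless the process is already bound), and the model makes no PSM calls so it builds without the PSM headers.

```c
/* Hypothetical stand-ins for an affinity policy; NOT the PSM constants. */
enum affinity_policy { POLICY_SKIP, POLICY_SET_IF_UNBOUND };

/*
 * Core a process would be pinned to, or -1 if its affinity is left alone.
 * Assumed rule for illustration: core = local_rank % ncores.
 */
int pick_core(enum affinity_policy policy, int already_bound,
              int local_rank, int ncores)
{
    if (policy == POLICY_SKIP || already_bound)
        return -1;
    if (local_rank < 0 || ncores <= 0)
        return -1;
    return local_rank % ncores;
}

/*
 * Start njobs independent singleton jobs (each one is rank 0 and not
 * yet bound) and count how many of them end up pinned to core 0.
 */
int singletons_on_core0(enum affinity_policy policy, int njobs, int ncores)
{
    int hits = 0;
    for (int job = 0; job < njobs; job++)
        if (pick_core(policy, 0, 0, ncores) == 0)
            hits++;
    return hits;
}
```

Under this model, four singleton jobs on a 16-core node all land on core 0 with the bind-by-rank policy, while with the skip policy none of them are pinned and the launcher or the OS scheduler decides placement.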
>>
>>This patch also installs a PSM error handler to print more verbose PSM
>>error messages.  Currently, our error messages provide so little context
>>that we often see the same message printed for what may be many different
>>errors.  With this patch, each message includes an additional error string
>>with more information provided by PSM.
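The general shape of such a handler can be sketched as follows. This is a self-contained illustration of the callback pattern only, not the code from the patch: the callback type, register_err_handler() and report_error() are hypothetical stand-ins for the library side, so the sketch builds without the PSM headers.

```c
#include <stdio.h>

/* Hypothetical callback shape: an error code plus a detail string
 * supplied by the library that detected the error. */
typedef int (*err_handler_fn)(int code, const char *detail);

static err_handler_fn current_handler = NULL;
static char last_message[256];

/* Stand-in for the library's "install a handler" call. */
void register_err_handler(err_handler_fn fn)
{
    current_handler = fn;
}

/* Verbose handler: keeps the library's detail string in the message
 * instead of printing only a generic failure line. */
int verbose_handler(int code, const char *detail)
{
    snprintf(last_message, sizeof(last_message), "channel error %d: %s",
             code, detail ? detail : "(no detail)");
    fprintf(stderr, "%s\n", last_message);
    return code;
}

/* Stand-in for the library reporting an error through the handler. */
int report_error(int code, const char *detail)
{
    if (current_handler)
        return current_handler(code, detail);
    return code;
}

/* Most recent message formatted by verbose_handler(). */
const char *last_error_message(void)
{
    return last_message;
}
```

The point of the pattern is that two different failures produce two different messages, because the detail string comes from the layer that actually knows what went wrong.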
>>-Adam
>>
>>_______________________________________________
>>mvapich-discuss mailing list
>>mvapich-discuss at cse.ohio-state.edu
>>http://mailman.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss