[mvapich-discuss] Possibly undesirable mvapich "feature" (was Possible bug)

Lei Chai chai.15 at osu.edu
Thu Sep 25 13:28:11 EDT 2008


Yes, disabling CPU affinity will lead to the MPI tasks hopping among the
cores. If you are using an Intel platform the performance won't be affected
much. If you are using an AMD NUMA architecture there might be some
performance difference, since some cores may sometimes need to access
remote memory. It also depends on the application, such as its data access
patterns. So I don't have an accurate estimate, but I guess the difference
won't be too much.
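
For example (just a rough sketch, where the host file and application name
are placeholders and the exact mpirun_rsh syntax should be checked against
the MVAPICH user guide for your version), the variable can be given on the
launch command line:

  # let the OS scheduler place the ranks on whichever cores it likes
  mpirun_rsh -np 8 -hostfile ./machines VIADEV_USE_AFFINITY=0 ./your_app

With affinity disabled this way, two jobs sharing a node should no longer
both end up pinned to core 0.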

Lei


Laurence Marks wrote:
> Thanks. I thought something like this might be the answer.
>
> However, before I set this as a global option, will doing this lead to
> the MPI tasks on the compute nodes hopping among the cores and slowing
> the calculation down? (Since each job takes 60-90 minutes, testing
> different options is tedious.)
>
> On Thu, Sep 25, 2008 at 11:54 AM, Lei Chai <chai.15 at osu.edu> wrote:
>   
>> Hi Laurence,
>>
>> By default, MVAPICH uses CPU affinity and tries to use CPUs starting from
>> CPU 0. To solve your problem, there are two options:
>>
>> - Use the VIADEV_CPU_MAPPING env variable as you mentioned. Map different
>> MPI jobs to different CPU sets.
>> - Use the VIADEV_USE_AFFINITY=0 env variable to disable CPU affinity. The OS
>> will then schedule the MPI jobs on different CPUs.
>>
>> Hope this helps.
>>
>> Lei
>>
>>
>> Laurence Marks wrote:
>>     
>>> I think I may have partially resolved my previous problem
>>>
>>> (http://mail.cse.ohio-state.edu/pipermail/mvapich-discuss/2008-September/001920.html
>>> ), but not completely.
>>>
>>> One of the engineers at the company that sold me the cluster pointed
>>> out that the first node running the job (using 8 cores) was doing a
>>> little swapping, even though the MPI job itself was not requiring swap. I
>>> suspect that the I/O and other general OS tasks associated with
>>> communicating from the 1st core to all the others were leading to this
>>> and causing problems. I can resolve this by making my head node the first
>>> entry in the machines file, and then everything is OK.
>>>
>>> Unfortunately this leads to another problem. If I have two MPI jobs
>>> each using one core on the head node, instead of using separate cores
>>> they both use the same one! I suspect that this is a design feature,
>>> i.e. to use the first core unless something else has been specified
>>> with VIADEV_CPU_MAPPING or similar. I wonder if there is any way
>>> around this short of specifying different mappings for different jobs,
>>> which would become a bit of a nightmare since individual users (i.e.
>>> my students) would have to get it right. An alternative is running
>>> with 7 cores on the first machine to leave some free CPU for OS
>>> operations, but this is inefficient.
>>>
>>>
>>>       
>>     
>
>
>
>   
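
P.S. On the per-job mapping route: if two jobs each place one rank on the
head node, they can be launched with different VIADEV_CPU_MAPPING lists so
those ranks land on different cores. A rough sketch only (the host files and
program names are made up, and the exact list separator and semantics of the
mapping string are described in the MVAPICH user guide):

  # hypothetical job 1: its head-node rank is intended to land on core 0
  mpirun_rsh -np 8 -hostfile ./machines.job1 VIADEV_CPU_MAPPING=0,1,2,3,4,5,6,7 ./app1

  # hypothetical job 2: the rotated list is intended to put its head-node rank on core 1
  mpirun_rsh -np 8 -hostfile ./machines.job2 VIADEV_CPU_MAPPING=1,2,3,4,5,6,7,0 ./app2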


