[mvapich-discuss] Possibly undesirable mvapich "feature" (was Possible bug)

Laurence Marks L-marks at northwestern.edu
Thu Sep 25 13:09:49 EDT 2008


Thanks. I thought something like this might be the answer.

However, before I set VIADEV_USE_AFFINITY=0 as a global option: will
disabling affinity lead to the MPI tasks on the compute nodes hopping
among the cores and slowing the calculation down? (Each job takes 60-90
minutes, so testing different options is tedious.)
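
One way to check for hopping without waiting for a full run might be to
sample, for each MPI rank on a node, which cores it is allowed to use and
which core it last ran on. The following is only a sketch under stated
assumptions: it is Linux-only, the helper names are hypothetical, and it
reads the "processor" field (field 39) of /proc/<pid>/stat.

```python
import os


def last_cpu_from_stat(stat_line):
    """Return the CPU a task last ran on, parsed from a /proc/<pid>/stat line."""
    # The command name (field 2) is wrapped in parentheses and may contain
    # spaces, so split on the last ')' before splitting on whitespace.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state); "processor" is field 39, hence index 36.
    return int(rest[36])


def report(pid):
    """Return (allowed CPUs, CPU last run on) for a process. Linux only."""
    allowed = sorted(os.sched_getaffinity(pid))
    with open(f"/proc/{pid}/stat") as f:
        return allowed, last_cpu_from_stat(f.read())
```

Calling report(pid) every few seconds for each rank should show whether
the last-used core changes over time. A changing value would suggest the
scheduler is migrating the task; it does not by itself say how much that
costs in run time.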

On Thu, Sep 25, 2008 at 11:54 AM, Lei Chai <chai.15 at osu.edu> wrote:
> Hi Laurence,
>
> By default, MVAPICH uses CPU affinity and tries to use CPUs starting from
> CPU 0. To solve your problem, there are two options:
>
> - Use the VIADEV_CPU_MAPPING environment variable, as you mentioned, and
> map different MPI jobs to different CPU sets.
> - Set the VIADEV_USE_AFFINITY=0 environment variable to disable CPU
> affinity. The OS will then schedule the MPI jobs on different CPUs.
>
> Hope this helps.
>
> Lei
>
>
> Laurence Marks wrote:
>>
>> I think I may have partially, but not completely, resolved my previous
>> problem
>> (http://mail.cse.ohio-state.edu/pipermail/mvapich-discuss/2008-September/001920.html).
>>
>> One of the engineers at the company that sold me the cluster pointed
>> out that the first node running the job (using 8 cores) was swapping a
>> little, even though the MPI job itself did not require swap. I
>> suspect that I/O and other general OS tasks associated with
>> communicating from the first core to all the others were leading to this
>> and causing problems. I can resolve this by putting my head node as the
>> first entry in the machines file; then everything is OK.
>>
>> Unfortunately, this leads to another problem. If I have two MPI jobs
>> both using one core on the head node, instead of using separate cores
>> they both use the same one! I suspect that this is a design feature,
>> i.e. to use the first core unless something else has been specified
>> with VIADEV_CPU_MAPPING or similar. I wonder if there is any way
>> around this short of specifying different mappings for different jobs,
>> which would become a bit of a nightmare since individual users (i.e.
>> my students) would have to get it right. An alternative is running
>> with 7 cores on the first machine to leave some free CPU for OS
>> operations, but this is inefficient.
>>
>>
>
>
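
If per-job mappings turn out to be necessary, the bookkeeping could be
hidden from users in a small wrapper that derives the mapping from a job
slot number. The sketch below is illustrative only. It assumes that
VIADEV_CPU_MAPPING takes a colon-separated list of core numbers and that
the list is applied in order to the ranks local to each node; both points
should be checked against the user guide for the installed MVAPICH
version. The function name and the idea of a per-job "offset" are
hypothetical.

```python
def rotated_mapping(offset, cores_per_node=8):
    """Build a colon-separated core list that starts at `offset` and wraps.

    Assumption: the first rank on a node is bound to the first entry,
    the second rank to the second entry, and so on.
    """
    if cores_per_node < 1:
        raise ValueError("cores_per_node must be at least 1")
    cores = [(offset + i) % cores_per_node for i in range(cores_per_node)]
    return ":".join(str(c) for c in cores)


# Two concurrent jobs would get different starting cores, for example:
#   job A: VIADEV_CPU_MAPPING = rotated_mapping(0)
#   job B: VIADEV_CPU_MAPPING = rotated_mapping(1)
```

Under those assumptions, a job with a single rank on the head node would
land on core `offset` there, while a job with 8 ranks on a compute node
would still cover all 8 cores, just in a rotated order.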



-- 
Laurence Marks
Department of Materials Science and Engineering
MSE Rm 2036 Cook Hall
2220 N Campus Drive
Northwestern University
Evanston, IL 60208, USA
Tel: (847) 491-3996 Fax: (847) 491-7820
email: L-marks at northwestern dot edu
Web: www.numis.northwestern.edu
Chair, Commission on Electron Crystallography of IUCR
www.numis.northwestern.edu/IUCR_CED

