[ASC-Unity] Running on multiple COREs and NODEs

Stewart, Keith stewart.358 at osu.edu
Mon Oct 30 13:25:25 EDT 2023


Unity community,

Efficiency changes were recently made to the Unity scheduler that MAY have affected your multi-core/multi-node jobs.

When submitting parallel (multi-core, multi-node, OMP, MPI) jobs, please use these guidelines and terms.

Know when to use:
--cpus-per-task=#    (cores per task; use for multithreaded jobs, i.e. a single-node, multi-core job)
--ntasks=#           (number of tasks; use for MPI, usually over many nodes)
--ntasks-per-node=#  (MPI tasks per node)
--nodes=#            (number of nodes)

For example, if you need 16 cores:
· you want one process that can use 16 cores for multithreading: --ntasks=1 --cpus-per-task=16 (this should be one of the most common requests)
· you use MPI and do not care where those cores are distributed: --ntasks=16
· you want to launch 16 independent processes (no communication): --ntasks=16
· you want those cores spread across distinct nodes: --ntasks=16 --ntasks-per-node=1, or --ntasks=16 --nodes=16
· you want those cores spread across distinct nodes with no interference from other jobs: --ntasks=16 --nodes=16 --exclusive
· you want 16 processes spread across 8 nodes, two processes per node: --ntasks=16 --ntasks-per-node=2
· you want 16 processes to stay on the same node: --ntasks=16 --ntasks-per-node=16
· you want 4 processes that can each use 4 cores for multithreading: --ntasks=4 --cpus-per-task=4
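
As a rough illustration of the most common request above (one process, 16 cores), here is a minimal batch script sketch. The --time value and the program name my_threaded_program are hypothetical placeholders, and the OMP_NUM_THREADS line assumes an OpenMP program; SLURM_CPUS_PER_TASK is the variable Slurm sets inside the job when --cpus-per-task is given.

```shell
#!/bin/bash
#SBATCH --ntasks=1            # one process
#SBATCH --cpus-per-task=16    # 16 cores for that process's threads
#SBATCH --time=01:00:00       # hypothetical wall-time limit

# Slurm sets SLURM_CPUS_PER_TASK inside the job when --cpus-per-task is given;
# fall back to 1 so the script also runs outside of Slurm.
export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"
echo "Using ${OMP_NUM_THREADS} thread(s)"

# Hypothetical program; replace with your own executable.
# ./my_threaded_program
```

Tying the thread count to SLURM_CPUS_PER_TASK keeps the program from starting more threads than the cores the job was actually given.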

See:
https://stackoverflow.com/questions/51139711/hpc-cluster-select-the-number-of-cpus-and-threads-in-slurm-sbatch
https://support.ceci-hpc.be/doc/_contents/SubmittingJobs/SlurmFAQ.html#Q05

As always, please run "seff $JOBNUM" after your job completes to see how efficiently it used the nodes, cores, and memory you requested. It is very possible to reserve MORE THAN you need; your program may never use the extra memory, cores, or nodes, and those resources are wasted.

Example:

Job ID: 570XXXXXX
Cluster: unity
User/Group: yourname.n/group
State: COMPLETED (exit code 0)
Cores: 1
CPU Utilized: 00:59:38
CPU Efficiency: 99.94% of 00:59:40 core-walltime
Job Wall-clock time: 00:59:40
Memory Utilized: 25.60 MB
Memory Efficiency: 2.50% of 1.00 GB


In this case, a single CPU was used very effectively and memory was over-requested (which is fine in this case).
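
The percentages in the seff output above can be reproduced by hand, which helps when deciding how much to request next time. A small sketch, assuming CPU efficiency is CPU time used divided by core-walltime (cores times wall-clock time) and memory efficiency is memory used divided by memory requested; the function names are made up for illustration:

```shell
#!/bin/bash

# Convert HH:MM:SS to seconds.
hms_to_seconds() {
  echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Print used/available as a percentage with two decimals.
efficiency() {
  awk -v used="$1" -v avail="$2" 'BEGIN { printf "%.2f\n", 100 * used / avail }'
}

cpu_used=$(hms_to_seconds 00:59:38)        # CPU Utilized
core_walltime=$(hms_to_seconds 00:59:40)   # 1 core x 00:59:40 wall-clock

cpu_eff=$(efficiency "$cpu_used" "$core_walltime")
mem_eff=$(efficiency 25.60 1024)           # 25.60 MB used of 1.00 GB (1024 MB)

echo "CPU efficiency:    ${cpu_eff}%"
echo "Memory efficiency: ${mem_eff}%"
```

For a multi-core job the core-walltime grows with the core count, so a 16-core job that keeps only one core busy would show a CPU efficiency of roughly 6%.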

Keith


--
Keith A Stewart,
Senior HPC/Scientific Computing Engineer
College of Arts and Sciences, Arts and Sciences Technology Services
614-688-8291
Help link: service/help request <http://go.osu.edu/asctechhelp>
stewart.358 at osu.edu | osu.edu <http://osu.edu/>
