SLURM

Shaheen uses the Simple Linux Utility for Resource Management (SLURM) system for job scheduling.

Aliases

Since the status and kill commands for SLURM differ from those of PBS, it is useful to set up aliases for common operations, for example in .bashrc:

# Terminate a job when given its job id
alias jkill='scancel'
# Print user's jobs and status
alias jstat='squeue -u $USER'
# Start an interactive session on a compute node
alias jinter='srun -u --pty bash -i'
# Print out jobs that have run since a start date
alias jhist='sacct -u $USER --format="JobID%20,JobName%30,Partition,Account,AllocCPUS,State,ExitCode" -S'

The last alias, jhist, prints jobs that have run since a given start date, which is passed as the final argument (i.e. to sacct's -S flag):

$ jhist 2020-11-13
               JobID                        JobName  Partition    Account  AllocCPUS      State ExitCode
-------------------- ------------------------------ ---------- ---------- ---------- ---------- --------
            16435549 run.FHIST_BGC.f09_d025.084.e03      workq      k1421        768  COMPLETED      0:0
      16435549.batch                          batch                 k1421         64  COMPLETED      0:0
          16435549.0                       cesm.exe                 k1421        384  COMPLETED      0:0
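
The other aliases are used in the obvious way; for example (the job id below is the one from the listing above):

$ jstat            # list your queued and running jobs
$ jkill 16435549   # cancel the job with that id
$ jinter           # open an interactive shell on a compute node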

Job Queue

The standard job queue on Shaheen is workq.
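
As a sketch, a batch script targets this queue via the --partition flag; the job name, node count, and wall time below are placeholders (the account k1421 is the one appearing in the sacct listing above):

#!/bin/bash
#SBATCH --partition=workq
#SBATCH --account=k1421
#SBATCH --job-name=example
#SBATCH --nodes=2
#SBATCH --time=01:00:00

srun ./a.out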

Resource Binding

SLURM is highly configurable with respect to its ability to bind tasks to various resources such as “threads, cores, sockets, NUMA or boards.”
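
Binding can be requested at launch time with srun's --cpu-bind option; a minimal sketch (./a.out is a placeholder executable, and the binding levels actually available depend on the site configuration):

# Bind one task to each physical core and report the chosen binding
$ srun --ntasks=32 --cpu-bind=verbose,cores ./a.out
# Bind at the socket level instead
$ srun --ntasks=2 --cpu-bind=verbose,sockets ./a.out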

Node Configuration

To see the configuration of Shaheen’s Cray XC40 nodes, invoke scontrol show nodes:

$ scontrol show nodes
NodeName=nid07679 Arch=x86_64 CoresPerSocket=16
CPUAlloc=64 CPUTot=64 CPULoad=64.00
AvailableFeatures=(null)
ActiveFeatures=(null)
Gres=craynetwork:4
NodeAddr=nid07679 NodeHostName=nid07679 Version=20.02.6
OS=Linux 4.12.14-150.17_5.0.91-cray_ari_c #1 SMP Wed May 27 02:24:01 UTC 2020 (6b16d42)
RealMemory=128803 AllocMem=128448 FreeMem=123036 Sockets=2 Boards=1
State=ALLOCATED ThreadsPerCore=2 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
Partitions=workq,72hours
BootTime=2020-12-03T10:05:38 SlurmdStartTime=2020-12-03T10:22:30
CfgTRES=cpu=64,mem=128803M,billing=64
AllocTRES=cpu=64,mem=128448M
CapWatts=n/a
CurrentWatts=47 AveWatts=0
ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
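
To inspect a single node rather than the whole machine, pass its name, e.g. the node from the listing above:

$ scontrol show node nid07679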

When counting CPUs per node, the relevant fields are:

  • Sockets=2

  • CoresPerSocket=16

  • ThreadsPerCore=2

Thus cpu = 2 * 16 * 2 = 64, a count that includes both hardware threads (hyperthreads) of each of the 32 physical cores. Should we set --ntasks-per-node=32 (one task per physical core) or --ntasks-per-node=64 (one task per hardware thread)?
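
A sketch of the two choices for a pure-MPI job; --hint=nomultithread/multithread tells SLURM whether to place tasks on physical cores only or on every hardware thread (whether hyperthreads help is application-dependent):

# Option 1: one task per physical core, ignoring hyperthreads
#SBATCH --ntasks-per-node=32
#SBATCH --hint=nomultithread

# Option 2: one task per hardware thread
#SBATCH --ntasks-per-node=64
#SBATCH --hint=multithread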

Default Settings

To see the default settings for SLURM, invoke scontrol show config, which prints all of the settings:

$ scontrol show config
...
DefMemPerCPU            = 2007
...
MaxTasksPerNode         = 512
...
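
Since the full listing runs to hundreds of lines, it is convenient to filter it, e.g.:

$ scontrol show config | grep -E 'DefMemPerCPU|MaxTasksPerNode'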

We see, for example, that DefMemPerCPU = 2007 (reported in megabytes, M). With 64 CPUs per node, the default memory per node is 2007M * 64 = 128448M, which is consistent with AllocMem=128448M in the scontrol show nodes printout above.
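
Jobs that need more than the default can request memory explicitly; a sketch (the values are illustrative, and requests are capped by what the node can allocate):

# Raise the per-CPU memory above the 2007M default
#SBATCH --mem-per-cpu=4000M

# Or request a node's full allocatable memory
#SBATCH --mem=128448M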

Analogs for Cheyenne

While Cheyenne uses PBS rather than SLURM, it is useful to have its settings as well. The analogous command for scontrol show nodes is pbsnodes -a:

$ pbsnodes -a
r8i5n1
Mom = r8i5n1.ib0.cheyenne.ucar.edu
ntype = PBS
state = free
pcpus = 72
resources_available.arch = linux
resources_available.host = r8i5n1
resources_available.iru = r8i5
resources_available.iru2 = r8i4i5
resources_available.mem = 131567260kb
resources_available.ncpus = 72
resources_available.nodetype = largemem
resources_available.Qlist = system,special,ampsrt,capability,premium,regular,standby,economy,small
resources_available.rack = r8
resources_available.rack16 = r1r2r3r4r5r6r7r8r9r10r11r12r13r14r15r16
resources_available.rack2 = r15r16
resources_available.rack4 = r13r14r15r16
resources_available.rack8 = r9r10r11r12r13r14r15r16
resources_available.switch = r8i5a0s0
resources_available.switchblade = r8i5s0
resources_available.vnode = r8i5n1
resources_assigned.accelerator_memory = 0kb
resources_assigned.hbmem = 0kb
resources_assigned.mem = 0kb
resources_assigned.naccelerators = 0
resources_assigned.ncpus = 0
resources_assigned.vmem = 0kb
comment =
resv_enable = True
sharing = default_shared
license = l
last_state_change_time = Mon Dec 21 17:14:09 2020
last_used_time = Mon Dec 21 17:14:09 2020

We haven’t been able to find an exact analog of scontrol show config.
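
A partial substitute may be PBS's qmgr utility, whose print subcommand dumps the server's settings, though we have not checked how closely this matches scontrol show config on Cheyenne:

$ qmgr -c 'print server'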