Configurations
Attention
The outputs below are shown as examples of format rather than content: the latter may change as the computing platform is maintained and modified.
Users are invited to run the commands themselves on the interactive servers to obtain up-to-date information.
User information
There is a distinction between the notions of group and account. The former is the Unix group corresponding to the experiment or collaboration the user is a member of. The latter corresponds to the entity that will be charged for the resources the job uses.
To display all the accounts a user is attached to and the QoS these accounts are allowed to use:
% sacctmgr show user withassoc <username> format=Account,QOS%50
where <username> is the user's login.
Note
More generally, the sacctmgr command allows you to display and modify all the information related to the accounts. For more details on the command, please refer to its help: sacctmgr -h.
The only active account by default is the one associated with the user's main group. For confirmation, or to switch from one default account to another, please refer to the syntax suggested in Account management to temporarily change the main group. To submit on a different account without modifying the main group, use the -A | --account= option.
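For example, to check your current default account and, if needed, submit a job on a different one, you may use the commands below (a minimal sketch; <account> and <job_script> are placeholders following the same convention as <username>):
% sacctmgr show user <username> format=User,DefaultAccount
% sbatch -A <account> <job_script>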
Partitions
A partition is a computational resource grouping nodes into a single logical entity defined by one or more shared characteristics (whether physical or resource-related).
To get a quick overview of the different partitions, you may use the sinfo command:
% sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
htc* up infinite 268 mix ccwslurm[...]
htc_arm up infinite 4 idle ccwslurma[0003-0006]
htc_interactive up infinite 2 idle ccwislurm[0001-0002]
htc_highmem up infinite 1 mix ccwmslurm0001
gpu_v100 up infinite 16 idle ccwgslurm[0100-0115]
gpu_v100_interactive up infinite 1 mix ccwgislurm0100
gpu_h100 up infinite 3 idle ccwgslurm[0200-0202]
gpu_h100_interactive up infinite 1 mix ccwgislurm0200
hpc up infinite 8 mix ccwpslurm[0017-0024]
flash up infinite 1 mix ccwslurm0001
htc_daemon up infinite 1 mix ccwslurm0001
There are three major distinct partitions, htc, hpc and gpu_*, as well as their equivalents for interactive jobs: htc_interactive, hpc_interactive and gpu_*_interactive. Each of these partitions corresponds to one of the three computing platforms described on the computing platform page. The difference between the gpu_* partitions is explained in the GPU job examples.
- The flash partition dedicates a whole node to job testing and debugging. This partition is limited to 1 hour by its QoS.
- The htc_highmem partition is dedicated to jobs that need a very large amount of memory and is allowed a higher memory limit per job.
- The htc_daemon partition generally allows you to run monitoring or orchestrating jobs: very long, but limited in resources. This partition is limited by its QoS to 10 jobs per user.
- The htc_arm partition allows job submission on ARM processors. It is limited in the same way as the htc partition.
- The hpc partition allows job submission on the servers connected via InfiniBand (see Parallel job example).
Note
In short, single-core and multi-core jobs are executed in the htc partition, parallel jobs using InfiniBand in the hpc partition, and access to the GPUs is done through one of the gpu_* partitions. Access to GPU resources is restricted and depends on the resource requests made by your computing group. Please contact user support for any additional information.
Details on submission resource limitations are described in the Required parameter limits paragraph.
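As an illustrative sketch, an interactive shell on one of the interactive partitions can typically be requested with srun; this is a generic Slurm idiom and the exact recommended invocation may differ on the platform:
% srun --partition=htc_interactive --pty bash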
To display a partition's detailed configuration, you may use scontrol:
% scontrol show partition
PartitionName=htc
AllowGroups=ALL AllowAccounts=ALL AllowQos=normal,nomemlimit,dask
AllocNodes=ALL Default=YES QoS=htc
DefaultTime=NONE DisableRootJobs=YES ExclusiveUser=NO ExclusiveTopo=NO GraceTime=0 Hidden=NO
MaxNodes=1 MaxTime=UNLIMITED MinNodes=0 LLN=YES MaxCPUsPerNode=UNLIMITED MaxCPUsPerSocket=UNLIMITED
NodeSets=htc
Nodes=ccwslurm[0002-0142,0168-0215,0312-0367,2042-2064]
PriorityJobFactor=1 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=NO
OverTimeLimit=NONE PreemptMode=OFF
State=UP TotalCPUs=21312 TotalNodes=268 SelectTypeParameters=NONE
JobDefaults=(null)
DefMemPerCPU=1024 MaxMemPerNode=UNLIMITED
TRES=cpu=21312,mem=87344000M,node=268,billing=21312
[...]
The command gives the main characteristics of the partitions:
- authorized groups (AllowGroups) and accounts (AllowAccounts),
- the default (QoS) and associated (AllowQos) qualities of service,
- the available resources and their limits in the partition.
Note
In practice, when submitting a job, we can specify the partition and the account to use with the options --partition and --account respectively. Without any specification, Slurm will opt for htc (the default partition) and the user’s main account.
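As a minimal sketch (the program name, account and resource values are hypothetical), the same choices can also be expressed as #SBATCH directives inside a batch script:
#!/bin/bash
#SBATCH --partition=htc        # target partition (htc is the default)
#SBATCH --account=<account>    # account to be charged
#SBATCH --time=01:00:00        # requested wall-clock time
#SBATCH --mem=2G               # requested memory
#SBATCH --job-name=example

./my_program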
Nodes
Nodes are the physical machines hosting the computing resources such as CPU and memory. To obtain detailed information about a node on the computing platform, use the command below (example with the node ccwslurm0002; without this specification, the command gives the same information for every node on the platform):
% scontrol show node ccwslurm0002
NodeName=ccwslurm0002 Arch=x86_64 CoresPerSocket=1
CPUAlloc=36 CPUEfctv=64 CPUTot=64 CPULoad=25.59
AvailableFeatures=htc
ActiveFeatures=htc
Gres=(null)
NodeAddr=ccwslurm0002 NodeHostName=ccwslurm0002 Version=25.05.3
OS=Linux 5.14.0-570.58.1.el9_6.x86_64 #1 SMP PREEMPT_DYNAMIC Tue Oct 21 04:15:07 EDT 2025
RealMemory=192000 AllocMem=176978 FreeMem=78522 Sockets=64 Boards=1
MemSpecLimit=6000
State=MIXED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
Partitions=htc
BootTime=2025-11-02T13:18:47 SlurmdStartTime=2025-11-25T10:52:59
LastBusyTime=2025-11-02T17:00:05 ResumeAfterTime=None
CfgTRES=cpu=64,mem=187.50G,billing=64
AllocTRES=cpu=36,mem=176978M
CurrentWatts=0 AveWatts=0
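For a more compact per-node summary (node name, CPU count, memory in MB and features), a possible alternative using standard sinfo format options is:
% sinfo -N -o "%N %c %m %f"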
Attention
The fields CPUTot and RealMemory give the node's total CPU and memory hardware limits respectively. Make sure you do not exceed these limits when submitting your jobs. As a rule of thumb, if a job requires more than 200G of memory, it should be submitted on the htc_highmem partition.
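As a hedged illustration (the memory value is hypothetical and <job_script> is a placeholder), such a high-memory job could be submitted with:
% sbatch --partition=htc_highmem --mem=250G <job_script>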
For an overview of all the resources available on the computing platform, please refer to the Information on computing platform resources page.
Quality of service
The quality of service (QoS) is a rule associated with a partition or a job that alters its behaviour. It can, for example, modify the priority of a job or limit the allocated resources. The scontrol command shown in the Partitions paragraph also allows you to view the QoS implemented on a given partition.
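For example, to display only the QoS-related fields of the htc partition, one possible one-liner (a generic Slurm sketch, not a platform-specific recommendation) is:
% scontrol show partition htc | grep -i qos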
In order to list the available QoS, you may use the command sacctmgr:
% sacctmgr show qos format=Name,Priority,MaxWall,MaxSubmitPU,MaxTRES
Name Priority MaxWall MaxSubmitPU MaxTRES
---------- ---------- ----------- ----------- -------------
normal 0 7-00:00:00 5000
flash 0 01:00:00 10 mem=150G
gpu 0 7-00:00:00 100
daemon 0 90-00:00:00 10 cpu=1,mem=16G
dask 1000 2-00:00:00
htc 0 mem=150G
interacti+ 0 7-00:00:00 4
Here, we have restricted the output to only the name, priority, execution time, limit of submitted jobs per user and limit of trackable resources (TRES) fields using the format option. The output can be read in the following way:
- normal (used with the htc, htc_arm, htc_highmem and hpc partitions) limits the job execution time to a maximum of 7 days,
- htc (used only with the htc partition) complements normal by limiting the requested memory to 150G (see Required parameter limits),
- gpu (used with the gpu_* and gpu_*_interactive partitions) limits to 100 the number of simultaneous jobs per user,
- interactive (used with the *_interactive partitions) limits to 4 the number of interactive sessions per user,
- flash (used only with the flash partition) limits the execution time to 1 hour for up to 10 simultaneous jobs per user. The requested memory is limited to 150G as for the htc QoS,
- daemon (used with the htc_daemon partition) is suitable for executing light processes that need to run for a long time. It is limited to 10 simultaneous jobs per user, and to 1 CPU and 16G of memory per job (see Required parameter limits),
- dask (used in the context of the Jupyter platform) limits the execution time to 2 days. This QoS is allowed in the htc partition, as you may see in the scontrol output in the Partitions paragraph. For the time limits of Dask jobs, please refer to the Dask parameters paragraph.
Note
As a result, upon submission you simply need to set the partition; the appropriate QoS will be set automatically.
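To verify which QoS was actually attached to your jobs, a possible check using squeue's standard format fields (job id, partition, QoS and state) is:
% squeue -u $USER -o "%.18i %.12P %.10q %.8T"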