Configurations

Attention

The outputs below illustrate the format rather than the content: the latter may change following computing platform maintenance operations.

Users are invited to run the commands themselves on the interactive servers to obtain up-to-date information.

User information

There is a distinction between the notions of group and account. The former is the Unix group corresponding to the experiment or collaboration the user belongs to. The latter corresponds to the entity that will be charged for the resources a job uses.

To display all the accounts a user is attached to, and the QoS each account is allowed to use:

% sacctmgr show user withassoc <username> format=Account,QOS%50

where <username> is the user's login id.

Note

More generally, the sacctmgr command allows you to display and modify all the information related to accounts. For more details, please refer to the command help: sacctmgr -h.

The only active account will be the one set on the user's main group. To confirm this, or to switch from one default account to another, please refer to the syntax suggested in Account management to temporarily change the main group. To submit under a different account without modifying the main group, use the -A | --account= option.
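For illustration (the account name myexperiment and the script job.sh are hypothetical), a submission under a secondary account could look like:

```shell
# Submit under the account "myexperiment" without changing the main group;
# -A is the short form of --account=
sbatch --account=myexperiment job.sh
sbatch -A myexperiment job.sh
```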

Partitions

A partition is a logical entity grouping nodes that share one or more given characteristics (whether physical or resource-related).

To get a quick overview of the different partitions, you may use the sinfo command:

% sinfo
PARTITION            AVAIL  TIMELIMIT  NODES  STATE NODELIST
htc*                    up   infinite    268    mix ccwslurm[...]
htc_arm                 up   infinite      4   idle ccwslurma[0003-0006]
htc_interactive         up   infinite      2   idle ccwislurm[0001-0002]
htc_highmem             up   infinite      1    mix ccwmslurm0001
gpu_v100                up   infinite     16   idle ccwgslurm[0100-0115]
gpu_v100_interactive    up   infinite      1    mix ccwgislurm0100
gpu_h100                up   infinite      3   idle ccwgslurm[0200-0202]
gpu_h100_interactive    up   infinite      1    mix ccwgislurm0200
hpc                     up   infinite      8    mix ccwpslurm[0017-0024]
flash                   up   infinite      1    mix ccwslurm0001
htc_daemon              up   infinite      1    mix ccwslurm0001

There are three major distinct partitions: htc, hpc and gpu_*, as well as their equivalents for interactive jobs: htc_interactive, hpc_interactive and gpu_*_interactive. Each of these partitions corresponds to one of the three computing platforms described on the computing platform page. The differences between the gpu_* partitions are explained in the GPU job examples.

  • The flash partition dedicates a whole node to job testing and debugging. This partition is limited to 1 hour by its QoS.

  • The htc_highmem partition is dedicated to jobs that need a large amount of memory and allows a higher memory limit per job.

  • The htc_daemon partition is meant for monitoring or orchestration jobs: very long-running, but limited in resources. This partition is limited by its QoS to 10 jobs per user.

  • The htc_arm partition allows job submission on ARM processors. It has the same limits as the htc partition.

  • The hpc partition allows job submission on the servers interconnected with InfiniBand (see Parallel job example).
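As a sketch (the task count, time limit and binary name are placeholders), a parallel submission to the hpc partition might look like:

```shell
#!/bin/bash
#SBATCH --partition=hpc        # InfiniBand-interconnected nodes
#SBATCH --ntasks=16            # placeholder task count
#SBATCH --time=01:00:00

srun ./my_mpi_program          # hypothetical MPI binary
```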

Note

Put simply: single-core and multi-core jobs run in the htc partition, parallel jobs using InfiniBand run in the hpc partition, and access to the GPUs goes through one of the gpu_* partitions. Access to GPU resources is restricted and depends on the resource requests made by your computing group. Please contact user support for any additional information.

Details on submission resource limitations are described in the Required parameter limits paragraph.

To display and read a partition's detailed configuration, you may use scontrol:

% scontrol show partition
PartitionName=htc
   AllowGroups=ALL AllowAccounts=ALL AllowQos=normal,nomemlimit,dask
   AllocNodes=ALL Default=YES QoS=htc
   DefaultTime=NONE DisableRootJobs=YES ExclusiveUser=NO ExclusiveTopo=NO GraceTime=0 Hidden=NO
   MaxNodes=1 MaxTime=UNLIMITED MinNodes=0 LLN=YES MaxCPUsPerNode=UNLIMITED MaxCPUsPerSocket=UNLIMITED
   NodeSets=htc
   Nodes=ccwslurm[0002-0142,0168-0215,0312-0367,2042-2064]
   PriorityJobFactor=1 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=NO
   OverTimeLimit=NONE PreemptMode=OFF
   State=UP TotalCPUs=21312 TotalNodes=268 SelectTypeParameters=NONE
   JobDefaults=(null)
   DefMemPerCPU=1024 MaxMemPerNode=UNLIMITED
   TRES=cpu=21312,mem=87344000M,node=268,billing=21312

 [...]

The command gives the main characteristics of each partition:
  • authorized groups (AllowGroups) and accounts (AllowAccounts),

  • the default (QoS) and associated (AllowQos) qualities of service,

  • the available resources and their limits in the partition.

Note

In practice, when submitting a job, you can specify the partition and the account to use with the --partition and --account options respectively. If neither is specified, Slurm defaults to htc (the default partition) and to the user's main account.
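For example (the account name myexperiment and the script job.sh are hypothetical), both options can be set on the command line:

```shell
# Batch submission with an explicit partition and account:
sbatch --partition=htc --account=myexperiment job.sh

# Interactive session on the dedicated interactive partition:
srun --partition=htc_interactive --account=myexperiment --pty bash
```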

Nodes

Nodes are the physical machines hosting the computing resources such as CPU and memory. To obtain detailed information about any node on the computing platform, use the command below (example with the node ccwslurm0002; without a node name, the command gives the same amount of information for every node on the platform):

% scontrol show node ccwslurm0002
NodeName=ccwslurm0002 Arch=x86_64 CoresPerSocket=1
   CPUAlloc=36 CPUEfctv=64 CPUTot=64 CPULoad=25.59
   AvailableFeatures=htc
   ActiveFeatures=htc
   Gres=(null)
   NodeAddr=ccwslurm0002 NodeHostName=ccwslurm0002 Version=25.05.3
   OS=Linux 5.14.0-570.58.1.el9_6.x86_64 #1 SMP PREEMPT_DYNAMIC Tue Oct 21 04:15:07 EDT 2025
   RealMemory=192000 AllocMem=176978 FreeMem=78522 Sockets=64 Boards=1
   MemSpecLimit=6000
   State=MIXED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=htc
   BootTime=2025-11-02T13:18:47 SlurmdStartTime=2025-11-25T10:52:59
   LastBusyTime=2025-11-02T17:00:05 ResumeAfterTime=None
   CfgTRES=cpu=64,mem=187.50G,billing=64
   AllocTRES=cpu=36,mem=176978M
   CurrentWatts=0 AveWatts=0

Attention

The fields CPUTot and RealMemory give the node's total CPU and memory hardware limits respectively. Make sure you do not exceed these limits when submitting your jobs. As a rule of thumb, a job requesting more than 200G of memory should be submitted to the htc_highmem partition.
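As a sketch (the memory amount, time limit and script name are placeholders), a large-memory job could be submitted as:

```shell
#!/bin/bash
#SBATCH --partition=htc_highmem
#SBATCH --mem=250G             # above the ~200G rule of thumb for htc nodes
#SBATCH --time=12:00:00

./memory_hungry_job.sh         # hypothetical script
```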

For an overview of all the resources available on the computing platform, please refer to the Information on computing platform resources page.

Quality of service

The quality of service (QoS) is a set of rules associated with a partition or a job that alters its behavior. It can, for example, modify the priority of a job, or limit the allocated resources. The scontrol command shown in the Partitions paragraph also allows you to view the QoS implemented on a given partition.

In order to list the available QoS, you may use the command sacctmgr:

% sacctmgr show qos format=Name,Priority,MaxWall,MaxSubmitPU,MaxTRES
      Name   Priority     MaxWall MaxSubmitPU       MaxTRES
---------- ---------- ----------- ----------- -------------
    normal          0  7-00:00:00        5000
     flash          0    01:00:00          10      mem=150G
       gpu          0  7-00:00:00         100
    daemon          0 90-00:00:00          10 cpu=1,mem=16G
      dask       1000  2-00:00:00
       htc          0                              mem=150G
interacti+          0  7-00:00:00           4

Here, the output has been restricted, using the format option, to the name, priority, maximum execution time, maximum number of submitted jobs per user, and trackable resources (TRES) limit fields. The output reads as follows:

  • normal (used with the htc, htc_arm, htc_highmem and hpc partitions) limits a job's execution time to a maximum of 7 days,

  • htc (used only with htc partitions) complements normal by limiting the requested memory to 150G (see Required parameter limits),

  • gpu (used with the gpu_* and gpu_*_interactive partitions) limits the number of simultaneous jobs per user to 100,

  • interactive (used with the *_interactive partitions) limits the number of interactive sessions per user to 4,

  • flash (used only with the flash partition) limits the execution time to 1 hour for up to 10 simultaneous jobs per user. The requested memory is limited to 150G, as for the htc QoS,

  • daemon (used with the htc_daemon partition) is suited to light processes that need to run for a long time. It is limited to 10 simultaneous jobs per user, and to 1 CPU and 16G of memory per job (see Required parameter limits),

  • dask (used in the context of the Jupyter platform) limits the execution time to 2 days. This QoS is allowed in the htc partition, as shown in the scontrol output in the Partitions paragraph. For the time limits of Dask jobs, please refer to the Dask parameters paragraph.

Note

In practice, simply set the partition at submission time: the corresponding QoS will be applied automatically.
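For instance (the script name debug_job.sh is hypothetical), a short debugging job would only specify the flash partition; the 1-hour and 150G limits of the flash QoS then apply automatically:

```shell
sbatch --partition=flash --time=00:30:00 debug_job.sh
```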