PERUN Supercomputer – Partitions Overview¶
About PERUN Partitions
PERUN uses two main partitions: CPU and GPU.
Each job submitted to Slurm must specify one of these partitions; if none is given, the default partition applies.
1. What Are Slurm Partitions?¶
A partition in Slurm represents a group of compute nodes with similar characteristics or usage rules.
Partitions define:
- hardware constraints (CPU cores, GPUs, memory)
- job limitations (max runtime, max cores, number of nodes)
- resource availability and priority
- access to specialized hardware (e.g., GPU nodes)
Tip — Always choose the correct partition
CPU workloads should run in the cpu partition.
GPU or AI workloads must run in the gpu partition.
2. Available PERUN Partitions¶
PERUN defines two primary partitions:
| Partition | Nodes | Time Limit | Max Job Size | GPUs | Purpose |
|---|---|---|---|---|---|
| cpu | CPU compute nodes | unlimited | Up to system limits | 0 | Standard HPC workloads |
| gpu | GPU nodes (H200) | unlimited | Up to 8 GPUs per node | 8 per node | AI, ML, GPU-accelerated simulations |
3. Viewing Partition Information¶
You can inspect partitions with:
Basic Slurm overview¶
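For a quick overview of all partitions and the current state of their nodes, run:

```shell
# One line per partition: availability, time limit, node counts, and states
sinfo
```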
Detailed partition definitions¶
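To see the full Slurm configuration of a partition (limits, node lists, defaults), use `scontrol`:

```shell
# Show the configuration of every partition
scontrol show partition

# Or inspect a single partition, e.g. the gpu partition
scontrol show partition gpu
```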
Example sinfo Output Snippet
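An illustrative snippet of what `sinfo` might print (node names and counts below are placeholders, not PERUN's actual values):

```text
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
cpu*         up   infinite     12   idle cn[01-12]
gpu          up   infinite      2    mix gn[01-02]
```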
Note
Node status values:
- idle → ready to run jobs
- alloc → currently running jobs
- mix → partially allocated
- down/drain → node unavailable
4. Choosing the Right Partition¶
Use the cpu partition when:¶
- running multi-core CPU jobs
- performing scientific simulations
- running general HPC workloads
Use the gpu partition when:¶
- training machine learning / deep learning models
- running GPU-accelerated workloads (CUDA, PyTorch, TensorFlow)
- requiring NVIDIA H200 performance
Warning — GPU misuse
Jobs without GPU requirements should not run on the GPU partition.
5. Submitting Jobs to a Partition¶
CPU job example¶
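A minimal sketch of a CPU batch script; the job name, resource sizes, walltime, and executable are placeholders to adapt to your workload:

```shell
#!/bin/bash
#SBATCH --job-name=cpu_test       # placeholder job name
#SBATCH --partition=cpu           # run in the CPU partition
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8         # example: 8 cores for one task
#SBATCH --mem=16G                 # example memory request
#SBATCH --time=02:00:00           # request only the walltime you need

srun ./my_cpu_program             # placeholder executable
```

Submit the script with `sbatch cpu_job.sh` (filename is a placeholder).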
GPU job example¶
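A minimal sketch of a GPU batch script; note the explicit `--gres` request (all sizes and the script name are placeholders):

```shell
#!/bin/bash
#SBATCH --job-name=gpu_test       # placeholder job name
#SBATCH --partition=gpu           # run in the GPU partition
#SBATCH --nodes=1
#SBATCH --gres=gpu:1              # request 1 of the node's 8 H200 GPUs
#SBATCH --cpus-per-task=16        # example CPU allocation alongside the GPU
#SBATCH --mem=64G                 # example memory request
#SBATCH --time=04:00:00

srun python train.py              # placeholder training script
```

Request only as many GPUs as your code can actually use; unused GPUs in a job block other users' work.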
Important
Failing to specify --gres=gpu:<num> in the GPU partition will result in no GPUs being allocated.
6. Walltime and Efficiency¶
Use seff to check job efficiency
After a job completes, run:
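For example:

```shell
# Replace <jobid> with the ID reported by sbatch or shown in squeue
seff <jobid>
```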
This reports:
- CPU and memory efficiency
- actual elapsed time
- estimated resource usage
Why walltime matters¶
- Jobs with too-high time limits wait longer in the queue.
- Shorter jobs are often scheduled earlier.
- Improper walltime estimates decrease cluster efficiency.
Efficient walltime use
If your job usually finishes in 3 hours, do not request 24 hours.
7. Summary¶
- PERUN provides two main Slurm partitions: cpu and gpu.
- Each has different resource limits and intended workloads.
- Correct partition selection improves job scheduling and cluster efficiency.
- Use sinfo and scontrol to inspect resources.
- Always specify GPUs explicitly when using the gpu partition.