PERUN Supercomputer – Partitions Overview¶
About PERUN Partitions
PERUN uses two main partitions: CPU and GPU.
Each job submitted to Slurm must specify one of these partitions; if none is given, the default partition applies.
1. What Are Slurm Partitions?¶
A partition in Slurm represents a group of compute nodes with similar characteristics or usage rules.
Partitions define:
- hardware constraints (CPU cores, GPUs, memory)
- job limitations (max runtime, max cores, number of nodes)
- resource availability and priority
- access to specialized hardware (e.g., GPU nodes)
Tip — Always choose the correct partition
CPU workloads should run in the cpu partition.
GPU or AI workloads must run in the gpu partition.
2. Available PERUN Partitions¶
PERUN defines two primary partitions:
| Partition | Nodes | Time Limit | Max Job Size | GPUs | Purpose |
|---|---|---|---|---|---|
| cpu | CPU compute nodes | unlimited | Up to system limits | 0 | Standard HPC workloads |
| gpu | GPU nodes (H200) | unlimited | Up to 8 GPUs per node | 8 per node | AI, ML, GPU-accelerated simulations |
3. Viewing Partition Information¶
You can inspect partitions with:
Basic Slurm overview¶
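For a quick overview of all partitions and the current state of their nodes, run:

```shell
# One line per partition: availability, time limit, node counts, and states
sinfo
```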
Detailed partition definitions¶
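To see the full Slurm configuration of a partition (limits, node lists, defaults), use `scontrol`:

```shell
# Show the configuration of every partition
scontrol show partition

# Or inspect a single partition, e.g. the gpu partition
scontrol show partition gpu
```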
Example sinfo Output Snippet
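An illustrative snippet of what `sinfo` might print (node names and counts below are placeholders, not PERUN's actual values):

```text
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
cpu*         up   infinite     12   idle cn[01-12]
gpu          up   infinite      2    mix gn[01-02]
```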
Note
Node status values:
- idle → ready to run jobs
- alloc → currently running jobs
- mix → partially allocated
- down/drain → node unavailable
4. Choosing the Right Partition¶
Use the cpu partition when:¶
- running multi-core CPU jobs
- performing scientific simulations
- running general HPC workloads
Use the gpu partition when:¶
- training machine learning / deep learning models
- running GPU-accelerated workloads (CUDA, PyTorch, TensorFlow)
- requiring NVIDIA H200 performance
Warning — GPU misuse
Jobs without GPU requirements should not run on the GPU partition.
5. Submitting Jobs to a Partition¶
CPU job example¶
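A minimal sketch of a CPU batch script; the job name, resource sizes, walltime, and executable are placeholders to adapt to your workload:

```shell
#!/bin/bash
#SBATCH --job-name=cpu_test       # placeholder job name
#SBATCH --partition=cpu           # run in the CPU partition
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8         # example: 8 cores for one task
#SBATCH --mem=16G                 # example memory request
#SBATCH --time=02:00:00           # request only the walltime you need

srun ./my_cpu_program             # placeholder executable
```

Submit the script with `sbatch cpu_job.sh` (filename is a placeholder).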
GPU job example¶
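A minimal sketch of a GPU batch script; note the explicit `--gres` request (all sizes and the script name are placeholders):

```shell
#!/bin/bash
#SBATCH --job-name=gpu_test       # placeholder job name
#SBATCH --partition=gpu           # run in the GPU partition
#SBATCH --nodes=1
#SBATCH --gres=gpu:1              # request 1 of the node's 8 H200 GPUs
#SBATCH --cpus-per-task=16        # example CPU allocation alongside the GPU
#SBATCH --mem=64G                 # example memory request
#SBATCH --time=04:00:00

srun python train.py              # placeholder training script
```

Request only as many GPUs as your code can actually use; unused GPUs in a job block other users' work.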
Important
Failing to specify --gres=gpu:<num> in the GPU partition will result in no GPUs being allocated.
6. Walltime and Efficiency¶
Use seff to check job efficiency
After a job completes, run:
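For example:

```shell
# Replace <jobid> with the ID reported by sbatch or shown in squeue
seff <jobid>
```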
This reports:
- CPU and memory efficiency
- actual elapsed time
- estimated resource usage
Why walltime matters¶
- Jobs with too-high time limits wait longer in the queue.
- Shorter jobs are often scheduled earlier.
- Improper walltime estimates decrease cluster efficiency.
Efficient walltime use
If your job usually finishes in 3 hours, do not request 24 hours.
7. Summary¶
- PERUN provides two main Slurm partitions: cpu and gpu.
- Each has different resource limits and intended workloads.
- Correct partition selection improves job scheduling and cluster efficiency.
- Use sinfo and scontrol to inspect resources.
- Always specify GPUs explicitly when using the gpu partition.