Conda Guide – PERUN Supercomputer¶
What is Conda?
Conda is an open‑source package and environment manager used to install software, manage dependencies, and create isolated environments for different projects. Although originally built for Python, Conda can distribute software written in any language.
1. Conda Distributions¶
There are two primary Conda distributions:
| Distribution | Description |
|---|---|
| Miniconda | Minimal installer containing only Conda, Python, and a small set of supporting packages. Recommended for HPC environments. |
| Anaconda | Includes Conda, a GUI, and hundreds of preinstalled packages. Not recommended on PERUN due to size. |
Recommendation
On HPC systems like PERUN, use Miniconda to avoid unnecessary package bloat.
2. Installing Miniconda¶
Install Miniconda into ~/miniconda3:
mkdir -p ~/miniconda3
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh
bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
Initialize Conda:
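A minimal sketch of the initialization step, assuming the `~/miniconda3` prefix from the install commands above and a bash login shell (substitute `zsh` or another shell name if that is what you use):

```shell
# Assumes Miniconda was installed to ~/miniconda3 as in the step above.
CONDA_HOME="$HOME/miniconda3"

if [ -x "$CONDA_HOME/bin/conda" ]; then
    # Adds the Conda startup block to ~/.bashrc so "conda activate" works.
    "$CONDA_HOME/bin/conda" init bash
else
    echo "No conda executable under $CONDA_HOME; run the installer first." >&2
fi
```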
Restart the terminal to apply changes.
Important
Do not install Miniconda into system directories on PERUN.
Always install it into your home directory.
3. Conda Channels¶
Conda channels are sources from which packages are downloaded.
Default Channels¶
By default, Conda uses the defaults channel maintained by Anaconda Inc.
Adding Channels¶
Add a channel permanently:
Set channel priority:
View configured channels:
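The three steps above can be sketched as follows. The choice of conda-forge as the example channel is an assumption; any channel name from the table below works the same way.

```shell
CHANNEL="conda-forge"   # example channel; an assumption for illustration

if command -v conda >/dev/null 2>&1; then
    # Add the channel to ~/.condarc; --add puts it at the top of the list.
    conda config --add channels "$CHANNEL"
    # Prefer higher-priority channels even when a lower one has a newer build.
    conda config --set channel_priority strict
    # Print the channels that are currently configured.
    conda config --show channels
else
    echo "conda is not on PATH; initialize it first." >&2
fi
```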
Installing from Specific Channels¶
Install a package from a specific channel:
Best Practice
Prefer the conda-forge channel: its packages are usually more up to date, and taking all packages from one channel reduces the risk of binary incompatibilities between channels.
Common Channels¶
| Channel | Description |
|---|---|
| defaults | Official Anaconda repository |
| conda-forge | Community-maintained, most comprehensive |
| bioconda | Bioinformatics packages |
| pytorch | PyTorch and related packages |
| nvidia | CUDA and GPU-accelerated packages |
4. Managing Conda Environments¶
Creating Environments¶
Create a new environment:
Create with specific Python version:
Create with packages:
Create with specific package versions:
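The four variants above might look like this. The environment names and version numbers are hypothetical examples, and `--yes` is added only so the commands do not stop for confirmation.

```shell
PYTHON_VERSION="3.11"   # example version; an assumption for illustration

if command -v conda >/dev/null 2>&1; then
    # Empty environment with no packages
    conda create --yes --name myenv
    # Environment with a specific Python version
    conda create --yes --name py311 "python=$PYTHON_VERSION"
    # Environment with Python plus additional packages
    conda create --yes --name analysis "python=$PYTHON_VERSION" numpy pandas
    # Environment with pinned package versions
    conda create --yes --name pinned "python=$PYTHON_VERSION" numpy=1.24 pandas=2.0
else
    echo "conda is not on PATH; initialize it first." >&2
fi
```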
Activating and Deactivating¶
Activate environment:
Deactivate environment:
Base Environment
Once Conda is initialized, every new terminal starts with the base environment active unless auto-activation has been disabled.
Listing Environments¶
List all environments:
Cloning Environments¶
Clone an existing environment:
Removing Environments¶
Remove an environment:
Why Use Environments?
Each environment is isolated, preventing version conflicts and dependency issues common in Python projects.
5. Creating Environments from YAML¶
Creating from YAML¶
Create environment from file:
Example YAML file (my-env.yml):
name: myproject
channels:
  - conda-forge
  - defaults
dependencies:
  - python=3.11
  - numpy=1.24
  - pandas=2.0
  - matplotlib
  - scikit-learn
  - pip
  - pip:
      - some-pip-only-package
Exporting Environments¶
Export an environment:
Export only explicitly installed packages (recommended):
Export to a specific file:
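A sketch of the three export variants, assuming an existing environment called `myenv` (a hypothetical name) and output file names chosen for illustration:

```shell
ENV_NAME="myenv"   # hypothetical; the environment must already exist

if command -v conda >/dev/null 2>&1; then
    # Full export: every installed package with exact versions and builds.
    conda env export --name "$ENV_NAME" > environment-full.yml
    # Only the packages that were explicitly requested at install time.
    conda env export --name "$ENV_NAME" --from-history > environment.yml
    # Write to a named file directly instead of redirecting stdout.
    conda env export --name "$ENV_NAME" --file "$ENV_NAME.yml"
else
    echo "conda is not on PATH; initialize it first." >&2
fi
```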
Example Use Case
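Sharing a YAML file lets collaborators rebuild the same environment. A full export pins exact versions and builds but is tied to the platform it was created on; an export of only the explicitly installed packages is more portable but leaves the remaining versions to the solver.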
6. Installing Packages¶
Basic Installation¶
Install into the active environment:
Install into a specific environment:
Install multiple packages:
Version-Specific Installation¶
Install specific version:
Install version range:
Install latest compatible version:
Installing from Channels¶
Install from conda-forge:
Install from multiple channels:
Searching for Packages¶
Search for packages:
Search in specific channel:
Search with wildcards:
Listing Installed Packages¶
List packages in active environment:
List packages in specific environment:
List packages matching pattern:
Updating Packages¶
Update a specific package:
Update all packages:
Update Conda itself:
Removing Packages¶
Remove a package:
Remove from specific environment:
Avoid Installing Into Base
The base Conda environment can easily break when many packages are installed.
Always create project‑specific environments.
7. Mixing Conda and Pip¶
When to Use Pip¶
Use pip for packages not available in Conda:
Best Practices¶
- Always install Conda packages first, then pip packages
- Include pip in environment when creating:
- Use requirements.txt for pip packages:
Mixed Environment Example¶
name: mixed-env
channels:
  - conda-forge
dependencies:
  - python=3.11
  - numpy
  - pandas
  - pip
  - pip:
      - tensorflow
      - transformers
Important
pip does not track Conda's dependency metadata, so packages installed with pip can overwrite or conflict with Conda-managed packages.
Prefer Conda packages when available, and install any pip packages last.
8. Working with Python Versions¶
Installing Different Python Versions¶
Create environment with specific Python:
conda create --name py39 python=3.9
conda create --name py311 python=3.11
conda create --name py312 python=3.12
Updating Python Version¶
Update Python in existing environment:
Checking Python Version¶
9. HPC-Specific Considerations¶
Storage Quota Management¶
Conda environments can consume significant disk space. Monitor your usage:
Installing in Non-Standard Locations¶
If home directory quota is limited, install to scratch:
# Install Miniconda to scratch
bash Miniconda3-latest-Linux-x86_64.sh -b -p /scratch/username/miniconda3
# Create environments in custom location
conda create --prefix /scratch/username/envs/myenv python=3.11
Activate prefix-based environment:
Using Conda in Job Scripts¶
Example SLURM job script:
#!/bin/bash
#SBATCH --job-name=my_conda_job
#SBATCH --time=01:00:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=8G
# Initialize Conda
source ~/miniconda3/etc/profile.d/conda.sh
# Activate environment
conda activate myenv
# Run your Python script
python my_script.py
# Deactivate when done
conda deactivate
Parallel Package Installation¶
Speed up package installation with multiple threads:
Offline Package Installation¶
Download packages on the login node, then install them on compute nodes:
# Download packages
conda install --download-only numpy
# Later, install without downloading
conda install --offline numpy
10. Conda Configuration¶
View Current Configuration¶
Common Configuration Options¶
Disable auto-activation of base environment:
Set default channels:
Change package cache location:
Increase solver timeout for complex environments:
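The first three options above can be sketched with `conda config`. The cache path under `/scratch/username` is a hypothetical placeholder that follows the scratch layout used earlier in this guide.

```shell
PKGS_DIR="/scratch/username/conda_pkgs"   # hypothetical cache location

if command -v conda >/dev/null 2>&1; then
    # Do not activate base automatically in new shells.
    conda config --set auto_activate_base false
    # Put conda-forge at the top of the channel list.
    conda config --add channels conda-forge
    # Store downloaded and extracted packages outside the home directory.
    conda config --add pkgs_dirs "$PKGS_DIR"
else
    echo "conda is not on PATH; initialize it first." >&2
fi
```

Each command writes the corresponding key to `~/.condarc`.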
Configuration File¶
Conda settings are stored in ~/.condarc:
channels:
- conda-forge
- defaults
channel_priority: strict
auto_activate_base: false
show_channel_urls: true
11. Troubleshooting¶
Solving Environment Issues¶
If environment solving is slow, try:
# Use the libmamba solver (faster; already the default since conda 23.10)
conda install -n base conda-libmamba-solver
conda config --set solver libmamba
Dependency Conflicts¶
If you encounter conflicts:
# Create new environment instead of updating
conda create --name myenv_new --clone myenv
conda activate myenv_new
conda install problematic-package
Corrupted Environment¶
Remove and recreate:
Package Not Found¶
Try different channels:
Check on Anaconda.org:
Clearing Cache¶
If installations fail due to corrupted cache:
Verification¶
Verify Conda installation:
Check for issues:
12. Cleaning Up Conda Data¶
Conda stores cached packages which may consume significant space.
Clean Commands¶
Clean package cache:
Clean tarballs:
Clean index cache:
Clean everything:
Dry run (see what would be removed):
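A sketch of the clean commands above, ordered so the dry run comes first. The `CONFIRM_CLEAN` variable is a hypothetical safety switch added for this example, not a Conda setting; nothing is deleted unless it is set to `yes`.

```shell
# Hypothetical safety switch: set to "yes" only after reviewing the dry run.
CONFIRM_CLEAN="${CONFIRM_CLEAN:-no}"

if command -v conda >/dev/null 2>&1; then
    # Dry run: list what would be removed without deleting anything.
    conda clean --all --dry-run

    if [ "$CONFIRM_CLEAN" = "yes" ]; then
        conda clean --yes --packages      # unused extracted packages
        conda clean --yes --tarballs      # downloaded package archives
        conda clean --yes --index-cache   # cached channel metadata
        # conda clean --yes --all         # all of the above in one step
    fi
else
    echo "conda is not on PATH; initialize it first." >&2
fi
```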
Checking Disk Usage¶
Check cache size:
Check environment sizes:
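Both checks can be done with `du`. The sketch below wraps them in a small helper; `conda_disk_usage` is a hypothetical function name, and the default prefix `~/miniconda3` assumes the install location used in this guide.

```shell
# Print the size of the package cache and of each environment under a
# Conda installation. Usage: conda_disk_usage [PREFIX]
conda_disk_usage() {
    local prefix="${1:-$HOME/miniconda3}"
    local env

    # Shared package cache
    if [ -d "$prefix/pkgs" ]; then
        du -sh "$prefix/pkgs"
    fi

    # One line per environment directory
    for env in "$prefix"/envs/*/; do
        [ -d "$env" ] && du -sh "$env"
    done
    return 0
}

conda_disk_usage
```

Pass a different prefix, for example a scratch installation, as the first argument.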
Be Careful
conda clean --all permanently removes cached package files.
Only run this command if you are sure you no longer need them.
13. Advanced Tips¶
Creating Minimal Environments¶
Install only what you need:
Using Environment Variables¶
Set environment variables for an environment:
View environment variables:
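A sketch using `conda env config vars`. The environment name `myenv` and the variable `OMP_NUM_THREADS=4` are hypothetical examples.

```shell
ENV_NAME="myenv"   # hypothetical; the environment must already exist

if command -v conda >/dev/null 2>&1; then
    # Store a variable that is exported each time the environment is activated.
    conda env config vars set OMP_NUM_THREADS=4 --name "$ENV_NAME"
    # List the variables stored for that environment.
    conda env config vars list --name "$ENV_NAME"
else
    echo "conda is not on PATH; initialize it first." >&2
fi
```

A newly set variable takes effect only after the environment is reactivated.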
Building Packages¶
Create your own Conda package:
Conda Run Command¶
Run command in environment without activation:
Useful for automation and scripts.
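A minimal sketch, assuming an environment called `myenv` (a hypothetical name) that has Python installed:

```shell
ENV_NAME="myenv"   # hypothetical; the environment must already exist

if command -v conda >/dev/null 2>&1; then
    # Run one command inside the environment without activating it.
    conda run --name "$ENV_NAME" python --version
else
    echo "conda is not on PATH; initialize it first." >&2
fi
```

By default `conda run` captures the command's output and prints it when the command finishes; adding `--no-capture-output` streams it directly, which is usually what you want for long-running jobs.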
Package Pinning¶
Prevent package updates by pinning:
Create ~/miniconda3/envs/myenv/conda-meta/pinned:
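Each line of the `pinned` file holds a package name followed by a version constraint. A sketch of writing it from the shell; `pin_specs` is a hypothetical helper for this example, not part of Conda.

```shell
# Append version pins to an environment's conda-meta/pinned file.
# Usage: pin_specs ENV_PREFIX "SPEC" ["SPEC" ...]
pin_specs() {
    if [ "$#" -lt 2 ]; then
        echo "usage: pin_specs ENV_PREFIX SPEC [SPEC ...]" >&2
        return 2
    fi
    local prefix="$1"
    shift
    # Every real environment has a conda-meta directory.
    if [ ! -d "$prefix/conda-meta" ]; then
        echo "Not a Conda environment: $prefix" >&2
        return 1
    fi
    # printf repeats the format for each remaining argument: one spec per line.
    printf '%s\n' "$@" >> "$prefix/conda-meta/pinned"
}
```

For example, `pin_specs ~/miniconda3/envs/myenv "numpy 1.24.*"` keeps numpy on the 1.24 series during later installs and updates in that environment.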
14. Common Package Installation Examples¶
Scientific Computing¶
Machine Learning (CPU)¶
Deep Learning (PyTorch with CUDA)¶
conda create -n pytorch python=3.11
conda activate pytorch
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
Deep Learning (TensorFlow with CUDA)¶
conda create -n tensorflow python=3.11
conda activate tensorflow
conda install -c conda-forge cudatoolkit=11.8 cudnn=8.6
pip install "tensorflow[and-cuda]"
Bioinformatics¶
conda create -n bioinfo python=3.11 -c bioconda
conda activate bioinfo
conda install -c conda-forge -c bioconda biopython samtools bcftools
Data Visualization¶
Web Development¶
15. Summary¶
- Use Miniconda, not Anaconda, on PERUN
- Create separate environments for every project
- Use the conda-forge channel for consistent package updates
- Install Conda packages first, then pip packages if needed
- Always specify Python version when creating environments
- Export environments as YAML files for reproducibility
- Clean the Conda package cache regularly if storage quota is limited
- Use prefix-based environments if home directory quota is exceeded
- Initialize Conda properly in job scripts
- Use conda search to find packages before installing
Quick Reference Commands
# Create environment
conda create -n myenv python=3.11
# Activate/deactivate
conda activate myenv
conda deactivate
# Install packages
conda install numpy pandas -c conda-forge
# List packages
conda list
# Export environment
conda env export > environment.yml
# Clean cache
conda clean --all
# Update Conda
conda update -n base conda