Conda Guide – PERUN Supercomputer

What is Conda?

Conda is an open‑source package and environment manager used to install software, manage dependencies, and create isolated environments for different projects. Although originally built for Python, Conda can distribute software written in any language.


1. Conda Distributions

There are two primary Conda distributions:

Distribution | Description
Miniconda | Minimal installer containing only Conda, Python, and a small set of required packages. Recommended for HPC environments.
Anaconda | Includes Conda, a GUI (Anaconda Navigator), and hundreds of preinstalled packages. Not recommended on PERUN due to its size.

Recommendation

On HPC systems like PERUN, use Miniconda to avoid unnecessary package bloat.


2. Installing Miniconda

Install Miniconda into ~/miniconda3:

mkdir -p ~/miniconda3
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh
bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3

Initialize Conda:

~/miniconda3/bin/conda init bash

Restart the terminal to apply changes.

Important

Do not install Miniconda into system directories on PERUN.
Always install it into your home directory.
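
The rule above can be enforced with a small guard before running the installer. This is a minimal sketch; the function name `prefix_in_home` is hypothetical and not part of Conda.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the install prefix lies inside $HOME.
prefix_in_home() {
    local prefix="$1"
    case "$prefix" in
        "$HOME"/*) return 0 ;;
        *)         return 1 ;;
    esac
}

# Check the prefix used in the install steps before running the installer.
if prefix_in_home "$HOME/miniconda3"; then
    echo "OK: installing into $HOME/miniconda3"
else
    echo "Refusing to install outside the home directory" >&2
fi
```

The check is a plain string comparison, so it does not resolve symbolic links.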


3. Conda Channels

Conda channels are sources from which packages are downloaded.

Default Channels

By default, Conda uses the defaults channel maintained by Anaconda Inc.

Adding Channels

Add a channel permanently:

conda config --add channels conda-forge

Set channel priority:

conda config --set channel_priority strict

View configured channels:

conda config --show channels

Installing from Specific Channels

Install a package from a specific channel:

conda install <package> --channel conda-forge
conda install <package> -c conda-forge  # short form

Best Practice

Prefer the conda-forge channel: its packages are generally updated more often and are built to be compatible with one another.

Common Channels

Channel | Description
defaults | Official Anaconda repository
conda-forge | Community-maintained, most comprehensive
bioconda | Bioinformatics packages
pytorch | PyTorch and related packages
nvidia | CUDA and GPU-accelerated packages

4. Managing Conda Environments

Creating Environments

Create a new environment:

conda create --name myenv

Create with specific Python version:

conda create --name myenv python=3.11

Create with packages:

conda create --name myenv python=3.11 numpy pandas matplotlib

Create with specific package versions:

conda create --name myenv python=3.11 numpy=1.24 pandas=2.0

Activating and Deactivating

Activate environment:

conda activate myenv

Deactivate environment:

conda deactivate

Base Environment

When you first open a terminal, the base environment is usually active by default.

Listing Environments

List all environments:

conda info --envs
conda env list  # alternative

Cloning Environments

Clone an existing environment:

conda create --name newenv --clone myenv

Removing Environments

Remove an environment:

conda remove --name myenv --all

Why Use Environments?

Each environment is isolated, preventing version conflicts and dependency issues common in Python projects.
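
Because each environment is separate, setup scripts often need to check whether one already exists before creating it. A minimal sketch, assuming the usual `conda env list` layout with the environment name in the first column; the helper `env_in_list` is a hypothetical name.

```shell
#!/bin/bash
# Hypothetical helper: read `conda env list` output on stdin and succeed
# if an environment named $1 appears in the first column.
env_in_list() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Report whether myenv exists (skipped entirely if conda is not available).
if command -v conda >/dev/null 2>&1; then
    if conda env list | env_in_list myenv; then
        echo "myenv already exists"
    else
        echo "myenv not found; create it with: conda create --name myenv python=3.11"
    fi
fi
```

Environments created with `--prefix` have no name, so this sketch does not detect them.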


5. Creating Environments from YAML

Creating from YAML

Create environment from file:

conda env create -f my-env.yml

Example YAML file (my-env.yml):

name: myproject
channels:
  - conda-forge
  - defaults
dependencies:
  - python=3.11
  - numpy=1.24
  - pandas=2.0
  - matplotlib
  - scikit-learn
  - pip
  - pip:
    - some-pip-only-package

Exporting Environments

Export an environment:

conda env export > my-env.yml

Export only explicitly installed packages (recommended):

conda env export --from-history > my-env.yml

Export to a specific file:

conda env export --name myenv --file myenv.yml

Example Use Case

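Sharing a YAML file lets collaborators recreate the same set of packages. A full export pins exact versions and builds, which are often platform-specific. An export made with --from-history lists only the packages you requested, so it is more portable but may resolve to different versions over time.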
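
A `conda env export` file ends with a `prefix:` line recording the local install path, which is specific to your machine. A minimal sketch that strips it before sharing; `strip_prefix` is a hypothetical helper name, and the sample text stands in for real export output.

```shell
#!/bin/bash
# Hypothetical helper: copy YAML from stdin to stdout without the
# machine-specific "prefix:" line that `conda env export` appends.
strip_prefix() {
    grep -v '^prefix:'
}

# Example with inline sample text standing in for real export output.
printf 'name: myenv\ndependencies:\n  - python=3.11\nprefix: /home/user/miniconda3/envs/myenv\n' \
    | strip_prefix
```

With Conda available, the same helper can be used as `conda env export --name myenv | strip_prefix > myenv.yml`.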


6. Installing Packages

Basic Installation

Install into the active environment:

conda install matplotlib

Install into a specific environment:

conda install --name myenv matplotlib

Install multiple packages:

conda install numpy pandas scipy

Version-Specific Installation

Install specific version:

conda install numpy=1.24.0

Install version range:

conda install "numpy>=1.20,<1.25"

Install the latest patch release of a given minor version (quote the spec so the shell does not expand the asterisk):

conda install "numpy=1.24.*"

Installing from Channels

Install from conda-forge:

conda install matplotlib --channel conda-forge
conda install matplotlib -c conda-forge  # short form

Install from multiple channels:

conda install pytorch -c pytorch -c conda-forge

Searching for Packages

Search for packages:

conda search numpy

Search in specific channel:

conda search numpy -c conda-forge

Search with wildcards:

conda search "numpy*"

Listing Installed Packages

List packages in active environment:

conda list

List packages in specific environment:

conda list -n myenv

List packages matching pattern:

conda list numpy

Updating Packages

Update a specific package:

conda update numpy

Update all packages:

conda update --all

Update Conda itself:

conda update -n base conda

Removing Packages

Remove a package:

conda remove numpy

Remove from specific environment:

conda remove --name myenv numpy

Avoid Installing Into Base

The base Conda environment can easily break when many packages are installed.
Always create project‑specific environments.


7. Mixing Conda and Pip

When to Use Pip

Use pip for packages not available in Conda:

conda activate myenv
pip install package-name

Best Practices

  1. Always install Conda packages first, then pip packages.
  2. Include pip in the environment when creating it:
    conda create --name myenv python=3.11 pip
  3. Use requirements.txt for pip packages:
    pip install -r requirements.txt

Mixed Environment Example

name: mixed-env
channels:
  - conda-forge
dependencies:
  - python=3.11
  - numpy
  - pandas
  - pip
  - pip:
    - tensorflow
    - transformers

Important

Installing packages with pip after Conda can sometimes cause dependency conflicts.
Prefer Conda packages when available.
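
One common source of conflicts is running a `pip` that lives outside the active environment. A minimal sketch that checks this using `CONDA_PREFIX`, which Conda sets on activation; `pip_inside_env` is a hypothetical helper.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the pip executable path ($1) is located
# under the environment prefix ($2).
pip_inside_env() {
    local pip_path="$1" env_prefix="$2"
    [ -n "$env_prefix" ] || return 1
    case "$pip_path" in
        "$env_prefix"/*) return 0 ;;
        *)               return 1 ;;
    esac
}

# CONDA_PREFIX is empty when no environment is active.
if pip_inside_env "$(command -v pip 2>/dev/null)" "${CONDA_PREFIX:-}"; then
    echo "pip belongs to the active environment"
else
    echo "pip is outside the active environment; run: conda install pip" >&2
fi
```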


8. Working with Python Versions

Installing Different Python Versions

Create environment with specific Python:

conda create --name py39 python=3.9
conda create --name py311 python=3.11
conda create --name py312 python=3.12

Updating Python Version

Change the Python version in an existing environment (Conda re-solves the environment, so other packages may be upgraded or downgraded):

conda activate myenv
conda install python=3.11

Checking Python Version

python --version
which python

9. HPC-Specific Considerations

Storage Quota Management

Conda environments can consume significant disk space. Monitor your usage:

du -sh ~/miniconda3
du -sh ~/miniconda3/envs/*
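
To see which environments dominate your quota, the `du` output can be sorted by size. A minimal sketch using standard `du`, `sort`, and `head`; `largest_dirs` is a hypothetical helper.

```shell
#!/bin/bash
# Hypothetical helper: print the N largest subdirectories of a directory
# (size in KiB, then path), largest first.
largest_dirs() {
    local dir="$1" count="${2:-5}"
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    du -sk "$dir"/*/ 2>/dev/null | sort -rn | head -n "$count"
}

# Path as used above; prints a message if it does not exist.
largest_dirs "$HOME/miniconda3/envs" 5 || true
```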

Installing in Non-Standard Locations

If home directory quota is limited, install to scratch:

# Install Miniconda to scratch (run from the directory containing the downloaded installer)
bash Miniconda3-latest-Linux-x86_64.sh -b -p /scratch/username/miniconda3

# Create environments in custom location
conda create --prefix /scratch/username/envs/myenv python=3.11

Activate prefix-based environment:

conda activate /scratch/username/envs/myenv

Using Conda in Job Scripts

Example SLURM job script:

#!/bin/bash
#SBATCH --job-name=my_conda_job
#SBATCH --time=01:00:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=8G

# Initialize Conda
source ~/miniconda3/etc/profile.d/conda.sh

# Activate environment
conda activate myenv

# Run your Python script
python my_script.py

# Deactivate when done
conda deactivate
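
Batch jobs typically start in a non-interactive shell, so `conda activate` fails unless `conda.sh` has been sourced first. A slightly more defensive sketch of the initialization step; `load_conda` is a hypothetical helper, and the default path assumes the install location used in this guide.

```shell
#!/bin/bash
# Hypothetical helper: source Conda's shell hook from a given install root,
# failing with a clear message if the hook file is missing.
load_conda() {
    local root="${1:-$HOME/miniconda3}"
    local hook="$root/etc/profile.d/conda.sh"
    if [ ! -f "$hook" ]; then
        echo "conda.sh not found under $root" >&2
        return 1
    fi
    # shellcheck disable=SC1090
    source "$hook"
}

# In a job script: stop early instead of running with the wrong Python.
if load_conda; then
    conda activate myenv && python my_script.py
else
    echo "Skipping run: Conda is not initialized" >&2
fi
```

If Miniconda lives on scratch, pass the root explicitly, for example `load_conda /scratch/username/miniconda3`.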

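Parallel Package Installation

Conda can use multiple threads for downloading and linking packages. The thread count is a configuration setting rather than an install flag:

conda config --set default_threads 4
conda install numpy --channel conda-forge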

Offline Package Installation

Download packages on a login node, then install from the local package cache on a compute node (both nodes must see the same cache directory):

# Download packages
conda install --download-only numpy

# Later, install without downloading
conda install --offline numpy

10. Conda Configuration

View Current Configuration

conda config --show

Common Configuration Options

Disable auto-activation of base environment:

conda config --set auto_activate_base false

Set default channels:

conda config --add channels conda-forge
conda config --set channel_priority strict

Change package cache location:

conda config --add pkgs_dirs /scratch/username/conda-cache

Increase the network read timeout (in seconds) for slow or unreliable connections:

conda config --set remote_read_timeout_secs 300

Configuration File

Conda settings are stored in ~/.condarc:

channels:
  - conda-forge
  - defaults
channel_priority: strict
auto_activate_base: false
show_channel_urls: true

11. Troubleshooting

Solving Environment Issues

If environment solving is slow, try:

# Use the libmamba solver (faster; already the default in Conda 23.10 and later)
conda install -n base conda-libmamba-solver
conda config --set solver libmamba

Dependency Conflicts

If you encounter conflicts:

# Create new environment instead of updating
conda create --name myenv_new --clone myenv
conda activate myenv_new
conda install problematic-package
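
As an additional option (not a replacement for the clone approach above), `conda install --dry-run` reports what the solver would change without modifying anything. A minimal sketch; `preview_install` is a hypothetical wrapper name.

```shell
#!/bin/bash
# Hypothetical wrapper: show the changes an install would make to a named
# environment without applying them.
# Usage: preview_install ENV_NAME PACKAGE [PACKAGE...]
preview_install() {
    local env_name="$1"; shift
    conda install --name "$env_name" --dry-run "$@"
}
```

For example, `preview_install myenv_new problematic-package` prints the planned changes and leaves the environment untouched.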

Corrupted Environment

Remove and recreate:

conda deactivate
conda remove --name myenv --all
conda env create -f myenv.yml

Package Not Found

Try different channels:

conda search package-name -c conda-forge
conda search package-name -c defaults

Check on Anaconda.org:

# Example URL
https://anaconda.org/search?q=package-name

Clearing Cache

If installations fail due to corrupted cache:

conda clean --all

Verification

Verify Conda installation:

conda info
conda list

Check for issues (requires Conda 23.5 or newer):

conda doctor

12. Cleaning Up Conda Data

Conda stores cached packages which may consume significant space.

Clean Commands

Clean package cache:

conda clean --packages

Clean tarballs:

conda clean --tarballs

Clean index cache:

conda clean --index-cache

Clean everything:

conda clean --all

Dry run (see what would be removed):

conda clean --all --dry-run

Checking Disk Usage

Check cache size:

du -sh ~/miniconda3/pkgs

Check environment sizes:

du -sh ~/miniconda3/envs/*

Be Careful

conda clean --all permanently removes cached package files.
Only run this command if you are sure you no longer need them.


13. Advanced Tips

Creating Minimal Environments

Install only what you need:

conda create --name minimal python=3.11 --no-default-packages

Using Environment Variables

Set environment variables for an environment:

conda env config vars set MY_VAR=value --name myenv
conda activate myenv  # (re)activate so the variable takes effect

View environment variables:

conda env config vars list --name myenv

Building Packages

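Build your own Conda package from a PyPI project (conda-build belongs in the base environment):

conda install -n base conda-build
conda skeleton pypi package-name  # generates a recipe directory named package-name
conda build package-name          # builds the package from that recipe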

Conda Run Command

Run command in environment without activation:

conda run -n myenv python script.py

Useful for automation and scripts.

Package Pinning

Prevent package updates by pinning:

Create ~/miniconda3/envs/myenv/conda-meta/pinned:

numpy ==1.24.0
pandas >=2.0,<2.1
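
The pinned file is plain text with one match specification per line, so it can be written from a setup script. A minimal sketch; `write_pins` is a hypothetical helper, and the specs are the ones shown above.

```shell
#!/bin/bash
# Hypothetical helper: write one pin per line to <env_prefix>/conda-meta/pinned.
# Usage: write_pins ENV_PREFIX SPEC [SPEC...]
write_pins() {
    local env_prefix="$1"; shift
    local meta_dir="$env_prefix/conda-meta"
    [ -d "$meta_dir" ] || { echo "not a Conda environment: $env_prefix" >&2; return 1; }
    printf '%s\n' "$@" > "$meta_dir/pinned"
}

# Example (prints a message if the environment does not exist yet).
write_pins "$HOME/miniconda3/envs/myenv" "numpy ==1.24.0" "pandas >=2.0,<2.1" || true
```

Note that this sketch overwrites any existing pinned file rather than appending to it.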

14. Common Package Installation Examples

Scientific Computing

conda create -n science python=3.11 numpy scipy matplotlib pandas scikit-learn jupyter

Machine Learning (CPU)

conda create -n ml-cpu python=3.11 scikit-learn xgboost lightgbm pandas matplotlib seaborn jupyter

Deep Learning (PyTorch with CUDA)

conda create -n pytorch python=3.11
conda activate pytorch
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia

Deep Learning (TensorFlow with CUDA)

conda create -n tensorflow python=3.11
conda activate tensorflow
conda install -c conda-forge cudatoolkit=11.8 cudnn=8.6
pip install "tensorflow[and-cuda]"

Bioinformatics

conda create -n bioinfo python=3.11
conda activate bioinfo
conda install -c conda-forge -c bioconda biopython samtools bcftools

Data Visualization

conda create -n dataviz python=3.11 matplotlib seaborn plotly bokeh altair -c conda-forge

Web Development

conda create -n web python=3.11 flask django requests beautifulsoup4 selenium

15. Summary

  • Use Miniconda, not Anaconda, on PERUN
  • Create separate environments for every project
  • Use conda-forge channel for consistent package updates
  • Install Conda packages first, then pip packages if needed
  • Always specify Python version when creating environments
  • Export environments as YAML files for reproducibility
  • Clean the package cache regularly if storage quota is limited
  • Use prefix-based environments if home directory quota is limited
  • Initialize Conda properly in job scripts
  • Use conda search to find packages before installing

Quick Reference Commands

# Create environment
conda create -n myenv python=3.11

# Activate/deactivate
conda activate myenv
conda deactivate

# Install packages
conda install numpy pandas -c conda-forge

# List packages
conda list

# Export environment
conda env export > environment.yml

# Clean cache
conda clean --all

# Update Conda
conda update -n base conda

Additional Resources