Using Python Virtual Environments on the PERUN Supercomputer

Introduction

A Python virtual environment is an isolated workspace that allows you to install project‑specific dependencies without affecting the system Python installation.
This improves reproducibility and stability and prevents version conflicts between projects.

Virtual environments are lightweight compared to Conda and suitable for most workflows on the PERUN supercomputer.

What is a Virtual Environment?

A virtual environment keeps Python packages isolated so different projects do not interfere with each other.


1. Benefits of Virtual Environments

  • Isolated dependencies — packages installed in one environment do not affect others.
  • Reproducibility — easy to share with collaborators using requirements.txt.
  • No admin rights required — everything is installed inside your home directory.
  • Lightweight — faster to create and smaller disk footprint than Conda environments.
  • Standard Python tool — built into Python 3.3+, no additional installation needed.

Recommendation

Create a separate virtual environment for each project or experiment.


2. Creating and Using Virtual Environments (venv)

The venv module is built into Python and requires no additional installation.

2.1 Create a Virtual Environment

Basic creation:

python3 -m venv /path/to/env_directory

Common patterns:

# In your home directory
python3 -m venv ~/envs/myproject

# In project directory
python3 -m venv ./venv

# With specific Python version
python3.11 -m venv ~/envs/py311_project
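
Setup commands are often re-run, so it can help to create the environment only when it does not exist yet. The following is a minimal sketch; the function name ensure_venv and its messages are illustrative and not part of venv itself.

```shell
# ensure_venv DIR: create the environment only if DIR does not already hold one.
# The function name and messages are illustrative, not part of venv.
ensure_venv() {
    env_dir="$1"
    if [ -f "$env_dir/bin/activate" ]; then
        echo "Reusing existing environment: $env_dir"
    else
        python3 -m venv "$env_dir" && echo "Created environment: $env_dir"
    fi
}

# Example: ensure_venv "$HOME/envs/myproject"
```

Checking for bin/activate rather than just the directory avoids treating a half-created or unrelated directory as a usable environment.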

Create with system site packages accessible:

python3 -m venv --system-site-packages ~/envs/myenv

System Site Packages

Using --system-site-packages makes globally installed packages visible inside the environment, which can lead to version conflicts. Use it only if you specifically need system packages.

2.2 Activate Environment

On Linux/PERUN:

source /path/to/env_directory/bin/activate

After activation, your prompt changes:

(myenv) user@perun:~$
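
Besides changing the prompt, activation sets the VIRTUAL_ENV variable and puts the environment's bin/ directory at the front of PATH. The sketch below checks both conditions using shell builtins only; the helper name venv_on_path is hypothetical.

```shell
# venv_on_path: succeed only if an environment is active and its bin/
# directory is the first PATH entry (hypothetical helper name).
venv_on_path() {
    [ -n "${VIRTUAL_ENV:-}" ] || return 1
    case "$PATH" in
        "$VIRTUAL_ENV/bin":*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
#   source ~/envs/myenv/bin/activate
#   venv_on_path && echo "python resolves to $VIRTUAL_ENV/bin/python"
```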

2.3 Deactivate Environment

deactivate

2.4 Delete Environment

Simply remove the directory:

rm -rf /path/to/env_directory
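
Because rm -rf is unforgiving, a small guard can reduce the risk of deleting the wrong directory. This sketch assumes the environment was created with venv, which writes a pyvenv.cfg file at the environment root; the function name remove_venv is illustrative.

```shell
# remove_venv DIR: delete DIR only if it looks like a virtual environment.
# The presence of pyvenv.cfg at the root is used as the safety check.
remove_venv() {
    env_dir="$1"
    if [ -z "$env_dir" ] || [ ! -f "$env_dir/pyvenv.cfg" ]; then
        echo "Refusing to delete '$env_dir': no pyvenv.cfg found" >&2
        return 1
    fi
    rm -rf -- "$env_dir"
}

# Example: remove_venv ~/envs/myproject
```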

Example Workflow

# Create environment
python3 -m venv ~/envs/myenv

# Activate
source ~/envs/myenv/bin/activate

# Install packages
pip install numpy pandas matplotlib

# Work on your project
python my_script.py

# Deactivate when done
deactivate

3. Installing Packages Inside the Environment

Once activated, packages install only into that environment.

3.1 Basic Installation

Install single package:

pip install <package-name>

Install multiple packages:

pip install numpy pandas matplotlib scikit-learn

3.2 Version-Specific Installation

Install specific version:

pip install numpy==1.24.0

Install minimum version:

pip install "numpy>=1.20"

Install version range:

pip install "numpy>=1.20,<1.25"

Install latest compatible with specifier:

pip install "numpy~=1.24.0"  # allows 1.24.x

3.3 Installing from Requirements File

pip install -r requirements.txt

3.4 Installing from Git Repository

pip install git+https://github.com/user/repo.git

Install specific branch or tag:

pip install git+https://github.com/user/repo.git@branch-name
pip install git+https://github.com/user/repo.git@v1.0.0

3.5 Installing in Editable Mode

For local development:

pip install -e /path/to/package

Editable Mode

Changes to source code are immediately reflected without reinstalling.

3.6 Upgrading Packages

Upgrade single package:

pip install --upgrade numpy
pip install -U numpy  # short form

Upgrade all packages (pip has no single command for this; list the outdated packages, then upgrade them by name):

pip list --outdated
pip install --upgrade package1 package2 package3
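
The package names can also be extracted from the pip list --outdated output instead of being typed by hand. The sketch below assumes pip's default column layout, where the first two lines are a header and a dashed underline; the function name outdated_names is illustrative.

```shell
# outdated_names: read `pip list --outdated` output (default column format)
# on stdin and print only the package names. The first two lines are the
# header and its dashed underline, so they are skipped.
outdated_names() {
    awk 'NR > 2 { print $1 }'
}

# Example (inside an activated environment; -r assumes GNU xargs):
#   pip list --outdated | outdated_names | xargs -r -n 1 pip install --upgrade
```

Upgrading everything at once can break pinned combinations, so running pip check afterwards is a reasonable precaution.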

Upgrade pip itself:

pip install --upgrade pip

Important

Do not use sudo pip install — all installations must stay inside your user environment.


4. Managing Installed Packages

4.1 List Installed Packages

List all packages:

pip list

List in requirements format:

pip freeze

Show package details:

pip show numpy

4.2 Search for Packages

Search PyPI:

pip search package-name

Note

As of 2021, PyPI no longer supports pip search, and the command returns an error. Search at pypi.org instead.

4.3 Uninstall Packages

Remove single package:

pip uninstall package-name

Remove multiple packages:

pip uninstall package1 package2 package3

Uninstall without confirmation:

pip uninstall -y package-name

4.4 Check Package Dependencies

pip show package-name

Show dependency tree:

pip install pipdeptree
pipdeptree

4.5 Verify Installation Integrity

pip check

This reports installed packages whose dependencies are missing or have incompatible versions.


5. Working with Requirements Files

5.1 Creating Requirements Files

Simple freeze (all packages):

pip freeze > requirements.txt
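
pip freeze can include entries that only make sense on the machine where they were produced: editable installs (lines starting with -e) and direct references to local paths (name @ file://...). A possible filter, with an illustrative function name, is sketched below.

```shell
# portable_requirements: filter `pip freeze` output on stdin, dropping
# editable installs ("-e ...") and direct references to local files
# ("name @ file://..."), which point at paths on this machine only.
portable_requirements() {
    grep -v -e '^-e ' -e ' @ file://' || true
}

# Example:
#   pip freeze | portable_requirements > requirements.txt
```

The dropped packages still have to be installed separately on the target system, for example with pip install -e from a checkout.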

Only top-level packages (list them in requirements.in; pip-compile then writes a fully pinned requirements.txt):

pip install pip-tools
pip-compile requirements.in

Example requirements.in:

numpy
pandas
matplotlib
scikit-learn

With version constraints:

cat > requirements.txt << EOF
numpy>=1.20,<1.25
pandas>=2.0
matplotlib>=3.5
scikit-learn>=1.3
tensorflow==2.13.0
EOF

5.2 Installing from Requirements

Install all packages:

pip install -r requirements.txt

5.3 Multiple Requirements Files

pip install -r requirements-base.txt -r requirements-dev.txt

Example structure:

requirements-base.txt    # Core dependencies
requirements-dev.txt     # Development tools
requirements-test.txt    # Testing dependencies
requirements-docs.txt    # Documentation tools

5.4 Requirements File Best Practices

# requirements.txt

# Core scientific computing
numpy==1.24.3
pandas==2.0.3
scipy==1.11.1

# Machine learning
scikit-learn==1.3.0
xgboost==1.7.6

# Visualization
matplotlib==3.7.2
seaborn==0.12.2

# Utilities
tqdm==4.65.0

6. Using Virtual Environments in SLURM Batch Scripts

To use the environment in PERUN batch jobs:

6.1 Basic Job Script

#!/bin/bash
#SBATCH --job-name=python_job
#SBATCH --time=01:00:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=8G

# Activate virtual environment
source ~/envs/myenv/bin/activate

# Run Python script
python script.py

# Deactivate (optional, job ends anyway)
deactivate

6.2 With Error Handling

#!/bin/bash
#SBATCH --job-name=python_job
#SBATCH --output=job_%j.out
#SBATCH --error=job_%j.err

# Exit on error
set -e

# Activate virtual environment
if [ -f ~/envs/myenv/bin/activate ]; then
    source ~/envs/myenv/bin/activate
else
    echo "Error: Virtual environment not found!"
    exit 1
fi

# Verify Python version
python --version

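# Run script and report its exit status.
# Testing the command directly in "if" keeps "set -e" from ending the job
# before the failure message is printed.
if python script.py; then
    echo "Script completed successfully"
else
    status=$?
    echo "Script failed with error code $status"
    exit "$status"
fi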

6.3 Array Jobs with Virtual Environment

#!/bin/bash
#SBATCH --job-name=array_job
#SBATCH --array=1-10
#SBATCH --time=01:00:00

source ~/envs/myenv/bin/activate

# Use SLURM_ARRAY_TASK_ID
python process.py --input data_${SLURM_ARRAY_TASK_ID}.txt
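
An array task whose input file is missing would otherwise fail inside Python with a less obvious message. The sketch below checks for the file first; it reuses the data_<ID>.txt naming from the job script above, and the function name task_input is illustrative.

```shell
# task_input ID: print the input file for one array task, or fail with a
# message on stderr if the file is missing.
task_input() {
    input="data_$1.txt"
    if [ ! -f "$input" ]; then
        echo "Missing input for task $1: $input" >&2
        return 1
    fi
    printf '%s\n' "$input"
}

# In the job script, after activating the environment:
#   input=$(task_input "$SLURM_ARRAY_TASK_ID") || exit 1
#   python process.py --input "$input"
```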

Common Error

Make sure the environment was created using the same Python version that you use to run your script.
Mixing versions may cause missing-module or compatibility errors.
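
One way to check for such a mismatch, assuming the environment was created with venv, is to read the version recorded in the environment's pyvenv.cfg file. The helper name venv_python_version is illustrative.

```shell
# venv_python_version DIR: print the Python version recorded in DIR/pyvenv.cfg
# (the "version = X.Y.Z" line written when the environment is created).
venv_python_version() {
    sed -n 's/^version *= *//p' "$1/pyvenv.cfg"
}

# Example: compare with the interpreter the job will actually run.
#   venv_python_version ~/envs/myenv
#   python --version
```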


7. Advanced Package Installation

7.1 Installing from Wheels

pip install package-1.0.0-py3-none-any.whl

7.2 Installing with Extras

pip install "package-name[extra1,extra2]"

Quote the argument so the shell does not interpret the square brackets as a glob pattern.

Example:

pip install "tensorflow[and-cuda]"
pip install "flask[async]"

7.3 Installing Pre-release Versions

pip install --pre package-name

7.4 Using Alternative Indexes

# https://pypi.org/simple/ is pip's default index; substitute the URL of the index you need
pip install --index-url https://pypi.org/simple/ package-name

7.5 Installing from Local Directory

pip install /path/to/package/

7.6 No Cache Installation

pip install --no-cache-dir package-name

Useful when disk space is limited.

7.7 User Installation (without venv)

If you cannot create a venv:

pip install --user package-name

Packages install to ~/.local/

Not Recommended

User installations are not isolated. Prefer virtual environments.


8. Troubleshooting Virtual Environments

8.1 Activation Issues

Problem: Activation script not found

source ~/envs/myenv/bin/activate
# bash: activate: No such file or directory

Solution: Verify environment exists:

ls ~/envs/myenv/bin/activate

Recreate if missing:

python3 -m venv ~/envs/myenv

Problem: Wrong Python version

which python
# Shows system Python instead of venv Python

Solution: Deactivate and reactivate:

deactivate
source ~/envs/myenv/bin/activate
which python  # Should show venv path

8.2 Package Installation Errors

Problem: Permission denied

Solution: Never use sudo. Ensure environment is activated:

# Check if environment is active
echo $VIRTUAL_ENV

# If empty, activate first
source ~/envs/myenv/bin/activate

Problem: Compilation errors

Solution: Install development packages (if available) or use binary wheels:

pip install --only-binary :all: package-name

Problem: Disk quota exceeded

Solution: Clean pip cache:

pip cache purge

Or install without cache:

pip install --no-cache-dir package-name

8.3 Dependency Conflicts

Problem: Incompatible versions

pip check
# Displays conflicts

Solution: Use constraint files:

pip install -c constraints.txt -r requirements.txt

Or create fresh environment:

python3 -m venv ~/envs/clean_env
source ~/envs/clean_env/bin/activate
pip install -r requirements.txt

8.4 Module Not Found Errors

Problem: Import fails despite installation

ModuleNotFoundError: No module named 'numpy'

Solutions:

  1. Verify environment is activated:

    echo $VIRTUAL_ENV
    which python
    

  2. Check if package is installed:

    pip list | grep numpy
    

  3. Install if missing:

    pip install numpy
    

  4. Check for typos in import statement

8.5 Corrupted Environment

Solution: Delete and recreate:

rm -rf ~/envs/myenv
python3 -m venv ~/envs/myenv
source ~/envs/myenv/bin/activate
pip install -r requirements.txt

Never modify system Python

Do not install Python packages globally on PERUN — use virtual environments only.


9. Using virtualenvwrapper (Optional)

virtualenvwrapper simplifies managing multiple Python virtual environments.

9.1 Installation

pip install --user virtualenvwrapper

9.2 Setup

Add to your ~/.bashrc:

export WORKON_HOME=$HOME/envs
export VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3
source $HOME/.local/bin/virtualenvwrapper.sh

Reload your shell:

source ~/.bashrc

What is WORKON_HOME?

This directory stores all virtual environments created with virtualenvwrapper.


10. Basic virtualenvwrapper Commands

Create a new environment

mkvirtualenv myenv

Create with specific Python version:

mkvirtualenv -p python3.11 myenv

List all environments

lsvirtualenv

Activate environment

workon myenv

Deactivate

deactivate

Remove an environment

rmvirtualenv myenv

Copy an environment

cpvirtualenv source_env dest_env

Show environment details

showvirtualenv myenv

Quick Navigation

Use cdvirtualenv to jump to the active environment directory. Use cdsitepackages to jump to site-packages directory.


11. Hook Scripts (Advanced)

Each environment includes hook scripts located in:

$WORKON_HOME/myenv/bin/

You can customize actions such as printing messages or loading configs.

Available Hooks

  • preactivate - Runs before activation
  • postactivate - Runs after activation
  • predeactivate - Runs before deactivation
  • postdeactivate - Runs after deactivation
  • premkvirtualenv - Runs before creating environment (global hook, stored in $WORKON_HOME)
  • postmkvirtualenv - Runs after creating environment (global hook, stored in $WORKON_HOME)

Example: Add activation messages

echo 'echo "Environment activated: $VIRTUAL_ENV"' >> $WORKON_HOME/myenv/bin/postactivate

Example: Auto-load environment variables

# In postactivate
echo 'export DATABASE_URL=postgresql://localhost/mydb' >> $WORKON_HOME/myenv/bin/postactivate
echo 'export DEBUG=True' >> $WORKON_HOME/myenv/bin/postactivate

Example: Auto-change directory

echo 'cd ~/projects/myproject' >> $WORKON_HOME/myenv/bin/postactivate
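
Variables exported in postactivate stay set after deactivate unless a matching predeactivate hook removes them. The sketch below writes such a pair; it reuses the DATABASE_URL and DEBUG variables from the example above, the function name write_env_hooks is illustrative, and it overwrites any existing content of the two hook files.

```shell
# write_env_hooks ENV_DIR: write a postactivate/predeactivate pair so that
# variables set on activation are removed again on deactivation.
# Existing hook files in ENV_DIR/bin are overwritten.
write_env_hooks() {
    hook_dir="$1/bin"
    cat > "$hook_dir/postactivate" <<'EOF'
export DATABASE_URL=postgresql://localhost/mydb
export DEBUG=True
EOF
    cat > "$hook_dir/predeactivate" <<'EOF'
unset DATABASE_URL DEBUG
EOF
}

# Example: write_env_hooks "$WORKON_HOME/myenv"
```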

12. Saving and Restoring Dependencies

Export dependencies:

pip freeze > requirements.txt

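Export with hashes (more secure). pip freeze cannot emit hashes (its --all option only adds pip and setuptools to the output), but pip-tools can generate them from requirements.in:

pip-compile --generate-hashes requirements.in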

Reinstall dependencies:

pip install -r requirements.txt

Install with hash verification:

pip install --require-hashes -r requirements.txt

Best Practice

Always store requirements.txt in your project folder for future reproducibility.


13. Virtual Environments vs Conda

Feature               | venv/virtualenv                          | Conda
----------------------|------------------------------------------|--------------------------------
Installation          | Built into Python                        | Requires separate installation
Size                  | Small (~10-50 MB)                        | Large (~500 MB - several GB)
Creation Speed        | Fast (seconds)                           | Slower (minutes)
Python Versions       | Version of the interpreter that made it  | Any version, installed per env
Non-Python Packages   | No                                       | Yes (C, R, etc.)
Package Sources       | PyPI by default                          | Multiple channels
Dependency Resolution | Simple                                   | Advanced solver
Best For              | Python-only projects                     | Multi-language projects

When to Use venv

  • Pure Python projects
  • Quick prototyping
  • Limited disk space
  • Simple dependencies
  • Standard Python packages available on PyPI

When to Use Conda

  • Need specific Python version per project
  • Mix of Python and non-Python packages (CUDA, compilers)
  • Complex scientific computing environments
  • Bioinformatics workflows
  • Need reproducibility across different systems

14. HPC-Specific Best Practices

14.1 Environment Location

Recommended locations:

# Home directory (default)
~/envs/myenv

# Project directory
/scratch/username/projects/myproject/venv

# Shared project space
/projects/groupname/envs/shared_env

14.2 Storage Management

Check environment size:

du -sh ~/envs/myenv

Clean pip cache:

pip cache dir  # Show cache location
pip cache purge  # Clean all cache
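
When several environments share one quota, it helps to see which ones take the most space. This sketch assumes all environments live as subdirectories of one base directory (~/envs in this guide); the function name env_sizes is illustrative.

```shell
# env_sizes DIR: list each subdirectory of DIR with its size in KiB,
# largest first. DIR defaults to ~/envs.
env_sizes() {
    base="${1:-$HOME/envs}"
    for d in "$base"/*/; do
        [ -d "$d" ] && du -sk "$d"
    done | sort -rn
}

# Example: env_sizes ~/envs
```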

14.3 Module System Integration

Load specific Python version:

module load python/3.11
python3 -m venv ~/envs/py311_env

Include in job script:

#!/bin/bash
#SBATCH --job-name=python_job

# Load Python module
module load python/3.11

# Activate environment
source ~/envs/py311_env/bin/activate

# Run script
python script.py

14.4 Shared Environments

Create read-only shared environment:

# As group admin
python3 -m venv /projects/groupname/envs/shared_env
source /projects/groupname/envs/shared_env/bin/activate
pip install numpy pandas scipy matplotlib
chmod -R a-w /projects/groupname/envs/shared_env

Users activate read-only:

source /projects/groupname/envs/shared_env/bin/activate

14.5 Reproducibility Script

Create environment setup script:

#!/bin/bash
# setup_env.sh

set -e  # Stop at the first failing step

ENV_NAME="myproject"
ENV_PATH="$HOME/envs/$ENV_NAME"

# Create environment
python3 -m venv "$ENV_PATH"

# Activate
source "$ENV_PATH/bin/activate"

# Upgrade pip
pip install --upgrade pip

# Install dependencies
pip install -r requirements.txt

echo "Environment $ENV_NAME created successfully!"
echo "To activate: source $ENV_PATH/bin/activate"

15. Common Package Installation Examples

Scientific Computing

python3 -m venv ~/envs/science
source ~/envs/science/bin/activate
pip install numpy scipy matplotlib pandas jupyter scikit-learn

Machine Learning

python3 -m venv ~/envs/ml
source ~/envs/ml/bin/activate
pip install scikit-learn xgboost lightgbm pandas numpy matplotlib seaborn jupyter

Deep Learning (PyTorch)

python3 -m venv ~/envs/pytorch
source ~/envs/pytorch/bin/activate
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install transformers datasets accelerate

Deep Learning (TensorFlow)

python3 -m venv ~/envs/tensorflow
source ~/envs/tensorflow/bin/activate
pip install "tensorflow[and-cuda]"
pip install keras tensorboard

Data Science

python3 -m venv ~/envs/datascience
source ~/envs/datascience/bin/activate
pip install pandas numpy scipy matplotlib seaborn plotly jupyter jupyterlab
pip install scikit-learn statsmodels

Web Scraping

python3 -m venv ~/envs/scraping
source ~/envs/scraping/bin/activate
pip install requests beautifulsoup4 selenium lxml scrapy

Natural Language Processing

python3 -m venv ~/envs/nlp
source ~/envs/nlp/bin/activate
pip install transformers datasets tokenizers spacy nltk gensim
python -m spacy download en_core_web_sm

Computer Vision

python3 -m venv ~/envs/cv
source ~/envs/cv/bin/activate
pip install opencv-python pillow scikit-image matplotlib

16. Automation and Scripts

16.1 Automatic Activation Script

Create activate_env.sh:

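#!/bin/bash
# Activate the project environment.
# Run with "source activate_env.sh"; executing the script directly activates
# the environment only in a child shell that exits immediately.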

VENV_PATH="$HOME/projects/myproject/venv"

if [ -f "$VENV_PATH/bin/activate" ]; then
    source "$VENV_PATH/bin/activate"
    echo "Virtual environment activated: $VIRTUAL_ENV"
else
    echo "Warning: Virtual environment not found at $VENV_PATH"
fi
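
The script above hard-codes one project path. A more general variant could search upward from the current directory for an in-project environment. The sketch below assumes the environment directory is named venv, as in the ./venv pattern shown earlier; the function name find_venv is illustrative.

```shell
# find_venv [START]: walk upward from START (default: current directory)
# and print the path of the first venv/bin/activate found.
find_venv() {
    dir="${1:-$PWD}"
    while [ -n "$dir" ]; do
        if [ -f "$dir/venv/bin/activate" ]; then
            printf '%s\n' "$dir/venv/bin/activate"
            return 0
        fi
        # Stop at the filesystem root (or "." for a relative START).
        case "$dir" in /|.) break ;; esac
        dir=$(dirname "$dir")
    done
    return 1
}

# Example: activate the nearest project environment, if any.
#   activate=$(find_venv) && source "$activate"
```

As with the script above, the activation itself has to happen in the current shell, so the function only prints the path and leaves the source call to the caller.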

16.2 Environment Status Check

#!/bin/bash
# check_env.sh - Verify environment setup

if [ -z "$VIRTUAL_ENV" ]; then
    echo "❌ No virtual environment active"
    exit 1
else
    echo "✓ Virtual environment: $VIRTUAL_ENV"
fi

echo "✓ Python: $(python --version)"
echo "✓ Pip: $(pip --version)"
echo "✓ Installed packages:"
pip list --format=freeze | head -n 5

16.3 Bulk Environment Creation

#!/bin/bash
# create_multiple_envs.sh

PROJECTS=("project1" "project2" "project3")

for PROJECT in "${PROJECTS[@]}"; do
    echo "Creating environment for $PROJECT..."
    python3 -m venv ~/envs/$PROJECT
    source ~/envs/$PROJECT/bin/activate
    pip install --upgrade pip

    # Use $HOME here: a tilde inside double quotes is not expanded by the shell
    if [ -f "$HOME/projects/$PROJECT/requirements.txt" ]; then
        pip install -r "$HOME/projects/$PROJECT/requirements.txt"
    fi

    deactivate
done

17. Tips and Best Practices

✓ Do's

  • Create one environment per project
  • Always activate before installing packages
  • Keep requirements.txt updated
  • Use version pinning for reproducibility
  • Document Python version used
  • Clean pip cache regularly
  • Test environment recreation from requirements.txt

✗ Don'ts

  • Don't install packages globally with sudo pip
  • Don't commit venv directory to git
  • Don't mix pip and system package managers
  • Don't use --system-site-packages unless necessary
  • Don't share environments between incompatible Python versions
  • Don't ignore pip warnings about conflicts

Version Control

Add to .gitignore:

# Virtual environments
venv/
env/
ENV/
.venv/

# Python cache
__pycache__/
*.pyc
*.pyo
*.pyd

# Pip
pip-log.txt
pip-delete-this-directory.txt

Commit to git:

# DO commit
requirements.txt
setup.py
pyproject.toml

# DON'T commit
venv/

18. Quick Reference

Common Commands

# Create environment
python3 -m venv ~/envs/myenv

# Activate
source ~/envs/myenv/bin/activate

# Install packages
pip install package-name

# Save dependencies
pip freeze > requirements.txt

# Install from requirements
pip install -r requirements.txt

# List packages
pip list

# Upgrade package
pip install --upgrade package-name

# Uninstall package
pip uninstall package-name

# Deactivate
deactivate

# Delete environment
rm -rf ~/envs/myenv

virtualenvwrapper Commands

# Create
mkvirtualenv myenv

# Activate
workon myenv

# List all
lsvirtualenv

# Remove
rmvirtualenv myenv

# Copy
cpvirtualenv old_env new_env

19. Summary

  • Virtual environments provide isolated Python package installations
  • Use venv for most Python projects on PERUN
  • Always activate environment before installing packages
  • Use requirements.txt for reproducibility
  • Integrate environments into SLURM job scripts
  • Clean pip cache if disk space is limited
  • Consider virtualenvwrapper for managing multiple environments
  • Choose between venv and Conda based on project needs

Getting Started Template

# 1. Create environment
python3 -m venv ~/envs/myproject

# 2. Activate
source ~/envs/myproject/bin/activate

# 3. Upgrade pip
pip install --upgrade pip

# 4. Install packages
pip install numpy pandas matplotlib

# 5. Save dependencies
pip freeze > requirements.txt

# 6. Work on project
python script.py

# 7. Deactivate when done
deactivate

Additional Resources