Using Python Virtual Environments on the PERUN Supercomputer
Introduction
A Python virtual environment is an isolated workspace that allows you to install project‑specific dependencies without affecting the system Python installation.
This improves reproducibility and stability, and it prevents version conflicts between projects.
Virtual environments are lightweight compared to Conda and suitable for most workflows on the PERUN supercomputer.
What is a Virtual Environment?
A virtual environment keeps Python packages isolated so different projects do not interfere with each other.
1. Benefits of Virtual Environments
- Isolated dependencies: packages installed in one environment do not affect others.
- Reproducibility: environments are easy to share with collaborators using requirements.txt.
- No admin rights required: everything is installed inside your home directory.
- Lightweight: faster to create and smaller on disk than Conda environments.
- Standard Python tool: built into Python 3.3+, no additional installation needed.
Recommendation
Create a separate virtual environment for each project or experiment.
2. Creating and Using Virtual Environments (venv)¶
The venv module is built into Python and requires no additional installation.
2.1 Create a Virtual Environment¶
Basic creation:
Common patterns:
# In your home directory
python3 -m venv ~/envs/myproject
# In project directory
python3 -m venv ./venv
# With specific Python version
python3.11 -m venv ~/envs/py311_project
Create with system site packages accessible:
System Site Packages
Using --system-site-packages can lead to conflicts. Only use if you specifically need system packages.
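Whether an environment can see system packages is recorded in its `pyvenv.cfg` file. The sketch below creates two throwaway environments under a temporary directory (the directory names are illustrative) and prints the setting for each. `--without-pip` is used only to keep the demonstration quick; omit it when creating a real environment.

```shell
# Throwaway location for the demonstration (not a recommended path)
DEMO_DIR="$(mktemp -d)"

# One environment that can see system packages, one fully isolated
python3 -m venv --without-pip --system-site-packages "$DEMO_DIR/with_system"
python3 -m venv --without-pip "$DEMO_DIR/isolated"

# pyvenv.cfg records the choice made at creation time
grep "include-system-site-packages" "$DEMO_DIR/with_system/pyvenv.cfg"
grep "include-system-site-packages" "$DEMO_DIR/isolated/pyvenv.cfg"
```

The first `grep` should report `true` and the second `false`, which is a quick way to check how an existing environment was created.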
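2.2 Activate Environment
On Linux/PERUN:
source ~/envs/myproject/bin/activate
After activation, your prompt is prefixed with the environment name, for example (myproject).
2.3 Deactivate Environment
deactivate
2.4 Delete Environment
Simply remove the directory:
rm -rf ~/envs/myproject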
Example Workflow
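A minimal end-to-end sketch of the create, activate, deactivate, and delete cycle from sections 2.1 to 2.4. It uses a temporary directory so it can be run safely anywhere; on PERUN you would use a path such as `~/envs/myproject` instead. The variable names are illustrative, and `--without-pip` is there only to keep the demonstration quick.

```shell
# Hypothetical throwaway location; replace with e.g. ~/envs/myproject
ENV_DIR="$(mktemp -d)/demo_env"

# 1. Create (--without-pip only keeps this demo quick; omit it normally)
python3 -m venv --without-pip "$ENV_DIR"

# 2. Activate ("." is the POSIX spelling of "source")
. "$ENV_DIR/bin/activate"
echo "Active environment: $VIRTUAL_ENV"

# 3. Confirm that "python" now resolves inside the environment
ACTIVE_PYTHON="$(command -v python)"
echo "python resolves to: $ACTIVE_PYTHON"

# 4. Deactivate and delete
deactivate
rm -rf "$ENV_DIR"
```

Deleting the directory is all that is needed to remove an environment, because nothing is installed outside it.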
3. Installing Packages Inside the Environment¶
Once activated, packages install only into that environment.
3.1 Basic Installation¶
Install single package:
Install multiple packages:
3.2 Version-Specific Installation¶
Install specific version:
Install minimum version:
Install version range:
Install latest compatible with specifier:
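The four specifier forms listed above can be sketched in requirements-file syntax as follows. The file name `specifier-demo.txt`, the package names, and the version numbers are illustrative only.

```shell
# Write one example line per specifier form (illustrative packages and versions)
cat > specifier-demo.txt << 'EOF'
numpy==1.24.3       # exact version
pandas>=2.0         # minimum version
scipy>=1.10,<1.12   # version range
requests~=2.31.0    # compatible release: >=2.31.0 and <2.32
EOF

# With an environment active, install everything listed:
#   pip install -r specifier-demo.txt
# On the command line, quote specifiers so the shell does not treat
# ">" and "<" as redirections, e.g.: pip install "pandas>=2.0"
```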
3.3 Installing from Requirements File¶
3.4 Installing from Git Repository¶
Install specific branch or tag:
pip install git+https://github.com/user/repo.git@branch-name
pip install git+https://github.com/user/repo.git@v1.0.0
3.5 Installing in Editable Mode¶
For local development:
Editable Mode
Changes to source code are immediately reflected without reinstalling.
3.6 Upgrading Packages¶
Upgrade single package:
Upgrade all packages:
Upgrade pip itself:
Important
Do not use sudo pip install — all installations must stay inside your user environment.
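One way to guard against accidental installs outside an environment is pip's `PIP_REQUIRE_VIRTUALENV` setting, which makes pip refuse to install packages when no virtual environment is active. Placing it in `~/.bashrc` is a suggestion here, not a documented PERUN requirement.

```shell
# Suggested addition to ~/.bashrc:
# pip will stop with an error if no virtual environment is active
export PIP_REQUIRE_VIRTUALENV=true
```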
4. Managing Installed Packages¶
4.1 List Installed Packages¶
List all packages:
List in requirements format:
Show package details:
4.2 Search for Packages¶
Search PyPI:
Note
As of 2021, pip search is disabled on PyPI. Use pypi.org to search instead.
4.3 Uninstall Packages¶
Remove single package:
Remove multiple packages:
Uninstall without confirmation:
4.4 Check Package Dependencies¶
Show dependency tree:
4.5 Verify Installation Integrity¶
This checks for broken dependencies.
5. Working with Requirements Files¶
5.1 Creating Requirements Files¶
Simple freeze (all packages):
Only top-level packages:
Example requirements.in:
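A `requirements.in` file usually lists only the packages you depend on directly, leaving transitive dependencies to be resolved later. The package list below is illustrative, and the use of pip-tools (`pip-compile`) to produce the pinned file is an assumption about the intended workflow.

```shell
# Top-level dependencies only (illustrative package list)
cat > requirements.in << 'EOF'
numpy
pandas
matplotlib
EOF

# Assumed workflow with pip-tools, run inside an active environment:
#   pip install pip-tools
#   pip-compile requirements.in   # writes a fully pinned requirements.txt
```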
With version constraints:
cat > requirements.txt << EOF
numpy>=1.20,<1.25
pandas>=2.0
matplotlib>=3.5
scikit-learn>=1.3
tensorflow==2.13.0
EOF
5.2 Installing from Requirements¶
Install all packages:
5.3 Multiple Requirements Files¶
Example structure:
requirements-base.txt # Core dependencies
requirements-dev.txt # Development tools
requirements-test.txt # Testing dependencies
requirements-docs.txt # Documentation tools
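pip lets one requirements file include another with a `-r` line, so the development file can build on the base file instead of repeating it. The file contents below are illustrative.

```shell
# Core dependencies shared by every setup
cat > requirements-base.txt << 'EOF'
numpy==1.24.3
pandas==2.0.3
EOF

# Development tools, layered on top of the base file
cat > requirements-dev.txt << 'EOF'
-r requirements-base.txt
pytest
black
EOF

# Installing the dev file pulls in the base file as well:
#   pip install -r requirements-dev.txt
```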
5.4 Requirements File Best Practices¶
# requirements.txt
# Core scientific computing
numpy==1.24.3
pandas==2.0.3
scipy==1.11.1
# Machine learning
scikit-learn==1.3.0
xgboost==1.7.6
# Visualization
matplotlib==3.7.2
seaborn==0.12.2
# Utilities
tqdm==4.65.0
6. Using Virtual Environments in SLURM Batch Scripts¶
To use the environment in PERUN batch jobs:
6.1 Basic Job Script¶
#!/bin/bash
#SBATCH --job-name=python_job
#SBATCH --time=01:00:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=8G
# Activate virtual environment
source ~/envs/myenv/bin/activate
# Run Python script
python script.py
# Deactivate (optional, job ends anyway)
deactivate
6.2 With Error Handling¶
#!/bin/bash
#SBATCH --job-name=python_job
#SBATCH --output=job_%j.out
#SBATCH --error=job_%j.err
# Exit on error
set -e
# Activate virtual environment
if [ -f ~/envs/myenv/bin/activate ]; then
    source ~/envs/myenv/bin/activate
else
    echo "Error: Virtual environment not found!"
    exit 1
fi
# Verify Python version
python --version
# Run script; testing it inside "if" keeps set -e from aborting before the status is reported
if python script.py; then
    echo "Script completed successfully"
else
    status=$?
    echo "Script failed with error code $status"
    exit "$status"
fi
6.3 Array Jobs with Virtual Environment¶
#!/bin/bash
#SBATCH --job-name=array_job
#SBATCH --array=1-10
#SBATCH --time=01:00:00
source ~/envs/myenv/bin/activate
# Use SLURM_ARRAY_TASK_ID
python process.py --input data_${SLURM_ARRAY_TASK_ID}.txt
Common Error
Make sure the environment was created using the same Python version that you use to run your script.
Mixing versions may cause missing-module or compatibility errors.
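To check which Python version an environment was created with, you can read the `version` line that `venv` writes into `pyvenv.cfg`. The helper name `venv_created_with` below is hypothetical.

```shell
# Hypothetical helper: print the Python version a venv was created with.
# venv records it in pyvenv.cfg as a line such as "version = 3.11.4".
venv_created_with() {
    sed -n 's/^version *= *//p' "$1/pyvenv.cfg"
}

# Possible use inside a job script, after activation:
#   echo "venv built with: $(venv_created_with "$VIRTUAL_ENV")"
#   python --version
```

If the two versions printed in a job log differ, recreating the environment with the interpreter used in the job is usually the simplest fix.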
7. Advanced Package Installation¶
7.1 Installing from Wheels¶
7.2 Installing with Extras¶
Example:
7.3 Installing Pre-release Versions¶
7.4 Using Alternative Indexes¶
7.5 Installing from Local Directory¶
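7.6 No Cache Installation
Pass --no-cache-dir to pip install so that downloaded files are not kept in pip's cache.
Useful when disk space is limited.
7.7 User Installation (without venv)
If you cannot create a venv, pass --user to pip install.
Packages install to ~/.local/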
Not Recommended
User installations are not isolated. Prefer virtual environments.
8. Troubleshooting Virtual Environments¶
8.1 Activation Issues¶
Problem: Activation script not found
Solution: Verify environment exists:
Recreate if missing:
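The check-then-recreate step can be sketched as a small shell function. The name `ensure_venv` is hypothetical; any arguments after the path are passed through to `python3 -m venv`.

```shell
# Hypothetical helper: create the environment only if its activate script is missing.
# Extra arguments after the path are passed through to "python3 -m venv".
ensure_venv() {
    env_dir="$1"
    shift
    if [ -f "$env_dir/bin/activate" ]; then
        echo "Environment already exists: $env_dir"
    else
        echo "Creating environment: $env_dir"
        python3 -m venv "$@" "$env_dir"
    fi
}

# Example: ensure_venv ~/envs/myenv
```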
Problem: Wrong Python version
Solution: Deactivate and reactivate:
8.2 Package Installation Errors¶
Problem: Permission denied
Solution: Never use sudo. Ensure environment is activated:
# Check if environment is active
echo $VIRTUAL_ENV
# If empty, activate first
source ~/envs/myenv/bin/activate
Problem: Compilation errors
Solution: Install development packages (if available) or use binary wheels:
Problem: Disk quota exceeded
Solution: Clean pip cache:
Or install without cache:
8.3 Dependency Conflicts¶
Problem: Incompatible versions
Solution: Use constraint files:
Or create fresh environment:
python3 -m venv ~/envs/clean_env
source ~/envs/clean_env/bin/activate
pip install -r requirements.txt
8.4 Module Not Found Errors¶
Problem: Import fails despite installation
Solutions:
- Verify that the environment is activated.
- Check whether the package is installed.
- Install the package if it is missing.
- Check for typos in the import statement.
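The first two checks can be combined into a quick diagnostic. The helper name `check_import` is hypothetical; it tries the import with a given interpreter (default `python3`) and reports which interpreter was used, which often reveals that the wrong environment is active.

```shell
# Hypothetical helper: report whether a module imports under the given interpreter.
# Usage: check_import MODULE [PYTHON], where PYTHON defaults to "python3".
check_import() {
    py="${2:-python3}"
    if "$py" -c "import $1" 2>/dev/null; then
        echo "OK: $1 imports with $(command -v "$py")"
    else
        echo "MISSING: $1 does not import with $(command -v "$py")"
        return 1
    fi
}

# Show the active environment (if any), then test a standard-library module
echo "Active environment: ${VIRTUAL_ENV:-none}"
check_import json
```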
8.5 Corrupted Environment¶
Solution: Delete and recreate:
rm -rf ~/envs/myenv
python3 -m venv ~/envs/myenv
source ~/envs/myenv/bin/activate
pip install -r requirements.txt
Never modify system Python
Do not install Python packages globally on PERUN — use virtual environments only.
9. Using virtualenvwrapper (Optional)¶
virtualenvwrapper simplifies managing multiple Python virtual environments.
9.1 Installation¶
9.2 Setup¶
Add to your ~/.bashrc:
export WORKON_HOME=$HOME/envs
export VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3
source $HOME/.local/bin/virtualenvwrapper.sh
Reload your shell:
What is WORKON_HOME?
This directory stores all virtual environments created with virtualenvwrapper.
10. Basic virtualenvwrapper Commands¶
Create a new environment¶
Create with specific Python version:
List all environments¶
Activate environment¶
Deactivate¶
Remove an environment¶
Copy an environment¶
Show environment details¶
Quick Navigation
Use cdvirtualenv to jump to the active environment directory.
Use cdsitepackages to jump to site-packages directory.
11. Hook Scripts (Advanced)¶
Each environment includes hook scripts located in:
You can customize actions such as printing messages or loading configs.
Available Hooks¶
- preactivate: runs before activation
- postactivate: runs after activation
- predeactivate: runs before deactivation
- postdeactivate: runs after deactivation
- premkvirtualenv: runs before creating an environment
- postmkvirtualenv: runs after creating an environment
Example: Add activation messages¶
Example: Auto-load environment variables¶
# In postactivate
echo 'export DATABASE_URL=postgresql://localhost/mydb' >> $WORKON_HOME/myenv/bin/postactivate
echo 'export DEBUG=True' >> $WORKON_HOME/myenv/bin/postactivate
Example: Auto-change directory¶
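Changing into a project directory on activation can be done by appending a `cd` line to the environment's `postactivate` hook. The helper name `add_cd_hook` and the paths in the usage comment are hypothetical.

```shell
# Hypothetical helper: make "workon ENV" change into a project directory.
# Usage: add_cd_hook ENV_DIR PROJECT_DIR
add_cd_hook() {
    if [ ! -d "$1/bin" ]; then
        echo "Environment not found: $1" >&2
        return 1
    fi
    # postactivate is sourced by virtualenvwrapper after activation
    printf 'cd "%s"\n' "$2" >> "$1/bin/postactivate"
}

# Example, assuming WORKON_HOME is set as in section 9.2:
#   add_cd_hook "$WORKON_HOME/myenv" "$HOME/projects/myproject"
```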
12. Saving and Restoring Dependencies¶
Export dependencies:¶
Export with hashes (more secure):¶
Reinstall dependencies:¶
Install with hash verification:¶
Best Practice
Always store requirements.txt in your project folder for future reproducibility.
13. Virtual Environments vs Conda¶
| Feature | venv/virtualenv | Conda |
|---|---|---|
| Installation | Built into Python | Requires separate installation |
| Size | Small (~10-50 MB) | Large (~500 MB - several GB) |
| Creation Speed | Fast (seconds) | Slower (minutes) |
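| Python Versions | Uses an interpreter already installed on the system | Can install a chosen Python version into each env |
| Non-Python Packages | No | Yes (C, R, etc.) |
| Package Sources | PyPI by default (other indexes, Git, and local files via pip) | Conda channels, plus pip |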
| Dependency Resolution | Simple | Advanced solver |
| Best For | Python-only projects | Multi-language projects |
When to Use venv¶
- Pure Python projects
- Quick prototyping
- Limited disk space
- Simple dependencies
- Standard Python packages available on PyPI
When to Use Conda¶
- Need specific Python version per project
- Mix of Python and non-Python packages (CUDA, compilers)
- Complex scientific computing environments
- Bioinformatics workflows
- Need reproducibility across different systems
14. HPC-Specific Best Practices¶
14.1 Environment Location¶
Recommended locations:
# Home directory (default)
~/envs/myenv
# Project directory
/scratch/username/projects/myproject/venv
# Shared project space
/projects/groupname/envs/shared_env
14.2 Storage Management¶
Check environment size:
Clean pip cache:
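To see where disk space is going, a small helper can list every environment under a directory with its size. The name `env_sizes` is hypothetical, and the default of `~/envs` follows the location used in this guide.

```shell
# Hypothetical helper: list each environment under a directory with its size
# in kilobytes, smallest first. Defaults to ~/envs.
env_sizes() {
    envs_dir="${1:-$HOME/envs}"
    for d in "$envs_dir"/*/; do
        [ -d "$d" ] && du -sk "$d"
    done | sort -n
}

# Example: env_sizes            # or: env_sizes /projects/groupname/envs
```

With pip 20.1 or newer, `pip cache purge` clears pip's download cache, which is separate from the environments themselves.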
14.3 Module System Integration¶
Load specific Python version:
Include in job script:
#!/bin/bash
#SBATCH --job-name=python_job
# Load Python module
module load python/3.11
# Activate environment
source ~/envs/py311_env/bin/activate
# Run script
python script.py
14.4 Shared Environments¶
Create read-only shared environment:
# As group admin
python3 -m venv /projects/groupname/envs/shared_env
source /projects/groupname/envs/shared_env/bin/activate
pip install numpy pandas scipy matplotlib
chmod -R a-w /projects/groupname/envs/shared_env
Users activate read-only:
14.5 Reproducibility Script¶
Create environment setup script:
#!/bin/bash
# setup_env.sh
# Stop at the first failing step so the success message is printed only on success
set -e
ENV_NAME="myproject"
ENV_PATH="$HOME/envs/$ENV_NAME"
# Create environment
python3 -m venv "$ENV_PATH"
# Activate
source "$ENV_PATH/bin/activate"
# Upgrade pip
pip install --upgrade pip
# Install dependencies
pip install -r requirements.txt
echo "Environment $ENV_NAME created successfully!"
echo "To activate: source $ENV_PATH/bin/activate"
15. Common Package Installation Examples¶
Scientific Computing¶
python3 -m venv ~/envs/science
source ~/envs/science/bin/activate
pip install numpy scipy matplotlib pandas jupyter scikit-learn
Machine Learning¶
python3 -m venv ~/envs/ml
source ~/envs/ml/bin/activate
pip install scikit-learn xgboost lightgbm pandas numpy matplotlib seaborn jupyter
Deep Learning (PyTorch)¶
python3 -m venv ~/envs/pytorch
source ~/envs/pytorch/bin/activate
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install transformers datasets accelerate
Deep Learning (TensorFlow)¶
python3 -m venv ~/envs/tensorflow
source ~/envs/tensorflow/bin/activate
pip install "tensorflow[and-cuda]"
pip install keras tensorboard
Data Science¶
python3 -m venv ~/envs/datascience
source ~/envs/datascience/bin/activate
pip install pandas numpy scipy matplotlib seaborn plotly jupyter jupyterlab
pip install scikit-learn statsmodels
Web Scraping¶
python3 -m venv ~/envs/scraping
source ~/envs/scraping/bin/activate
pip install requests beautifulsoup4 selenium lxml scrapy
Natural Language Processing¶
python3 -m venv ~/envs/nlp
source ~/envs/nlp/bin/activate
pip install transformers datasets tokenizers spacy nltk gensim
python -m spacy download en_core_web_sm
Computer Vision¶
python3 -m venv ~/envs/cv
source ~/envs/cv/bin/activate
pip install opencv-python pillow scikit-image matplotlib
16. Automation and Scripts¶
16.1 Automatic Activation Script¶
Create activate_env.sh:
#!/bin/bash
# Activate the project environment; run with "source activate_env.sh" so it affects the current shell
VENV_PATH="$HOME/projects/myproject/venv"
if [ -f "$VENV_PATH/bin/activate" ]; then
source "$VENV_PATH/bin/activate"
echo "Virtual environment activated: $VIRTUAL_ENV"
else
echo "Warning: Virtual environment not found at $VENV_PATH"
fi
16.2 Environment Status Check¶
#!/bin/bash
# check_env.sh - Verify environment setup
if [ -z "$VIRTUAL_ENV" ]; then
echo "❌ No virtual environment active"
exit 1
else
echo "✓ Virtual environment: $VIRTUAL_ENV"
fi
echo "✓ Python: $(python --version)"
echo "✓ Pip: $(pip --version)"
echo "✓ Installed packages:"
pip list --format=freeze | head -n 5
16.3 Bulk Environment Creation¶
#!/bin/bash
# create_multiple_envs.sh
PROJECTS=("project1" "project2" "project3")
for PROJECT in "${PROJECTS[@]}"; do
    echo "Creating environment for $PROJECT..."
    python3 -m venv "$HOME/envs/$PROJECT"
    source "$HOME/envs/$PROJECT/bin/activate"
    pip install --upgrade pip
    # Use $HOME here: a quoted "~" is not expanded by the shell
    if [ -f "$HOME/projects/$PROJECT/requirements.txt" ]; then
        pip install -r "$HOME/projects/$PROJECT/requirements.txt"
    fi
    deactivate
done
17. Tips and Best Practices¶
✓ Do's¶
- Create one environment per project
- Always activate before installing packages
- Keep requirements.txt updated
- Use version pinning for reproducibility
- Document Python version used
- Clean pip cache regularly
- Test environment recreation from requirements.txt
✗ Don'ts¶
- Don't install packages globally with sudo pip
- Don't commit venv directory to git
- Don't mix pip and system package managers
- Don't use --system-site-packages unless necessary
- Don't share environments between incompatible Python versions
- Don't ignore pip warnings about conflicts
Version Control¶
Add to .gitignore:
# Virtual environments
venv/
env/
ENV/
.venv/
# Python cache
__pycache__/
*.pyc
*.pyo
*.pyd
# Pip
pip-log.txt
pip-delete-this-directory.txt
Commit to git:
18. Quick Reference¶
Common Commands¶
# Create environment
python3 -m venv ~/envs/myenv
# Activate
source ~/envs/myenv/bin/activate
# Install packages
pip install package-name
# Save dependencies
pip freeze > requirements.txt
# Install from requirements
pip install -r requirements.txt
# List packages
pip list
# Upgrade package
pip install --upgrade package-name
# Uninstall package
pip uninstall package-name
# Deactivate
deactivate
# Delete environment
rm -rf ~/envs/myenv
virtualenvwrapper Commands¶
# Create
mkvirtualenv myenv
# Activate
workon myenv
# List all
lsvirtualenv
# Remove
rmvirtualenv myenv
# Copy
cpvirtualenv old_env new_env
19. Summary¶
- Virtual environments provide isolated Python package installations
- Use venv for most Python projects on PERUN
- Always activate environment before installing packages
- Use requirements.txt for reproducibility
- Integrate environments into SLURM job scripts
- Clean pip cache if disk space is limited
- Consider virtualenvwrapper for managing multiple environments
- Choose between venv and Conda based on project needs
Getting Started Template
# 1. Create environment
python3 -m venv ~/envs/myproject
# 2. Activate
source ~/envs/myproject/bin/activate
# 3. Upgrade pip
pip install --upgrade pip
# 4. Install packages
pip install numpy pandas matplotlib
# 5. Save dependencies
pip freeze > requirements.txt
# 6. Work on project
python script.py
# 7. Deactivate when done
deactivate
Additional Resources¶
- Python venv Documentation
- pip Documentation
- virtualenvwrapper Documentation
- Python Packaging Guide
- PyPI - Python Package Index