Configuration#
Configuration Overview#
Bio_pype uses a flexible configuration system that allows you to:
- Customize module locations
- Set system-specific parameters
- Define execution environments
- Manage resource limits
Module Paths#
By default, Bio_pype modules (snippets, pipelines, profiles, and queues) are installed in Python’s site-packages directory. However, you can customize these locations to:
- Make modules easier to edit and maintain
- Switch between different module sets (e.g., stable vs. development)
- Share modules across users or projects
Configuration Methods#
1. Local Configuration File#
The primary configuration file is located at ~/.bio_pype/config. Example:
PYPE_TMP=/tmp
PYPE_LOGDIR=/tmp/logs
PYPE_SNIPPETS=~/bio_pype/snippets
PYPE_PIPELINES=~/bio_pype/pipelines
2. Environment Variables#
Environment variables override settings in the configuration file:
export PYPE_SNIPPETS=/custom/path/snippets
export PYPE_MEM="32GB"
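The precedence rule above (environment variables win over the configuration file) can be sketched in a few lines of Python. This is an illustration only, not Bio_pype's actual loader: the function names `parse_config` and `resolve_settings` are hypothetical, the file is assumed to contain plain `KEY=VALUE` lines as in the example, and the handling of `#` comment lines is an assumption.

```python
import os


def parse_config(text):
    """Parse KEY=VALUE lines; blank lines and '#' comments are skipped (assumed)."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding double quotes, e.g. PYPE_MEM="32GB"
        settings[key.strip()] = value.strip().strip('"')
    return settings


def resolve_settings(file_settings, environ=None):
    """Overlay PYPE_* environment variables on top of the file settings."""
    environ = os.environ if environ is None else environ
    merged = dict(file_settings)
    for key, value in environ.items():
        if key.startswith("PYPE_"):
            merged[key] = value  # the environment value overrides the file value
    return merged


if __name__ == "__main__":
    file_text = "PYPE_TMP=/tmp\nPYPE_SNIPPETS=~/bio_pype/snippets\n"
    env = {"PYPE_SNIPPETS": "/custom/path/snippets"}
    print(resolve_settings(parse_config(file_text), env))
```

With the inputs shown, `PYPE_SNIPPETS` takes the environment value while `PYPE_TMP` keeps the value from the file.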
Configuration Variables#
| Variable | Description |
|---|---|
| PYPE_MODULES | Base path for all module types; sets the snippets, pipelines, profiles, and queues subdirectories at once |
| PYPE_SNIPPETS | Path to snippet modules (overridden by PYPE_MODULES) |
| PYPE_PROFILES | Path to profile configurations (overridden by PYPE_MODULES) |
| PYPE_PIPELINES | Path to pipeline definitions (overridden by PYPE_MODULES) |
| PYPE_QUEUES | Path to queue system adapters (overridden by PYPE_MODULES) |
| PYPE_HOME | Base directory for Bio_pype state: config file, logs, caches (default: ~/.bio_pype) |
| PYPE_REGISTRY | Registry git URL or local path (default: https://codeberg.org/bio-pype/workflows-registry.git) |
| PYPE_NCPU | Maximum CPUs for parallel local execution |
| PYPE_GPU | Number of GPUs available for local execution |
| PYPE_NPU | Number of NPUs available for local execution |
| PYPE_MAX_JOBS_IN_QUEUE | Maximum number of jobs to keep in the queue at once |
| PYPE_MEM | Maximum memory for local execution |
| PYPE_TMP | Temporary directory (available as %(pype_tmp)s in snippets; default: /tmp) |
| PYPE_LOGDIR | Log file directory (default: ~/.bio_pype/logs) |
| PYPE_DOCKER | Container runtime executable (default: docker) |
| PYPE_CONDA | Conda executable path (default: conda) |
| PYPE_SINGULARITY_CACHE | Singularity image cache directory (default: current working directory) |
| PYPE_PULL_TIMEOUT | Timeout in seconds for container/conda pulls during profile build (default: 3600) |
| PYPE_EDITOR | Editor for |
| PYPE_MONITOR_INTERVAL | Resource-monitor sampling interval in seconds (default: 1.0) |
| PYPE_MONITOR_FLUSH_INTERVAL | Resource-monitor sample flush interval in seconds (default: 5.0) |
| PYPE_QUEUE_POLL_INTERVAL | Seconds between queue poll cycles (default: 10) |
| PYPE_QUEUE_ACCOUNT | Default account/project submitted to the queue scheduler |
| PYPE_QUEUE_PARTITION | Default partition/queue name for job submission |
| PYPE_QUEUE_PARTITIONS_CONFIG | Path to a partition configuration file used for resource matching |
| COMPUTE_BIO_API_URL | compute.bio API endpoint for web monitoring (default: https://app.compute.bio/api/v1) |
| COMPUTE_BIO_TOKEN | API authentication token for compute.bio (optional; leave unset to disable API integration) |
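The relationship between PYPE_MODULES and the four per-type variables can be illustrated with a short Python sketch. It assumes, based only on the descriptions in the table, that PYPE_MODULES takes precedence when set and that the subdirectories are named `snippets`, `pipelines`, `profiles`, and `queues`. The helper `module_paths` is hypothetical and is not part of Bio_pype.

```python
import os

# Module types named in the table; the subdirectory names are assumed to match.
MODULE_TYPES = ("snippets", "pipelines", "profiles", "queues")


def module_paths(settings):
    """Return a {module_type: path} mapping derived from PYPE_* settings.

    If PYPE_MODULES is set, every module type resolves to a subdirectory of it.
    Otherwise each type falls back to its own variable (or None if unset).
    """
    base = settings.get("PYPE_MODULES")
    paths = {}
    for kind in MODULE_TYPES:
        if base:
            paths[kind] = os.path.join(os.path.expanduser(base), kind)
        else:
            specific = settings.get("PYPE_" + kind.upper())
            paths[kind] = os.path.expanduser(specific) if specific else None
    return paths


if __name__ == "__main__":
    print(module_paths({"PYPE_MODULES": "/opt/pype/modules"}))
    print(module_paths({"PYPE_SNIPPETS": "/custom/path/snippets"}))
```

When a type-specific variable is unset and PYPE_MODULES is absent, the sketch returns `None` for that type; Bio_pype itself falls back to the installed defaults in site-packages, as described under Module Paths.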
Storage and execution mode#
These variables select the storage backend that moves data in and out of a snippet execution. See Storage Backends for full details.
| Variable | Description |
|---|---|
| PYPE_OVERLAY_MODE | Storage backend: |
| PYPE_OVERLAY_SCRATCH | Scratch directory for the |
| PYPE_SNAPSHOT_REGISTRY | Path to the snapshot→path registry JSON used by cloud backends |
Energy and carbon tracking#
These variables enable and tune CO2eq/energy estimation. See Energy and Carbon Tracking for the full guide.
| Variable | Description |
|---|---|
| PYPE_CARBON_COUNTRY | Electricity region/zone (e.g. DK, DE, FR); setting it enables tracking |
| PYPE_CO2EQ_SRC | Carbon-intensity provider: |
| ENTSOE_API_KEY | API token for the |
| ELECTRICITY_MAPS_API_KEY | API token for the |
| PYPE_CARBON_FALLBACK_G_PER_KWH | Static carbon intensity used when no provider value is available (default: 300.0) |
| PYPE_CARBON_CPU_TDP_W | CPU TDP for the power model when idle/loaded watts are unknown (default: 100.0) |
| PYPE_POWER_IDLE_W | Measured node idle power in watts (optional) |
| PYPE_POWER_LOADED_W | Measured node full-load power in watts (optional) |
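To give a feel for how these values could combine, the sketch below converts a power draw into a CO2eq estimate. It is a hedged illustration, not Bio_pype's actual model: this page does not describe the power model, so a simple linear interpolation between idle and full-load power is assumed, and both function names are hypothetical. The only values taken from the table are the units (watts, g/kWh) and the 300.0 g/kWh fallback default.

```python
def estimate_power_w(cpu_utilization, idle_w, loaded_w):
    """Assumed linear model: interpolate between idle and full-load power."""
    utilization = min(max(cpu_utilization, 0.0), 1.0)  # clamp to [0, 1]
    return idle_w + (loaded_w - idle_w) * utilization


def co2eq_grams(power_w, seconds, intensity_g_per_kwh=300.0):
    """Energy (kWh) multiplied by carbon intensity (g CO2eq per kWh)."""
    energy_kwh = power_w * seconds / 3_600_000.0  # W*s -> kWh
    return energy_kwh * intensity_g_per_kwh


if __name__ == "__main__":
    # Example inputs: a node measured at 100 W idle and 300 W loaded,
    # running at 50% utilization for one hour.
    power = estimate_power_w(0.5, idle_w=100.0, loaded_w=300.0)
    print(power, co2eq_grams(power, seconds=3600))
```

The conversion from energy to emissions is plain arithmetic; the power model is the part most likely to differ from what Bio_pype does, so see Energy and Carbon Tracking for the authoritative description.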
Compute.bio API Integration#
Bio_pype can optionally integrate with compute.bio for web-based pipeline monitoring and control.
Setup#
To enable compute.bio integration:
- Obtain an API token from your compute.bio account
- Add the configuration to ~/.bio_pype/config:
COMPUTE_BIO_API_URL=https://app.compute.bio/api/v1
COMPUTE_BIO_TOKEN=your_api_token_here
Or set environment variables:
export COMPUTE_BIO_API_URL=https://app.compute.bio/api/v1
export COMPUTE_BIO_TOKEN=your_api_token_here
Features#
When configured, Bio_pype automatically:
- Registers workers: each pipeline run registers with a unique worker ID
- Sends progress updates: pipeline status is sent every 30 seconds by default
- Tracks jobs: job status, queue IDs, and timestamps are synced to the API
- Supports log streaming: logs can be viewed in real time through the web interface
- Receives commands: the API can request logs and other information
Configuration Options#
Fine-tune API integration with environment variables:
| Variable | Description |
|---|---|
| PYPE_API_PROGRESS_INTERVAL | Seconds between progress updates (default: 30) |
| PYPE_API_COMMAND_INTERVAL | Seconds between command polls (default: 60) |
Example:
export PYPE_API_PROGRESS_INTERVAL=60
export PYPE_API_COMMAND_INTERVAL=120
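For readers scripting around these settings, the following Python sketch reads the two intervals with the documented defaults. The helper `read_interval` is hypothetical, and falling back to the default for missing, non-numeric, or non-positive values is an assumption rather than documented Bio_pype behavior.

```python
import os

# Defaults taken from the table above.
DEFAULTS = {
    "PYPE_API_PROGRESS_INTERVAL": 30.0,
    "PYPE_API_COMMAND_INTERVAL": 60.0,
}


def read_interval(name, environ=None):
    """Return the interval in seconds, falling back to the documented default."""
    environ = os.environ if environ is None else environ
    raw = environ.get(name, "")
    try:
        value = float(raw)
    except ValueError:
        return DEFAULTS[name]  # unset or not a number (assumed fallback)
    return value if value > 0 else DEFAULTS[name]


if __name__ == "__main__":
    env = {"PYPE_API_PROGRESS_INTERVAL": "60"}
    print(read_interval("PYPE_API_PROGRESS_INTERVAL", env))
    print(read_interval("PYPE_API_COMMAND_INTERVAL", env))
```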
Verification#
To verify API integration is working:
- Run a pipeline with the API configured
- Check the logs for the registration messages:
INFO: Worker registered: worker-hostname-abc123 (Pipeline ID: 12345)
INFO: Started progress watcher (updates every 30s)
INFO: Started command watcher (polls every 60s)
- View pipeline progress in the compute.bio web interface
Disabling API Integration#
API integration is disabled by default. To make sure it stays disabled:
- Do not set COMPUTE_BIO_API_URL or COMPUTE_BIO_TOKEN
- Or remove them from ~/.bio_pype/config
If API credentials are not configured, pipelines run normally without web monitoring:
WARNING: compute.bio API not configured. Pipeline progress will not be sent to API.
CLI Commands#
Bio_pype provides CLI commands for testing and running the compute.bio integration.
Test API connectivity:
pype compute_bio --test
Tests the API connection without creating any records. Verifies that
COMPUTE_BIO_API_URL and COMPUTE_BIO_TOKEN are set correctly.
Run listener daemon:
pype compute_bio --run
# With custom log directory
pype compute_bio --run --log /path/to/logs
Starts a persistent daemon that monitors compute.bio for commands sent to pipelines running on this machine. The daemon:
- Polls for commands every 10 seconds
- Processes commands for inactive workers (pipelines that finished or crashed)
- Handles log requests, job cancellation, and other remote commands
- Uses a lock file to prevent multiple daemons on the same machine
Daemon usage notes:
- Run in the background:
nohup pype compute_bio --run &
- Only one daemon should run per machine
- The daemon handles commands for all pipelines registered from this host
- Press Ctrl+C to stop the daemon gracefully
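The single-daemon guarantee mentioned above rests on a lock file. The Python sketch below shows one common way to implement such a guard with an atomic exclusive create. It is an assumption-laden illustration, not the daemon's actual code: the helper names and the lock file location are hypothetical, and how Bio_pype handles stale locks is not described here.

```python
import os
import tempfile


def acquire_lock(lock_path):
    """Atomically create the lock file; return False if it already exists."""
    try:
        # O_EXCL makes the create fail if another process created the file first.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))  # record the owner for debugging
    return True


def release_lock(lock_path):
    """Remove the lock file; a missing file is not an error."""
    try:
        os.remove(lock_path)
    except FileNotFoundError:
        pass


if __name__ == "__main__":
    # Hypothetical lock location inside a throwaway directory.
    lock = os.path.join(tempfile.mkdtemp(), "daemon.lock")
    print(acquire_lock(lock))  # first caller gets the lock
    print(acquire_lock(lock))  # second caller is refused
    release_lock(lock)
```

A guard of this kind leaves a stale file behind if the process is killed before it can clean up, so a production daemon typically also checks whether the recorded PID is still alive.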