.. index:: Configuration

.. _configuration:

Configuration
=============

Configuration Overview
----------------------

Bio_pype uses a flexible configuration system that allows you to:

- Customize module locations
- Set system-specific parameters
- Define execution environments
- Manage resource limits

Module Paths
------------

By default, Bio_pype modules (snippets, pipelines, profiles, and queues) are
installed in Python's site-packages directory. However, you can customize
these locations to:

- Make modules easier to edit and maintain
- Switch between different module sets (e.g., stable vs. development)
- Share modules across users or projects

Configuration Methods
---------------------

1. Local Configuration File
^^^^^^^^^^^^^^^^^^^^^^^^^^^

The primary configuration file is located at ``~/.bio_pype/config``.

Example::

    PYPE_TMP=/tmp
    PYPE_LOGDIR=/tmp/logs
    PYPE_SNIPPETS=~/bio_pype/snippets
    PYPE_PIPELINES=~/bio_pype/pipelines

2. Environment Variables
^^^^^^^^^^^^^^^^^^^^^^^^

Environment variables override settings in the configuration file::

    export PYPE_SNIPPETS=/custom/path/snippets
    export PYPE_MEM="32GB"

Configuration Variables
-----------------------

.. list-table::
   :widths: 33 66
   :header-rows: 1

   * - Variable
     - Description
   * - PYPE_MODULES
     - Base path for all module types; sets the snippets, pipelines,
       profiles, and queues subdirectories at once
   * - PYPE_SNIPPETS
     - Path to snippet modules (overridden by PYPE_MODULES)
   * - PYPE_PROFILES
     - Path to profile configurations (overridden by PYPE_MODULES)
   * - PYPE_PIPELINES
     - Path to pipeline definitions (overridden by PYPE_MODULES)
   * - PYPE_QUEUES
     - Path to queue system adapters (overridden by PYPE_MODULES)
   * - PYPE_HOME
     - Base directory for Bio_pype state: config file, logs, caches
       (default: ~/.bio_pype)
   * - PYPE_REGISTRY
     - Registry git URL or local path
       (default: https://codeberg.org/bio-pype/workflows-registry.git)
   * - PYPE_NCPU
     - Maximum CPUs for parallel local execution
   * - PYPE_GPU
     - Number of GPUs available for local execution
   * - PYPE_NPU
     - Number of NPUs available for local execution
   * - PYPE_MAX_JOBS_IN_QUEUE
     - Maximum number of jobs to keep in the queue at once
   * - PYPE_MEM
     - Maximum memory for local execution
   * - PYPE_TMP
     - Temporary directory (available as %(pype_tmp)s in snippets;
       default: /tmp)
   * - PYPE_LOGDIR
     - Log file directory (default: ~/.bio_pype/logs)
   * - PYPE_DOCKER
     - Container runtime executable (default: docker)
   * - PYPE_CONDA
     - Conda executable path (default: conda)
   * - PYPE_SINGULARITY_CACHE
     - Singularity image cache directory (default: current working directory)
   * - PYPE_PULL_TIMEOUT
     - Timeout in seconds for container/conda pulls during profile build
       (default: 3600)
   * - PYPE_EDITOR
     - Editor for ``pype profiles edit`` (default: $EDITOR or vi)
   * - PYPE_MONITOR_INTERVAL
     - Resource-monitor sampling interval in seconds (default: 1.0)
   * - PYPE_MONITOR_FLUSH_INTERVAL
     - Resource-monitor sample flush interval in seconds (default: 5.0)
   * - PYPE_QUEUE_POLL_INTERVAL
     - Seconds between queue poll cycles (default: 10)
   * - PYPE_QUEUE_ACCOUNT
     - Default account/project submitted to the queue scheduler
   * - PYPE_QUEUE_PARTITION
     - Default partition/queue name for job submission
   * - PYPE_QUEUE_PARTITIONS_CONFIG
     - Path to a partition configuration file used for resource matching
   * - COMPUTE_BIO_API_URL
     - compute.bio API endpoint for web monitoring
       (default: https://app.compute.bio/api/v1)
   * - COMPUTE_BIO_TOKEN
     - API authentication token for compute.bio (optional; leave unset to
       disable API integration)

Storage and execution mode
^^^^^^^^^^^^^^^^^^^^^^^^^^

These variables select the storage backend that moves data in and out of a
snippet execution. See :ref:`storage_backends` for full details.

.. list-table::
   :widths: 33 66
   :header-rows: 1

   * - Variable
     - Description
   * - PYPE_OVERLAY_MODE
     - Storage backend: ``direct`` (default), ``overlay``, or a queue-module
       name (e.g. ``scaleway``)
   * - PYPE_OVERLAY_SCRATCH
     - Scratch directory for the ``overlay`` backend
       (default: /tmp/pype-overlay)
   * - PYPE_SNAPSHOT_REGISTRY
     - Path to the snapshot→path registry JSON used by cloud backends

Energy and carbon tracking
^^^^^^^^^^^^^^^^^^^^^^^^^^

These variables enable and tune CO2eq/energy estimation. See
:ref:`carbon_tracking` for the full guide.

.. list-table::
   :widths: 33 66
   :header-rows: 1

   * - Variable
     - Description
   * - PYPE_CARBON_COUNTRY
     - Electricity region/zone (e.g. DK, DE, FR); setting it enables tracking
   * - PYPE_CO2EQ_SRC
     - Carbon-intensity provider: ``entsoe``, ``electricitymaps`` or
       ``compute_bio``
   * - ENTSOE_API_KEY
     - API token for the ``entsoe`` provider
   * - ELECTRICITY_MAPS_API_KEY
     - API token for the ``electricitymaps`` provider
   * - PYPE_CARBON_FALLBACK_G_PER_KWH
     - Static carbon intensity used when no provider value is available
       (default: 300.0)
   * - PYPE_CARBON_CPU_TDP_W
     - CPU TDP for the power model when idle/loaded watts are unknown
       (default: 100.0)
   * - PYPE_POWER_IDLE_W
     - Measured node idle power in watts (optional)
   * - PYPE_POWER_LOADED_W
     - Measured node full-load power in watts (optional)

Compute.bio API Integration
---------------------------

Bio_pype can optionally integrate with compute.bio for web-based pipeline
monitoring and control.

Setup
^^^^^

To enable compute.bio integration:

1. Obtain an API token from your compute.bio account
2. Add configuration to ``~/.bio_pype/config``::

      COMPUTE_BIO_API_URL=https://app.compute.bio/api/v1
      COMPUTE_BIO_TOKEN=your_api_token_here

   Or set environment variables::

      export COMPUTE_BIO_API_URL=https://app.compute.bio/api/v1
      export COMPUTE_BIO_TOKEN=your_api_token_here

Features
^^^^^^^^

When configured, Bio_pype automatically:

- **Registers workers**: Each pipeline run registers with a unique worker ID
- **Sends progress updates**: Pipeline status sent every 30 seconds (default)
- **Tracks jobs**: Job status, queue IDs, and timestamps synced to API
- **Supports log streaming**: Real-time log viewing through web interface
- **Receives commands**: API can request logs and other information

Configuration Options
^^^^^^^^^^^^^^^^^^^^^

Fine-tune API integration with environment variables:

.. list-table::
   :widths: 40 60
   :header-rows: 1

   * - Variable
     - Description
   * - PYPE_API_PROGRESS_INTERVAL
     - Seconds between progress updates (default: 30)
   * - PYPE_API_COMMAND_INTERVAL
     - Seconds between command polls (default: 60)

Example::

    export PYPE_API_PROGRESS_INTERVAL=60
    export PYPE_API_COMMAND_INTERVAL=120

Verification
^^^^^^^^^^^^

To verify API integration is working:

1. Run a pipeline with API configured
2. Check logs for registration message::

      INFO: Worker registered: worker-hostname-abc123 (Pipeline ID: 12345)
      INFO: Started progress watcher (updates every 30s)
      INFO: Started command watcher (polls every 60s)

3. View pipeline progress on compute.bio web interface

Disabling API Integration
^^^^^^^^^^^^^^^^^^^^^^^^^

API integration is disabled by default. If you want to ensure it's disabled:

- Don't set ``COMPUTE_BIO_API_URL`` or ``COMPUTE_BIO_TOKEN``
- Or remove them from ``~/.bio_pype/config``

If API credentials are not configured, pipelines run normally without web
monitoring::

    WARNING: compute.bio API not configured. Pipeline progress will not be sent to API.

CLI Commands
^^^^^^^^^^^^

Bio_pype provides CLI commands for testing and running the compute.bio
integration.

**Test API connectivity**::

    pype compute_bio --test

Tests the API connection without creating any records. Verifies that
``COMPUTE_BIO_API_URL`` and ``COMPUTE_BIO_TOKEN`` are set correctly.

**Run listener daemon**::

    pype compute_bio --run

    # With custom log directory
    pype compute_bio --run --log /path/to/logs

Starts a persistent daemon that monitors compute.bio for commands sent to
pipelines running on this machine. The daemon:

- Polls for commands every 10 seconds
- Processes commands for inactive workers (pipelines that finished or crashed)
- Handles log requests, job cancellation, and other remote commands
- Uses a lock file to prevent multiple daemons on the same machine

**Daemon usage notes:**

- Run in the background: ``nohup pype compute_bio --run &``
- Only one daemon should run per machine
- The daemon handles commands for all pipelines registered from this host
- Press Ctrl+C to stop the daemon gracefully
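
The precedence rule from the Configuration Methods section (environment
variables override the configuration file) can be sketched in Python. This is
a minimal illustration under stated assumptions, not Bio_pype's actual loader:
the names ``parse_config`` and ``resolve_settings`` are hypothetical, the file
is assumed to hold plain ``KEY=VALUE`` lines like the example config, and the
handling of ``#`` comments and the prefix filter are guesses made for the
sketch.

.. code-block:: python

   import os


   def parse_config(text):
       """Parse KEY=VALUE lines; skip blanks, '#' comments, and lines without '='."""
       settings = {}
       for raw in text.splitlines():
           line = raw.strip()
           if not line or line.startswith("#") or "=" not in line:
               continue
           key, _, value = line.partition("=")
           # Drop optional surrounding double quotes, e.g. PYPE_MEM="32GB".
           settings[key.strip()] = value.strip().strip('"')
       return settings


   def resolve_settings(file_text, environ=None, prefixes=("PYPE_", "COMPUTE_BIO_")):
       """Start from the file settings, then let matching environment variables win."""
       environ = os.environ if environ is None else environ
       settings = parse_config(file_text)
       for key, value in environ.items():
           # Hypothetical filter: only variables with these prefixes are considered.
           if key.startswith(prefixes):
               settings[key] = value
       return settings


   config_text = """
   PYPE_TMP=/tmp
   PYPE_SNIPPETS=~/bio_pype/snippets
   """
   env = {
       "PYPE_SNIPPETS": "/custom/path/snippets",
       "PYPE_MEM": "32GB",
       "HOME": "/home/user",
   }
   resolved = resolve_settings(config_text, environ=env)
   print(resolved["PYPE_SNIPPETS"])  # /custom/path/snippets (environment wins)
   print(resolved["PYPE_TMP"])       # /tmp (taken from the file)

Passing ``environ`` explicitly keeps the sketch testable without touching the
real process environment; omitting it falls back to ``os.environ``.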