Understanding Bio_pype Logs

Log Structure

Bio_pype organizes logs hierarchically in PYPE_LOGDIR (default: ~/.bio_pype/logs):

logs/
├── 231205143022_pipeline_name/    # Unique timestamp + pipeline name
│   ├── pipeline.log               # Main pipeline log
│   └── jobs/                      # Individual job logs
│       ├── abc123_step1_slurm/    # Job ID + step name + queue
│       │   ├── job.log           # Job-specific log
│       │   └── queue.log         # Queue system output
│       └── def456_step2_slurm/
│           ├── job.log
│           └── queue.log
└── 231205143159_another_run/
    └── ...
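
The log root can be redirected before launching a run; a minimal sketch, assuming PYPE_LOGDIR is read from the environment and using an illustrative path:

# point bio_pype at a different log directory (illustrative path; assumes PYPE_LOGDIR is read from the environment)
export PYPE_LOGDIR=/scratch/$USER/bio_pype_logs
mkdir -p "$PYPE_LOGDIR"
pype pipelines my_pipeline ...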

Log File Types

pipeline.log

Contains overall pipeline execution information:

- Arguments and configuration used
- Step execution order
- Dependencies between steps
- Resource allocations
- Final status

Example pipeline.log:

2023-12-05 14:30:22 INFO: Starting pipeline rev_compl_low_fa
2023-12-05 14:30:22 INFO: Using profile: local
2023-12-05 14:30:23 INFO: Submitting step reverse_fa to queue slurm
2023-12-05 14:35:45 INFO: Step reverse_fa completed successfully
...

job.log

Contains snippet-specific information:

- Input validation
- Command execution
- Output generation
- Resource usage

Example job.log:

2023-12-05 14:30:23 INFO: Processing input file: sample1.fa
2023-12-05 14:30:23 INFO: Command: reverse_fa -i sample1.fa -o reversed.fa
2023-12-05 14:30:24 INFO: Generated output: reversed.fa
2023-12-05 14:30:24 INFO: Peak memory usage: 2.1GB
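
Because job directories are named with the job ID, step name and queue (as in the layout above), a particular step's logs can be located by pattern; a sketch, assuming the naming shown there and the reverse_fa step from the pipeline.log example:

# list job log directories for a given step (reverse_fa here) across all runs
find ~/.bio_pype/logs -type d -name "*_reverse_fa_*"
# then check that step's logs for errors
grep "ERROR" ~/.bio_pype/logs/*/jobs/*_reverse_fa_*/job.log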

queue.log

Contains queue system output:

- Job submission details
- Resource allocation
- Error messages
- Exit codes

Example queue.log (SLURM):

Submitted batch job 123456
slurmstepd: Job 123456 started on node034
...
slurmstepd: Job 123456 completed with exit code 0
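
When the queue is SLURM, queue.log can be cross-checked against the scheduler's own accounting using the job ID it reports; this is standard SLURM tooling, not part of bio_pype:

# summarise state, exit code, runtime and memory for the job ID found in queue.log
sacct -j 123456 --format=JobID,JobName,State,ExitCode,Elapsed,MaxRSS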

Useful Log Commands

Find recent pipeline runs:

ls -ltr ~/.bio_pype/logs/

Monitor active pipeline:

tail -f ~/.bio_pype/logs/latest/pipeline.log

Check failed jobs:

grep -r "ERROR" ~/.bio_pype/logs/*/jobs/*/job.log

Get resource usage:

grep "memory usage" ~/.bio_pype/logs/*/jobs/*/job.log

Debug Tips

  1. Start with pipeline.log for overall status (a shell walk-through of steps 1-3 follows this list)

  2. Check job.log for specific step failures

  3. See queue.log for system-level issues

  4. Use PYPE_DEBUG=1 for verbose logging:

    export PYPE_DEBUG=1
    pype pipelines my_pipeline ...
    
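A minimal walk-through of steps 1-3 on the most recent run; a sketch that assumes the directory layout described earlier:

run=$(ls -td ~/.bio_pype/logs/*/ | head -n 1)
# 1. overall pipeline status
tail -n 20 "${run}pipeline.log"
# 2. step-level failures
grep -l "ERROR" ${run}jobs/*/job.log
# 3. queue-level issues for those jobs
grep -i -E "error|exit code" ${run}jobs/*/queue.log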

Common Issues

Memory Errors:

2023-12-05 14:35:45 ERROR: MemoryError in step bwa_mem
2023-12-05 14:35:45 INFO: Peak memory: 32GB, Allocated: 16GB

Solution: Increase the memory allocation in the snippet's requirements or in the profile used for the run
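
Before raising the allocation, the affected steps and runs can be located across all logs; a sketch assuming error messages like the example above:

# find MemoryError reports and show the memory line that follows each one
grep -r -A 1 "MemoryError" ~/.bio_pype/logs/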

Queue Timeouts:

2023-12-05 15:00:00 ERROR: Job exceeded walltime limit

Solution: Increase the walltime limit in the queue configuration
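
Walltime failures can be located the same way, and with SLURM the elapsed time can be compared against the requested limit (scheduler tooling, not bio_pype):

# find jobs that hit a walltime limit
grep -r -l -i "walltime" ~/.bio_pype/logs/*/jobs/*/queue.log
# compare elapsed time with the requested limit for a given SLURM job ID
sacct -j 123456 --format=JobID,Elapsed,Timelimit,State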