Get started
Requirements
Bio-pype is a Python package available from the Python Package Index (PyPI), so Python is required to use the software. Only Python 3 (Python >= 3.4) is supported.
Note
Earlier versions of bio-pype supported Python 2.7. However, after the official sunsetting of Python 2 in 2020 and the increasing divergence of legacy Python 2 from Python 3, bio-pype supports only Python 3 from version 2.0.0 onward.
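The version requirement can be checked before installing. The following is a minimal sketch using only the standard library; the helper name `python_is_supported` is made up for this illustration and is not part of bio_pype.

```python
import sys

# Minimum version stated in the requirements (Python >= 3.4).
MIN_VERSION = (3, 4)


def python_is_supported(version_info=sys.version_info):
    """Return True when the given interpreter version meets the minimum."""
    return tuple(version_info[:2]) >= MIN_VERSION


if not python_is_supported():
    raise SystemExit("bio_pype requires Python >= 3.4")
```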
Installation
Note
It is strongly advised to use virtualenv to install the module locally.
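As an illustration of an isolated installation, the sketch below creates an environment with the standard-library `venv` module, used here as a stand-in for virtualenv. The function name `create_env` and the directory name in the usage note are hypothetical.

```python
import os
import venv


def create_env(env_dir, with_pip=True):
    """Create a virtual environment and return the path to its interpreter."""
    venv.create(env_dir, with_pip=with_pip)
    # Executables live in "Scripts" on Windows and in "bin" elsewhere.
    if os.name == "nt":
        return os.path.join(env_dir, "Scripts", "python.exe")
    return os.path.join(env_dir, "bin", "python")
```

Calling `create_env("pype_env")` returns the path of the new interpreter, which can then be used to run `-m pip install bio_pype` inside the environment.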
From pip:
Tip
The version installed from pip may not include the latest fixes and features, but it is generally a good choice for a production environment.
pip install bio_pype
From git:
git clone https://bitbucket.org/ffavero/bio_pype
cd bio_pype
python -m unittest discover
python setup.py install
Running tasks in a Snippet
You need to configure bio_pype to match your system settings (see Configuration).
Additionally, in most cases, the files section of the profiles needs to be adjusted to match your file system structure. See Profiles for details.
In short, the following section shows a minimal setup for running the example code in this documentation:
Minimal pype setup
In this documentation we set the root location for the pype modules to the repository test data:
$ mkdir ~/.bio_pype
$ echo "PYPE_MODULES=`dirname $PWD`/test/data" > ~/.bio_pype/config
This results in the config file:
$ cat ~/.bio_pype/config
PYPE_MODULES=/home/docs/checkouts/readthedocs.org/user_builds/bio-pype/checkouts/latest/test/data
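The config file shown above is a plain `KEY=VALUE` text file. The sketch below writes such a file and reads it back in Python. It only illustrates the file format and is not how bio_pype itself parses the file; the function names `write_config` and `read_config` are hypothetical.

```python
import os


def write_config(config_path, modules_dir):
    """Write a one-line config file pointing PYPE_MODULES at modules_dir."""
    parent = os.path.dirname(config_path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    with open(config_path, "w") as handle:
        handle.write("PYPE_MODULES=%s\n" % modules_dir)


def read_config(config_path):
    """Parse KEY=VALUE lines into a dict, skipping lines without '='."""
    settings = {}
    with open(config_path) as handle:
        for line in handle:
            line = line.strip()
            if "=" not in line:
                continue
            key, value = line.split("=", 1)
            settings[key] = value
    return settings
```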
We also need to replace the dummy_file value in the profile files with the path of test/data/files/dummy_file.txt in the repository test data:
$ sed -i "s,/just/a/dummy/file/for/testing.txt,`dirname $PWD`/test/data/files/dummy_file.txt," ../test/data/profiles/test_path.yaml
$ sed -i "s,/just/a/dummy/file/for/testing.txt,`dirname $PWD`/test/data/files/dummy_file.txt," ../test/data/profiles/test_docker.yaml
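Where `sed -i` is not available, the same substitution can be sketched in Python. The helper `replace_in_file` is hypothetical, and unlike the `sed` commands above (which have no `g` flag) it replaces every occurrence in the file rather than the first one on each line.

```python
def replace_in_file(path, old, new):
    """Replace old with new in the file at path, in place.

    Returns the number of occurrences that were replaced.
    """
    with open(path) as handle:
        text = handle.read()
    with open(path, "w") as handle:
        handle.write(text.replace(old, new))
    return text.count(old)
```

For the profiles above, `old` would be `/just/a/dummy/file/for/testing.txt` and `new` the absolute path of `test/data/files/dummy_file.txt`.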
After configuring pype with the desired modules folder, copy the following lines into a file named test_base.md in the snippets folder. Any other file name works as well: the snippet name is the file name minus the .md extension.
# Example test Snippet
The snippet in pype is given by the file name
(minus the `.md` extension)
## description
Test snippet example
## requirements
```yaml
ncpu: 1
time: '00:01:00'
mem: 1gb
```
## results
```bash
@/bin/sh, yaml
printf 'file_out: %(output)s'
```
## arguments
1. input/i
    - help: input(s) text file
    - type: str
    - required: true
    - nargs: *
2. output/o
    - help: output file
    - type: str
    - default: output.txt
## snippet
> _input_: input profile_dummy_file*
```bash
@/bin/sh, chk1, stdout=chk2, namespace=alpine_3
files_input='%(input)s'
dummy_file='%(profile_dummy_file)s'
cat $files_input $dummy_file | awk '{ print toupper($0) }'
```
> _output_: results_file_out
```bash
@/bin/sh, chk2, namespace=alpine_3
awk '{ print tolower($0) }' > '%(output)s'
```
For more details on how to write snippets and on their structure, see Snippets.
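The `%(name)s` placeholders in the snippet chunks have the same form as Python's mapping-based string formatting. Assuming pype fills them in that way (an assumption, since the substitution code is not shown here), the effect on the chk2 chunk can be sketched as follows. The value in `values` is a made-up example.

```python
# Template taken from the chk2 chunk of the test_base snippet.
chunk = "awk '{ print tolower($0) }' > '%(output)s'"

# Hypothetical argument values, as they might arrive from the command line.
values = {"output": "output.txt"}

# The % operator looks up each %(key)s placeholder in the mapping.
rendered = chunk % values
print(rendered)  # awk '{ print tolower($0) }' > 'output.txt'
```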
The snippets are run via the pype command line:
$ pype
usage: pype [-p PROFILE] {pipelines,profiles,repos,snippets} ...
A python pipeliens manager oriented for bioinformatics
positional arguments:
pipelines Workflows built combining pipelines and snippets
profiles Reference paths and softwares to use in snippets
repos Manage pype modules
snippets Execute tasks
optional arguments:
-p PROFILE, --profile PROFILE
Choose the pype profile from the available options (see
pype profiles). Default: test_docker
This is version 1.9.99 - Francesco Favero - 12 December 2020
Using the snippets sub-command:
$ pype snippets
usage: pype snippets [--log LOG]
{complement_fa,hello,lower_fa,merge_fa,reverse_fa,test_adv,test_base}
...
positional arguments:
{complement_fa,hello,lower_fa,merge_fa,reverse_fa,test_adv,test_base}
complement_fa lower case a fasta sequence
hello hello world with flag
lower_fa lower case a fasta sequence
merge_fa Concatenate a series of files into a single one
reverse_fa reverse a fasta sequence
test_adv Test snippet example -in python-
test_base Test snippet example
optional arguments:
--log LOG Path used for the snippet logs. Default:
/home/docs/.bio_pype/logs
Selecting a snippet exposes a command line interface built from the arguments section of that snippet:
$ pype snippets test_base
INFO : 2021-01-26 21:06:03,400 : Writing logs to folder /home/docs/.bio_pype/logs/test_base
INFO : 2021-01-26 21:06:03,400 : Using profile test_docker
INFO : 2021-01-26 21:06:03,400 : Prepare snippet test_base
INFO : 2021-01-26 21:06:03,400 : Attempt to execute snippet test_base
error: the following arguments are required: --input/-i
usage: pype snippets test_base --input [INPUT [INPUT ...]] [--output OUTPUT]
optional arguments:
--input [INPUT [INPUT ...]], -i [INPUT [INPUT ...]]
input(s) text file
--output OUTPUT, -o OUTPUT
output file
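The usage message above has the shape produced by Python's `argparse`. As an illustration, and assuming the arguments section maps onto `add_argument` calls, the two arguments of test_base would correspond to the hand-written parser below. It is not generated by pype, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Build a parser mirroring the arguments section of test_base."""
    parser = argparse.ArgumentParser(prog="pype snippets test_base")
    # 1. input/i: required, accepts zero or more values (nargs: *).
    parser.add_argument("--input", "-i", type=str, nargs="*",
                        required=True, help="input(s) text file")
    # 2. output/o: optional, falls back to the declared default.
    parser.add_argument("--output", "-o", type=str,
                        default="output.txt", help="output file")
    return parser


args = build_parser().parse_args(["-i", "a.txt", "b.txt"])
print(args.input, args.output)  # ['a.txt', 'b.txt'] output.txt
```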
Now that we know how to use it, we can run the snippet on any text file at hand:
$ pype -p test_path snippets test_base -i ../test/data/files/input.fa
INFO : 2021-01-26 21:06:03,643 : Writing logs to folder /home/docs/.bio_pype/logs/test_base
INFO : 2021-01-26 21:06:03,643 : Using profile test_path
INFO : 2021-01-26 21:06:03,644 : Prepare snippet test_base
INFO : 2021-01-26 21:06:03,644 : Attempt to execute snippet test_base
INFO : 2021-01-26 21:06:03,648 : Write chunk chk1 code into /home/docs/.bio_pype/logs/test_base/210126210603.644268_69OX_test_base_chk1
INFO : 2021-01-26 21:06:03,649 : Set namespace to path
INFO : 2021-01-26 21:06:03,649 : Write chunk chk2 code into /home/docs/.bio_pype/logs/test_base/210126210603.644324_FWSI_test_base_chk2
INFO : 2021-01-26 21:06:03,650 : Set namespace to path
INFO : 2021-01-26 21:06:03,650 : Pipe in chk1 in chk2 command
INFO : 2021-01-26 21:06:03,650 : Prepare chk1 command line
INFO : 2021-01-26 21:06:03,650 : /home/docs/.bio_pype/logs/test_base/210126210603.644268_69OX_test_base_chk1
INFO : 2021-01-26 21:06:03,650 : Execute chk1 with python subprocess.Popen
INFO : 2021-01-26 21:06:03,651 : Prepare chk2 command line
INFO : 2021-01-26 21:06:03,651 : /home/docs/.bio_pype/logs/test_base/210126210603.644324_FWSI_test_base_chk2
INFO : 2021-01-26 21:06:03,651 : Execute chk2 with python subprocess.Popen
INFO : 2021-01-26 21:06:03,653 : Close chk1 stdout stream
INFO : 2021-01-26 21:06:03,739 : Terminate chk2
INFO : 2021-01-26 21:06:03,740 : Snippet test_base executed
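The log shows chk1 and chk2 started as separate processes, with the standard output of chk1 piped into chk2. The sketch below reproduces that pattern with `subprocess.Popen`. It uses small Python one-liners in place of the shell chunks so that it runs without `awk`, and it is not pype's actual implementation; `run_piped` is a hypothetical name.

```python
import subprocess
import sys

# Stand-ins for the two chunks: upper-case, then lower-case the stream.
UPPER = "import sys; sys.stdout.write(sys.stdin.read().upper())"
LOWER = "import sys; sys.stdout.write(sys.stdin.read().lower())"


def run_piped(text):
    """Pipe text through an upper-casing process, then a lower-casing one."""
    chk1 = subprocess.Popen([sys.executable, "-c", UPPER],
                            stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    chk2 = subprocess.Popen([sys.executable, "-c", LOWER],
                            stdin=chk1.stdout, stdout=subprocess.PIPE)
    # Close the parent's copy of the read end, so that chk1 receives
    # SIGPIPE if chk2 exits early.
    chk1.stdout.close()
    chk1.stdin.write(text.encode())
    chk1.stdin.close()
    output = chk2.communicate()[0]
    chk1.wait()
    return output.decode()
```

Calling `run_piped("ACGT")` sends the text through both processes and returns the lower-cased result.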
Inspecting the output:
$ tail -n 2 output.txt
cggacacccagaagtctacatcctaattctc
this is a dummy file!
Compare this with the end of the original file:
$ tail -n 2 ../test/data/files/input.fa
TCAACACCACCTTCTTTGACCCAGCAGGAGGAGGAGACCCAGTACTATACCAGCACCTATTCTGATTCTT
CGGACACCCAGAAGTCTACATCCTAATTCTC
This was an amazingly useless task, but the snippet showcases the use of profiles, which make it possible to reuse the same code in different environments (e.g. -p test_docker instructs pype to run the code in the snippet chunks with Docker) while abstracting away the technical details.
Now that you can run a simple task using profiles and snippets, you can chain tasks to form complex dependencies using Pipelines.