A simple and lightweight Python framework to organize bioinformatics analyses.
Bioinformatics tasks require a long list of parameters: genome files, indexes, databases, results of previous analyses, and many more. It is not easy to keep track of software and database versions and to guarantee the reproducibility of an analysis.
This framework facilitates the organization of programs and files (e.g., genomes
and databases) in structured YAML files, defined in the framework interface as
profiles.
In addition, it provides a way to organize and execute bioinformatics tasks: each
base task is wrapped in a Python script, called a snippet. A snippet takes
advantage of the profile; as a result, the number of arguments required to
execute the script is greatly reduced.
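To illustrate how a profile can shrink the argument list of a snippet, here is a minimal sketch. It does not use the framework's own interface: the `PROFILE` dictionary, its keys, the file path, the version string and the `faidx_region` function are all hypothetical, and the dictionary only stands in for what a parsed YAML profile might look like.

```python
# Hypothetical profile, shaped like the result of parsing a YAML file.
# The keys, path and version below are illustrative assumptions only.
PROFILE = {
    "files": {"genome_fa": "/data/genomes/hg38/genome.fa"},
    "programs": {"samtools": {"path": "samtools", "version": "1.17"}},
}


def faidx_region(profile, region):
    """Build the command line of a hypothetical 'extract region' snippet.

    The genome file and the program path come from the profile, so the
    caller only has to supply the region.
    """
    samtools = profile["programs"]["samtools"]["path"]
    genome = profile["files"]["genome_fa"]
    return [samtools, "faidx", genome, region]


command = faidx_region(PROFILE, "chr1:1-1000")
print(" ".join(command))
```

Because the genome and the program are resolved from the profile, switching to another genome build or software version would mean selecting another profile rather than changing every call.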
Several snippets can be chained in a YAML file to form a pipeline.
A pipeline is a series of tasks that need to be executed in a given order by a
queuing system (e.g., serial execution, parallel execution, Torque/Moab…).
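The idea of chaining snippets and handing them to a queue can be sketched as follows. This is an assumption-laden illustration, not the framework's pipeline format: the `PIPELINE` layout, the `align` and `sort_alignment` functions, the `SNIPPETS` registry and `run_serial` are hypothetical names, and only the simplest queue (serial execution) is shown.

```python
# Hypothetical pipeline description, shaped like a parsed YAML document.
PIPELINE = {
    "name": "align_and_sort",
    "steps": [
        {"snippet": "align", "args": {"sample": "S1"}},
        {"snippet": "sort", "args": {"sample": "S1"}},
    ],
}


def align(sample):
    """Stand-in for an alignment snippet; returns the output file name."""
    return f"{sample}.sam"


def sort_alignment(sample):
    """Stand-in for a sorting snippet; returns the output file name."""
    return f"{sample}.sorted.bam"


# Hypothetical registry mapping snippet names to callables.
SNIPPETS = {"align": align, "sort": sort_alignment}


def run_serial(pipeline, snippets):
    """Serial 'queue': run every step in the order listed in the pipeline."""
    results = []
    for step in pipeline["steps"]:
        func = snippets[step["snippet"]]
        results.append(func(**step["args"]))
    return results


results = run_serial(PIPELINE, SNIPPETS)
print(results)
```

A parallel or cluster-backed queue would replace `run_serial` with a function that submits each step as a job while preserving the declared order.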
Snippets and pipelines take advantage of the argparse Python library to
provide command line documentation upon execution.
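As a rough sketch of how argparse produces this kind of documentation, the example below declares the arguments of a hypothetical snippet. The program name `example_snippet` and the `--input` and `--threads` options are assumptions made for illustration and are not taken from the framework.

```python
import argparse


def build_parser():
    """Declare the arguments of a hypothetical snippet."""
    parser = argparse.ArgumentParser(
        prog="example_snippet",
        description="Hypothetical snippet that processes a FASTQ file.",
    )
    parser.add_argument("--input", required=True, help="Input FASTQ file")
    parser.add_argument(
        "--threads",
        type=int,
        default=1,
        help="Number of threads (default: %(default)s)",
    )
    return parser


parser = build_parser()

# Parse an explicit argument list instead of sys.argv for the demonstration.
args = parser.parse_args(["--input", "reads.fastq"])

# The same text is printed when the script is run with -h or --help.
help_text = parser.format_help()
print(help_text)
```

Each `help` string declared next to an argument ends up in the usage message, so the documentation lives alongside the argument definitions.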