Snippets

A snippet is the executor of the tasks. It can be written as a markdown file, using code chunks to run arbitrary code, or as a python module (see Advanced Snippets in Python).

Basic Snippet Structure

A full snippet has been shown already in the Running tasks in a Snippet section.

A snippet is composed of the following section headers:

  1. requirements: includes a code chunk returning a dictionary that specifies the resources needed to run the snippet (e.g. used to allocate resources in a queuing system)
  2. results: includes a code chunk returning a dictionary listing all the files produced by the execution of the snippet
  3. arguments: a numbered list, which is interpreted by argparse to produce the command line interface of the snippet
  4. snippet: contains the code chunks with the instructions to perform the desired task
  5. name: an optional section containing a chunk returning a string with a “friendly name”. This name overrides the default snippet name in certain contexts, and can be used to identify log folders and job ids running on the system more easily.

The input and output arguments are passed to the various chunks via variable substitution by name, the same method used in python string formatting.

In practice, this means that a string %(hello)s present in a chunk is replaced by the value of the variable hello.
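The substitution described above is the standard python %-formatting with a mapping. A minimal sketch, where the chunk text and the variable values are made up for illustration:

```python
# Named substitution with the % operator: each %(name)s is looked up in a dict.
chunk = "echo %(hello)s > %(output)s"  # hypothetical chunk content
variables = {"hello": "world", "output": "greeting.txt"}  # hypothetical values

command = chunk % variables
print(command)
```

A placeholder whose name is missing from the mapping raises a KeyError in plain python, so every %(name)s used in a chunk needs a matching variable.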

There are a few ways of setting variables:

  1. The arguments section
  2. The profile files (see Profiles)
  3. The keys from the results object

The argument variables are named after the argument name, and the value is the value passed on the command line.

The variables from the profile and from the results section are prefixed with profile_ and results_ respectively. For example, a key genome_fa present in the profile files is referenced in a snippet chunk as %(profile_genome_fa)s.
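The naming rules above can be sketched as a single mapping built from the three sources. The helper build_variables, the argument names, and the file paths are hypothetical and only illustrate the prefixing; they are not part of the pype API:

```python
def build_variables(arguments, profile_files, results):
    """Merge the three variable sources into one mapping (hypothetical helper)."""
    variables = dict(arguments)  # argument variables keep their own name
    variables.update({"profile_%s" % k: v for k, v in profile_files.items()})
    variables.update({"results_%s" % k: v for k, v in results.items()})
    return variables


variables = build_variables(
    arguments={"input": "reads.fq"},                 # hypothetical argument
    profile_files={"genome_fa": "/data/genome.fa"},  # hypothetical profile entry
    results={"bam": "aligned.bam"},                  # hypothetical results key
)

# A chunk can then refer to all three kinds of variables by name.
chunk = "aligner %(profile_genome_fa)s %(input)s > %(results_bam)s"
command = chunk % variables
print(command)
```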

The following section gives more detail on argument passing.

Reference arguments results and files

Using Namespaces

<<This section may go to Profiles>> Namespaces are set in the profile file. Ideally, a snippet should be agnostic of its final runtime environment, so that it can run as-is in different environments by changing only the namespace in the profile.

More broadly, a namespace is a mechanism to set up the environment in which a chunk is executed.

Supported namespaces are:

  1. Path: assumes that the commands in the chunks are present in the environment $PATH
  2. Environment Modules: loads a set of specified modules before running the chunk
  3. Docker: runs the chunk within a container image. This namespace also supports uDocker and singularity
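To make the idea concrete, the sketch below shows how the same command could be wrapped differently depending on the namespace. It is an illustration only: the function wrap_command, the namespace labels ("path", "env_modules", "docker") and the exact shell lines are assumptions, not the way pype builds its commands.

```python
import shlex


def wrap_command(command, namespace, item=None):
    """Return the shell line a namespace might produce (illustrative only)."""
    if namespace == "path":
        # The program is expected to be found in $PATH: run the command as-is.
        return command
    if namespace == "env_modules":
        # Load the module first, then run the command in the same shell.
        return "module load %s && %s" % (shlex.quote(item), command)
    if namespace == "docker":
        # Run the command inside a container started from the given image.
        return "docker run --rm %s sh -c %s" % (
            shlex.quote(item), shlex.quote(command))
    raise ValueError("unsupported namespace: %s" % namespace)


print(wrap_command("cat a.txt", "path"))
print(wrap_command("cat a.txt", "env_modules", "coreutils/8.30"))
print(wrap_command("cat a.txt", "docker", "alpine:3"))
```

The command string itself stays the same in all three cases, which is the point made above: only the namespace selected in the profile changes.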

Path

Environment Modules

Docker/Singularity/uDocker

Advanced Snippets in Python

The snippets are located in a python module (mind the __init__.py in the folder containing the snippets). In order to function, each snippet needs to have 4 specific functions:

  1. requirements: a function returning a dictionary with the resources needed to run the snippet (used to allocate resources in queuing systems)
  2. results: a function accepting a dictionary with the snippet arguments and returning a dictionary listing all the files produced by the execution of the snippet
  3. add_parser: a function that uses the argparse module to define the command line arguments accepted by the snippet
  4. a function named after the snippet file name (without the .py extension), containing the code for the execution of the tool

The example below implements all four functions for a snippet named test_adv:

from pype.process import Command


def requirements():
    # Resources requested for this snippet (used by queuing systems)
    return {
        'ncpu': 1,
        'time': '00:01:00',
        'mem': '1gb'}


def results(argv):
    # Accept either the short or the long form of the output option;
    # a KeyError is raised if neither is present
    try:
        output = argv['-o']
    except KeyError:
        output = argv['--output']
    return {'file_out': output}


def add_parser(subparsers, module_name):
    return subparsers.add_parser(
        module_name, help='Test snippet example -in python-',
        add_help=False)


def test_adv_args(parser, argv):
    parser.add_argument(
        '-i', '--input', dest='input', nargs='*',
        help='input(s) text file', type=str, required=True)
    parser.add_argument(
        '-o', '--output', dest='output', type=str,
        default='output.txt', help='output file')
    return parser.parse_args(argv)


def test_adv(subparsers, module_name, argv, profile, log):
    args = test_adv_args(
        add_parser(subparsers, module_name), argv)

    dummy_file = profile.files['dummy_file']

    cmd1 = 'cat %s %s' % (
        ' '.join(args.input), dummy_file)
    cmd2 = 'awk \'{ print toupper($0) }\''
    cmd3 = 'awk \'{ print tolower($0) }\''

    cat = Command(
        cmd1, log, profile, 'cat')
    to_up = Command(
        cmd2, log, profile, 'to_up')
    to_low = Command(
        cmd3, log, profile, 'to_low')
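    # Register the input and output files on the commands that use them
    for input_file in args.input:
        cat.add_input(input_file)
    cat.add_input(dummy_file)
    to_low.add_output(args.output)
    # Run every command in the alpine_3 namespace defined in the profile
    cat.add_namespace(profile.programs['alpine_3'])
    to_up.add_namespace(profile.programs['alpine_3'])
    to_low.add_namespace(profile.programs['alpine_3'])
    # Chain the commands: cat | to_up | to_low
    to_up.pipe_in(cat)
    to_low.pipe_in(to_up)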

    # Run the piped commands and write the decoded stdout of the last one
    # to the output file
    with open(args.output, 'wt') as output:
        to_low.run()
        for line in to_low.stdout:
            output.write(line.decode('utf-8'))
        to_low.close()
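
The three commands above are chained like a shell pipe (cat ... | awk ... | awk ...). For readers unfamiliar with this pattern, the sketch below reproduces the same chaining with the standard library subprocess module only. It is not how pype.process.Command is implemented; the function run_pipeline is a hypothetical helper written for this illustration.

```python
import subprocess


def run_pipeline(commands):
    """Run a list of argument lists as a pipe and return the last stdout as bytes."""
    procs = []
    prev_stdout = None
    for argv in commands:
        proc = subprocess.Popen(
            argv,
            # The first command reads nothing; the others read the previous stdout.
            stdin=prev_stdout if prev_stdout is not None else subprocess.DEVNULL,
            stdout=subprocess.PIPE)
        if prev_stdout is not None:
            # Close the parent's copy so the upstream process notices
            # when the downstream one exits.
            prev_stdout.close()
        prev_stdout = proc.stdout
        procs.append(proc)
    output, _ = procs[-1].communicate()
    for proc in procs[:-1]:
        proc.wait()
    return output
```

With hypothetical input files in.txt and dummy.txt, the pipe of the example would correspond to run_pipeline([['cat', 'in.txt', 'dummy.txt'], ['awk', '{ print toupper($0) }'], ['awk', '{ print tolower($0) }']]). Namespaces, logging and input/output tracking, which Command handles in the snippet, are deliberately left out of this sketch.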