doit task creation

In my previous post I explained one way to create tasks for doit at a higher level than the default plain Python dictionaries.

I just pushed a change that allows tasks to be generated not only by decorators but also by classes and instance objects.

The idea is very simple. Apart from collecting functions whose names start with task_, the doit loader will now also execute the create_doit_tasks callable of any object that has this attribute.
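
To make the idea concrete, here is a minimal sketch of such a collection step. It is only an illustration under simplifying assumptions, not doit's actual loader code: collect_task_creators, task_build and Docs are hypothetical names made up for this example.

```python
def collect_task_creators(namespace):
    """Return (name, callable) pairs that a loader would execute.

    Simplified illustration only: a real loader also has to validate
    the returned dicts, handle generators, keep definition order, etc.
    """
    creators = []
    for name, obj in sorted(namespace.items()):
        if hasattr(obj, 'create_doit_tasks'):
            # new behaviour: any object exposing create_doit_tasks
            creators.append((name, obj.create_doit_tasks))
        elif name.startswith('task_') and callable(obj):
            # classic behaviour: functions named task_*
            creators.append((name, obj))
    return creators


def task_build():
    return {'actions': ['echo build']}


class Docs(object):
    @staticmethod
    def create_doit_tasks():
        return {'basename': 'docs', 'actions': ['echo docs']}


namespace = {'task_build': task_build, 'Docs': Docs, 'unrelated': 42}
found = collect_task_creators(namespace)
names = [name for name, _ in found]
print(names)
```

The integer stored under 'unrelated' is ignored because it neither has the attribute nor follows the task_ naming rule.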

Note that you will need the development version of doit to run the examples below.

decorator example

##########################################################
## the decorator

def task(*args, **meta):
    def decorated(func):
        def create():
            task_dict = {'basename': func.__name__, 'actions': [func]}
            task_dict.update(meta)
            return task_dict
        func.create_doit_tasks = create
        return func

    if args:
        # decorator without parameters
        return decorated(args[0])
    else:
        return decorated


###########################################################
## task definition

DOIT_CONFIG = {'verbosity': 2}


@task
def simple():
    print("ho")

@task(file_dep=['dodo.py'])
def hello():
    print("hi")
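
To see what the loader receives from a decorated function, you can call the attached callable by hand. The sketch below repeats the decorator so that it runs on its own, without doit installed; simple_dict and hello_dict are names introduced only for this illustration.

```python
def task(*args, **meta):
    def decorated(func):
        def create():
            # the decorated function becomes the task's only action
            task_dict = {'basename': func.__name__, 'actions': [func]}
            task_dict.update(meta)
            return task_dict
        func.create_doit_tasks = create
        return func

    if args:
        # decorator used without parameters: args[0] is the function
        return decorated(args[0])
    return decorated


@task
def simple():
    print("ho")


@task(file_dep=['dodo.py'])
def hello():
    print("hi")


# call the attached callables directly, as a loader would
simple_dict = simple.create_doit_tasks()
hello_dict = hello.create_doit_tasks()
print(sorted(hello_dict))
```

The metadata given to the decorator is simply merged into the generated dict, so any task attribute can be passed as a keyword argument.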

class example

This interface was suggested by Thomas here.

class Task(object):
    @classmethod
    def create_doit_tasks(cls):
        if cls is Task:
            return  # avoid creating tasks from the base class 'Task'
        instance = cls()
        kw = dict((a, getattr(instance, a)) for a in dir(instance) if not a.startswith('_'))
        kw.pop('create_doit_tasks')
        if 'actions' not in kw:
            kw['actions'] = [kw.pop('run')]
        if 'doc' not in kw:
            kw['doc'] = cls.__doc__
        return kw



class hello(Task):
    """Hello from Python."""
    targets = ['hello.txt']

    def run(self):
        with open(self.targets[0], "a") as output:
            output.write("Hello world.")

class checker(Task):
    """Run pyflakes."""
    actions = ['pyflakes sample.py']
    file_dep = ['sample.py']
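
As a quick check of what this base class produces, the sketch below repeats the classes and calls create_doit_tasks directly. It runs without doit; the variable names base_result, hello_kw and checker_kw are only for illustration.

```python
class Task(object):
    @classmethod
    def create_doit_tasks(cls):
        if cls is Task:
            return  # the base class itself defines no task
        instance = cls()
        # every public attribute becomes a task field
        kw = dict((a, getattr(instance, a))
                  for a in dir(instance) if not a.startswith('_'))
        kw.pop('create_doit_tasks')
        if 'actions' not in kw:
            # no explicit actions: the run method is the single action
            kw['actions'] = [kw.pop('run')]
        if 'doc' not in kw:
            kw['doc'] = cls.__doc__
        return kw


class hello(Task):
    """Hello from Python."""
    targets = ['hello.txt']

    def run(self):
        with open(self.targets[0], "a") as output:
            output.write("Hello world.")


class checker(Task):
    """Run pyflakes."""
    actions = ['pyflakes sample.py']
    file_dep = ['sample.py']


base_result = Task.create_doit_tasks()
hello_kw = hello.create_doit_tasks()
checker_kw = checker.create_doit_tasks()
print(sorted(hello_kw))
print(checker_kw['doc'])
```

Note that run is popped from the dict and wrapped in the actions list, so it never reaches doit as an unknown task field.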

object example

class TaskHello(object):
    REG = [] # save instances

    def __init__(self, name):
        self.name = name
        self.REG.append(self)

    def say_hello(self):
        print("hello %s" % self.name)

    @classmethod
    def create_doit_tasks(cls):
        for inst in cls.REG:
            yield {
                'basename': inst.name,
                'actions': [inst.say_hello],
                }

######################

DOIT_CONFIG = {'verbosity': 2}

for name in ('foo', 'bar', 'spam'):
    TaskHello(name)
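
The generator can also be consumed by hand to inspect the tasks it yields. This is a self-contained sketch of the same class that runs without doit; the tasks and basenames variables exist only for this illustration.

```python
class TaskHello(object):
    REG = []  # class-level registry shared by all instances

    def __init__(self, name):
        self.name = name
        self.REG.append(self)

    def say_hello(self):
        print("hello %s" % self.name)

    @classmethod
    def create_doit_tasks(cls):
        # one task per registered instance
        for inst in cls.REG:
            yield {
                'basename': inst.name,
                'actions': [inst.say_hello],
            }


for name in ('foo', 'bar', 'spam'):
    TaskHello(name)

tasks = list(TaskHello.create_doit_tasks())
basenames = [t['basename'] for t in tasks]
print(basenames)
```

Each action is a bound method, so every task carries its own instance state even though all of them share the same function.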

doit task decorator

doit uses plain Python dictionaries to define tasks. Many, many people have asked for a decorator-based syntax. So here it is:

##########################################################
## the decorator

def task(*args, **meta):
    """decorator to attach task metadata to a function
    decorated function will become the task's action
    """
    def make_task(func):
        func._doit_task = True
        func._doit_meta = meta
        return func

    if args:
        # decorator without parameters
        return make_task(args[0])
    else:
        # decorator with task metadata
        return make_task


###########################################################
## task definition

DOIT_CONFIG = {'verbosity': 2}


@task
def simple():
    print("ho")

@task(file_dep=['dodo.py'])
def hello():
    print("hi")




############################################################
### boilerplate - convert decorated stuff to default doit style tasks

def task_all():
    for name, obj in list(globals().items()):
        if getattr(obj, '_doit_task', False):
            task_dict = {'basename': name, 'actions': [obj]}
            task_dict.update(obj._doit_meta)
            yield task_dict

Note that this will work with any version of doit; there is no need to modify doit's internal code.

doit is a generic tool that aims to be used by different kinds of applications. There is no one task definition interface that will make everyone happy. doit provides the most basic and flexible one, based on dicts...

I must agree that this decorator interface looks more readable :) But it has many limitations and can be used only for trivial tasks.

Limitations:

  • only one action per task
  • no support for command line (shell) actions
  • not easy to create several tasks with same actions
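
For comparison, here is a rough sketch of how the plain dict interface covers these three cases. The task names (report, greet), the write_marker helper and the file name report.marker are hypothetical and exist only for this illustration.

```python
def write_marker(path):
    """Python action: create a small marker file."""
    with open(path, 'w') as handle:
        handle.write('done\n')
    return True


def task_report():
    """one task mixing a shell action and a Python action"""
    return {
        'actions': [
            'echo "building report"',           # shell action (a string)
            (write_marker, ['report.marker']),   # Python action with arguments
        ],
        'targets': ['report.marker'],
    }


def task_greet():
    """several tasks built from the same kind of action"""
    for name in ('foo', 'bar'):
        yield {
            'name': name,
            'actions': ['echo hello %s' % name],
        }


report = task_report()
greetings = list(task_greet())
print(len(report['actions']), [t['name'] for t in greetings])
```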

You could go one step further and create a custom task loader, so you could get rid of task_all. I might add something like this for the next doit release...

power up your tools

Goal

pyflakes is a static checker for python. It is great but it lacks a few features:

  • no parallel multi-processing execution
  • no cache to avoid checking files that were not modified
  • no option to have a long running process that watches the file system and automatically re-executes the checker when files are modified

These features are not specific to a static checker and would be nice to have for many other tools. This post shows how to create an application that adds those features. It uses pyflakes as an example, but the same approach could be applied to other tools.

static checker tasks

doit is an "automation tool" that can execute tasks and provides the features we are interested in. So the first step is to transform pyflakes operations into doit tasks.

In doit a task is composed of actions, and some other metadata.

actions

The action describes what the task does (some code to be executed). It can be a python function...

For pyflakes the function pyflakes.scripts.pyflakes.checkPath checks a file and returns the number of flakes or warnings found, so it is successful when zero flakes were found.

The action's return value indicates whether the execution was successful, so the action must return True when zero flakes are found.

pyflakes checker action:

from pyflakes.scripts.pyflakes import checkPath

def check_path(filename):
    return not bool(checkPath(filename))
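
The conversion from a warning count to a success flag can be tried without pyflakes installed. In the sketch below fake_check_path is a hypothetical stand-in for checkPath that returns a made-up number of warnings per file name.

```python
def fake_check_path(filename):
    """Hypothetical stand-in for checkPath: returns a number of warnings."""
    warnings_by_file = {'clean.py': 0, 'messy.py': 3}
    return warnings_by_file[filename]


def check_path(filename):
    # zero warnings -> True (success); any warning -> False (failure)
    return not bool(fake_check_path(filename))


clean_ok = check_path('clean.py')
messy_ok = check_path('messy.py')
print(clean_ok, messy_ok)
```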

dependencies

A dependency indicates something that is required for a task execution.

For the pyflakes task, the file being checked is a file dependency for the task. That is, the task uses the file as input for its execution.

Explicit information about task dependencies is important for two reasons:

1) cache results: if the dependencies have not been modified since the last time the task was executed, doit can use the results saved in a cache instead of re-executing the task.

2) execution order: when several tasks are executed, the dependencies contain information about the order in which the tasks should be executed and whether they can be executed simultaneously (in parallel).
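
The first point can be illustrated with a small sketch of an up-to-date check based on file content. This is not doit's actual implementation, only one simple way to do it; the helper names, the use of SHA-256 and the in-memory dict used as cache are all assumptions made for this example. Execution order is not covered here.

```python
import hashlib
import os
import tempfile


def file_signature(path):
    """Return a digest of the file content."""
    with open(path, 'rb') as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def current_signatures(file_dep):
    return dict((path, file_signature(path)) for path in file_dep)


def is_up_to_date(task_name, file_dep, cache):
    """True if the saved signatures match the files as they are now."""
    return cache.get(task_name) == current_signatures(file_dep)


def save_result(task_name, file_dep, cache):
    """Record the signatures after a successful execution."""
    cache[task_name] = current_signatures(file_dep)


workdir = tempfile.mkdtemp()
source = os.path.join(workdir, 'sample.py')
with open(source, 'w') as handle:
    handle.write('x = 1\n')

cache = {}
first_run = is_up_to_date('pyflakes:sample', [source], cache)   # never executed
save_result('pyflakes:sample', [source], cache)
second_run = is_up_to_date('pyflakes:sample', [source], cache)  # nothing changed

with open(source, 'w') as handle:
    handle.write('x = 2\n')
after_edit = is_up_to_date('pyflakes:sample', [source], cache)  # file modified
print(first_run, second_run, after_edit)
```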

task creation

In doit, task creation is usually done in a Python module. This module contains functions that return/yield new tasks as dicts with metadata. It can also contain some extra configuration.

Something like:

import os
import glob

from pyflakes.scripts.pyflakes import checkPath


DOIT_CONFIG = {
    # output from actions should be sent to the terminal/console
    'verbosity': 2,
    # does not stop execution on first task failure
    'continue': True,
    # doit itself should not produce any output (use only actions output)
    'reporter': 'zero',
    # use multi-processing / parallel execution
    'num_process': 2,
    }


def check_path(filename):
    """execute pyflakes checker"""
    return not bool(checkPath(filename))

def task_pyflakes():
    """generate task for each file to be checked"""
    for filename in glob.glob('*.py'):
        path = os.path.abspath(filename)
        yield {
            'name': path, # name required to identify tasks
            'file_dep': [path], # file dependency
            'actions': [(check_path, (filename,)) ],
            }

If you drop the content above into a file called dodo.py in a folder, executing doit from the command line would run pyflakes on all Python modules in that folder.

It already has multi-process support, and it would execute only the tasks that failed previously or whose checked module has changed.

To run it as a long-running process that watches the file system and re-executes tasks, you could execute it as doit auto.

creating a new tool

Creating tasks in dodo.py works fine if you are working on your own project. But sometimes you just want to create a new application that you can easily distribute to other users without requiring them to add any special file into their project.

The latest doit release (0.18) exposes some of its internal API so you can create new applications and still use its task execution model.

The idea is that instead of loading tasks from a dodo.py module, the application itself creates the tasks and executes doit.

custom task loader

To create a custom task loader you should subclass doit.cmd_base.TaskLoader and implement the method load_tasks.

The pyflakes command line is extremely simple. It doesn't even support a --help option. It only takes positional parameters that specify Python modules or folders containing Python modules.

The first change from the previous example using dodo.py is to get a list of files in the same way pyflakes does instead of just getting all modules from a folder...

So in __init__ the loader builds the list of files to be checked.

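import os
import sys

from doit.cmd_base import TaskLoader
from doit.loader import generate_tasks
from pyflakes.scripts.pyflakes import checkPath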

class FlakeTaskLoader(TaskLoader):
    """create pyflakes tasks on the fly based on cmd-line arguments"""

    def __init__(self, args):
        """set list of files to be checked
        @param args (list - str) file/folder path to apply pyflakes
        """
        self.files = []
        for arg in args:
            if os.path.isdir(arg):
                for dirpath, dirnames, filenames in os.walk(arg):
                    for filename in filenames:
                        if filename.endswith('.py'):
                            self.files.append(os.path.join(dirpath, filename))
            else:
                self.files.append(arg)

        for path in self.files:
            if not os.path.exists(path):
                sys.stderr.write('%s: No such file or directory\n' % path)
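
The file discovery logic can be tried in isolation. The sketch below extracts it into a plain function (collect_python_files is a hypothetical name, not part of the loader) and runs it against a temporary folder created just for the demonstration.

```python
import os
import tempfile


def collect_python_files(args):
    """Expand folders into the .py files they contain; keep files as given."""
    files = []
    for arg in args:
        if os.path.isdir(arg):
            for dirpath, dirnames, filenames in os.walk(arg):
                for filename in filenames:
                    if filename.endswith('.py'):
                        files.append(os.path.join(dirpath, filename))
        else:
            # not a folder: assume it is a file path, existing or not
            files.append(arg)
    return files


# build a small throw-away tree: a.py, notes.txt and pkg/b.py
root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, 'pkg'))
for relative in ('a.py', 'notes.txt', os.path.join('pkg', 'b.py')):
    with open(os.path.join(root, relative), 'w') as handle:
        handle.write('')

found = sorted(os.path.relpath(path, root)
               for path in collect_python_files([root]))
explicit = collect_python_files(['setup.py'])
print(found, explicit)
```

As in the loader, a path that is not a folder is kept without checking that it exists, which is why the loader reports missing files in a separate step.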

Then we need to change the function generating tasks to use the computed files instead of using glob.

    @staticmethod
    def check_path(filename):
        """execute pyflakes checker"""
        return not bool(checkPath(filename))

    def _gen_tasks(self):
        """generate doit tasks for each file to be checked"""
        for filename in self.files:
            path = os.path.abspath(filename)
            yield {
                'name': path,
                'file_dep': [path],
                'actions': [(self.check_path, (filename,))],
                }

Usually doit creates a file named .doit.db to store information that is used to determine which tasks are up-to-date. This file is created in the same location as the dodo.py file. Since our new application works independently of the current path, we specify that the store file (dep_file) be saved in the user's home directory.

    DOIT_CONFIG = {
        'verbosity': 2,
        'continue': True,
        'reporter': 'zero',
        'dep_file': os.path.join(os.path.expanduser("~"), '.doflakes'),
        'num_process': 2,
        }

And finally we implement the TaskLoader interface:

    def load_tasks(self, cmd, params, args):
        """implements loader interface, return (tasks, config)"""
        return generate_tasks('pyflakes', self._gen_tasks()), self.DOIT_CONFIG

For the command line implementation we just need to pass the positional parameters to the custom task loader. We also add a -w command line option to use the watch mode, where the process keeps waiting for modifications in the file system.

from doit.doit_cmd import DoitMain

if __name__ == "__main__":
    cmd = 'run' # default doit command
    args = [] # list of positional args from cmd line
    for arg in sys.argv[1:]:
        if arg == '-w': # watch for changes
            cmd = 'auto'
        else:
            args.append(arg)
    doit_main = DoitMain(FlakeTaskLoader(args))
    sys.exit(doit_main.run([cmd]))
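
The argument handling is simple enough to test on its own. Here is a sketch that moves it into a function; parse_command_line is a hypothetical name used only for this illustration and it does not call doit at all.

```python
def parse_command_line(argv):
    """Split arguments into the doit command and the positional args."""
    cmd = 'run'  # default doit command
    args = []    # positional args: files or folders to check
    for arg in argv:
        if arg == '-w':  # watch for changes
            cmd = 'auto'
        else:
            args.append(arg)
    return cmd, args


watch = parse_command_line(['-w', 'src', 'x.py'])
plain = parse_command_line(['src'])
empty = parse_command_line([])
print(watch, plain, empty)
```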

The full code can be found here: doflakes.py.

power up your tools!

Using doit you can easily add some nice features to external tools, or leverage its power in tools you write yourself!

doit has a very flexible dependency system that can be used to perform tasks much more complex than simple static checkers.

For more information check the doit website.