Often, reaching a goal requires defining a pipeline, also known as a workflow. Workflows are used in scientific research to process large volumes of data, and also in business. They are managed by workflow management software. In the scientific domain there are plenty of offerings, such as Pegasus, VisTrails, Kepler, Chimera, Krojan, Falkon, Depends, etc.
The same concept can be used in your own program when you want to define a set of activities with their inputs and outputs, their mutual connections, and the order of execution. There are quite a few good Python projects that let you achieve this. In this post, I am highlighting a few that I have read a little about. Analyzing them is the goal for the next post :)
- luigi: https://github.com/spotify/luigi
- FireWorks: http://pythonhosted.org/FireWorks/
- pyutilib.workflow: https://pypi.python.org/pypi/pyutilib.workflow/3.5.1
- GoFlow: a workflow engine for Django. https://code.djangoproject.com/wiki/GoFlow
- snakemake: https://www.biostars.org/p/88277/
You can also try to write your own workflow manager in Python. One such attempt is described here ( http://supercoderz.in/2011/11/03/building-a-simple-workflow-engine-in-python/ ). A more extensive list of Python-based workflow projects can be found at this link (https://code.activestate.com/pypm/search:workflow/).
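To give a feel for what "writing your own" involves, here is a minimal sketch of the idea: activities with inputs and outputs, connections between them, and an execution order derived from those connections. It is not taken from any of the projects or posts linked above. The `Workflow` class, its `add` and `run` methods, and the activity names are all hypothetical names made up for illustration, and the sketch assumes Python 3.9 or later for the standard-library `graphlib` module.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class Workflow:
    """Toy workflow (hypothetical): named activities plus their dependencies."""

    def __init__(self):
        self._funcs = {}  # activity name -> callable
        self._deps = {}   # activity name -> names of activities it depends on

    def add(self, name, func, depends_on=()):
        # Register an activity; it will receive the outputs of `depends_on`
        # as positional arguments, in the order they are listed.
        self._funcs[name] = func
        self._deps[name] = tuple(depends_on)

    def run(self):
        # Derive an execution order in which every activity comes after
        # all of its dependencies, then run the activities one by one.
        order = list(TopologicalSorter(self._deps).static_order())
        results = {}
        for name in order:
            inputs = [results[dep] for dep in self._deps[name]]
            results[name] = self._funcs[name](*inputs)
        return order, results


wf = Workflow()
wf.add("load", lambda: [3, 1, 2])
wf.add("sort", sorted, depends_on=["load"])
wf.add("total", sum, depends_on=["load"])
wf.add("report", lambda s, t: f"sorted={s} total={t}",
       depends_on=["sort", "total"])

order, results = wf.run()
print(order)              # "load" comes first, "report" comes last
print(results["report"])  # sorted=[1, 2, 3] total=6
```

A real workflow manager adds much more on top of this, such as persistence, retries, and parallel execution. The sketch also does nothing helpful if an activity depends on a name that was never registered, or if the dependencies form a cycle (in which case `graphlib` raises `CycleError`).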