Writing a bioinformatics workflow with Snakemake
Here, we will learn about a popular bioinformatics workflow management tool called Snakemake (https://snakemake.readthedocs.io/en/stable/).
Snakemake is implemented in Python and shares many traits with it. That said, its fundamental inspiration is the Makefile, the rule file used by the venerable “make” build system - https://en.wikipedia.org/wiki/Make_(software). Snakemake has a few advantages over other workflow systems: first, it is Pythonic (easy to develop with if you know Python); second, its Makefile-based approach is straightforward. What sets Snakemake apart from many other workflow systems is that, like make, it interprets rules in reverse order: you ask for a final output file, and it works backward through the rules to determine which inputs and intermediate files are needed to produce it. Snakemake will not rerun a rule if its output file is already present and up to date unless you force it to do so (this is also called computational reuse). This is a big advantage if you are using EC2 Spot Instances in AWS, which are “spare” instances obtained at a fraction of the normal cost...
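To make the backward resolution and computational reuse more concrete, here is a minimal plain-Python sketch of the idea. It is not Snakemake code and does not use the Snakemake API: the `build` function, the `rules` dictionary, the `uppercase_copy` action, and the file names are all hypothetical, chosen only for illustration. It is also a simplification, since it only checks whether an output exists and does not compare timestamps.

```python
from pathlib import Path
import tempfile


def build(target, rules, ran):
    """Work backward from `target`, running only rules whose output is missing.

    `rules` maps an output path to (list of input paths, action).
    `ran` collects the file names of the outputs that were actually produced.
    """
    if Path(target).exists():
        return  # output already present: skip the rule (computational reuse)
    if target not in rules:
        raise FileNotFoundError(f"no rule produces {target}")
    inputs, action = rules[target]
    for dep in inputs:
        build(dep, rules, ran)  # resolve dependencies first
    action(inputs, target)
    ran.append(Path(target).name)


def uppercase_copy(inputs, output):
    """Hypothetical stand-in for a real processing step."""
    text = "".join(Path(p).read_text() for p in inputs)
    Path(output).write_text(text.upper())


def demo():
    """Request the same final target twice and report what ran each time."""
    with tempfile.TemporaryDirectory() as tmp:
        raw = str(Path(tmp) / "reads.txt")
        trimmed = str(Path(tmp) / "trimmed.txt")
        report = str(Path(tmp) / "report.txt")
        Path(raw).write_text("acgt\n")  # the only file that exists up front

        rules = {
            trimmed: ([raw], uppercase_copy),
            report: ([trimmed], uppercase_copy),
        }

        first, second = [], []
        build(report, rules, first)   # nothing exists yet: both rules run
        build(report, rules, second)  # report.txt exists: nothing runs
        return first, second
```

Calling `demo()` requests only the final `report.txt`. On the first request, the intermediate `trimmed.txt` is built because the final target depends on it. On the second request, nothing runs because the output is already there. This is the behavior that makes an interrupted workflow cheap to resume.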