Writing a bioinformatics workflow with Nextflow
In this recipe, we will learn about Nextflow (https://www.nextflow.io/), one of the most powerful modern workflow management systems. To get to know it, we will implement a simple Nextflow pipeline that performs some basic bioinformatics tasks, such as trimming and quality control. The pipeline will run FastQC and MultiQC for real, but we’ll just mock out the trimming step. We’ll also build an interactive dashboard to manage the pipeline from our notebook, just like in the last example.
Unlike Snakemake, which is Python-based, Nextflow runs on the Java Virtual Machine (JVM), so you need a current version of Java installed. Its workflow language is built on Groovy, a programming language for the JVM (https://groovy-lang.org/).
This means that Nextflow can use Java and Groovy libraries directly. Nextflow also caches the results of completed tasks, so an interrupted or modified pipeline can resume without recomputing them, and it scales easily from a laptop to an HPC cluster to the cloud.
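Because Nextflow needs a JVM, it can be useful to confirm from the notebook that Java is available before going further. The following is a minimal sketch, not part of the recipe itself: the helper names parse_java_major and installed_java_major are hypothetical, and it assumes that the java executable on your PATH is the one Nextflow will use. It only reports the major version it finds; check the Nextflow documentation for the minimum version your Nextflow release requires.

```python
import re
import shutil
import subprocess


def parse_java_major(version_output):
    """Extract the major Java version from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Legacy numbering: Java 8 and earlier report themselves as "1.8.0_x"
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def installed_java_major():
    """Return the major version of the `java` found on PATH, or None if absent."""
    java = shutil.which("java")
    if java is None:
        return None
    # `java -version` normally writes to stderr, so we inspect both streams
    result = subprocess.run([java, "-version"], capture_output=True, text=True)
    return parse_java_major(result.stderr + result.stdout)


if __name__ == "__main__":
    major = installed_java_major()
    if major is None:
        print("No usable Java found on PATH; install a JDK before running Nextflow.")
    else:
        print(f"Found Java major version {major}")
```

Keeping the parsing separate from the subprocess call makes the version logic easy to check without Java installed, which is handy when the notebook runs in several environments.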
By the end...