Using Nextflow for scientific computing

The CQLS recommends using Nextflow, especially the nf-core pipelines.

We recommend using -profile singularity when available, and the CQLS maintains a few settings that facilitate the use of Singularity. If you have followed the dotfile update protocol, you should have the updated NXF_SINGULARITY_CACHEDIR setting shown below.

➜ echo $NXF_SINGULARITY_CACHEDIR
/local/cqls/singularity/nxf

This setting allows the CQLS to download the nf-core pipeline Singularity images into a place that everyone can use. These sometimes large images therefore live in one centralized location, which reduces overall space usage across the cluster.
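If the environment variable is not set in a given context (a cron job, for example), the same cache location can be pinned in a Nextflow config file instead. The singularity.cacheDir option is standard Nextflow; the path below is the shared CQLS cache from the output above.

```groovy
// nextflow.config fragment: equivalent to exporting NXF_SINGULARITY_CACHEDIR.
// The path is the shared CQLS image cache.
singularity.enabled  = true
singularity.cacheDir = '/local/cqls/singularity/nxf'
```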

Tip

If you need an updated nf-core pipeline downloaded, then please submit a ticket so we can get it updated for you.

Centralized location for workflows

The published Nextflow workflows are stored in /local/cqls/software/nextflow/assets, and these can be specified by full path so that the pipelines do not have to be re-downloaded each time you run them. I'd suggest something like this for your Nextflow jobs:

run_nextflow.sh
#!/usr/bin/env bash

workflow=/local/cqls/software/nextflow/assets/nf-core-ampliseq_2.10.0/2_10_0

NXF_TMP=/scratch nextflow run \
        $workflow \
        -profile singularity \
        -params-file nf-params.json

You can allocate a CPU with the salloc command, and then run this with bash ./run_nextflow.sh.
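The salloc step might look like the session below. The partition name, CPU count, memory, and time limit are illustrative assumptions, not CQLS defaults; adjust them to your queue and workload.

```shell
# Hypothetical allocation; partition, CPUs, memory, and time are assumptions.
salloc --partition=core --cpus-per-task=4 --mem=16G --time=8:00:00

# Once the allocation is granted, launch the wrapper from the shell it opens.
bash ./run_nextflow.sh
```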

Using Slurm with Nextflow

I've tested the nextflow.config file below with the ampliseq pipeline.

nextflow.config
singularity.autoMounts = true
singularity.runOptions = '-B /scratch:/tmp,/scratch'
params.max_cpus = 96
params.max_memory = '384 GB'
process.executor = "slurm"
process.queue    = "core"
process.clusterOptions = "-x compute-temp-1"

Feel free to copy the above config, change the queue, and remove or adjust the clusterOptions setting to something more appropriate. The -x flag tells Slurm which nodes to exclude from the workflow; in the example above, the machine compute-temp-1 was excluded from the run.

As an example, folks from the BPP department would change "core" to "bpp" for that particular option.
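For example, a BPP user's config might differ only in the queue line. This sketch drops the clusterOptions line, assuming no nodes need to be excluded:

```groovy
// Same settings as the example above, with the BPP queue and no excluded nodes.
singularity.autoMounts = true
singularity.runOptions = '-B /scratch:/tmp,/scratch'
params.max_cpus   = 96
params.max_memory = '384 GB'
process.executor  = "slurm"
process.queue     = "bpp"
```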

When my jobs start on a machine, what determines the number of requested resources?

Resource requests are determined on a per-workflow basis. Most of the nf-core pipelines ship default settings that are reasonable for most users. For more information, see the advanced section.
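If a pipeline's defaults don't fit your data, individual process resources can be overridden in nextflow.config using Nextflow's process selectors. The selector syntax is standard Nextflow; the process name below (FASTQC) is just an illustrative example of the kind of process found in a typical nf-core pipeline.

```groovy
// Sketch: raise resources for one process by name.
// The process name 'FASTQC' is an assumption for illustration.
process {
    withName: 'FASTQC' {
        cpus   = 8
        memory = '32 GB'
        time   = '4h'
    }
}
```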