Using Nextflow for scientific computing¶
The CQLS recommends using Nextflow, especially the nf-core pipelines. We recommend using `-profile singularity` when available, and the CQLS maintains a few settings that facilitate the use of Singularity. If you have followed the dotfile update protocol, you should have the updated `NXF_SINGULARITY_CACHEDIR` setting.
This setting allows the CQLS to download the nf-core pipeline Singularity images into a location that everyone can use. Because these sometimes large images live in one centralized place, overall space usage across the cluster is reduced.
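As a minimal sketch of what that setting looks like, assuming a hypothetical shared cache path (the actual CQLS path may differ; the dotfile update sets it for you):

```shell
# Hypothetical example only -- the real CQLS cache path may differ.
# Nextflow stores and reuses downloaded Singularity images here.
export NXF_SINGULARITY_CACHEDIR=/local/cqls/singularity/cache
```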
Tip
If you need a newer version of an nf-core pipeline, please submit a ticket and we will download it for you.
Centralized location for workflows¶
The published Nextflow workflows are stored in `/local/cqls/software/nextflow/assets`, and these can be specified by full path so that the pipelines do not have to be downloaded each time they are run. I'd suggest something like this for your Nextflow jobs:
#!/usr/bin/env bash
# Path to a centrally installed nf-core pipeline (ampliseq 2.10.0).
workflow=/local/cqls/software/nextflow/assets/nf-core-ampliseq_2.10.0/2_10_0

# Point Nextflow's temporary files at local scratch space.
NXF_TMP=/scratch nextflow run \
    "$workflow" \
    -profile singularity \
    -params-file nf-params.json
You can check out a CPU with the `salloc` command, and then run this with `bash ./run_nextflow.sh`.
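For example, a session could look like the following; the partition, CPU count, and memory values here are placeholders, so adjust them to your job's needs:

```shell
# Request an interactive allocation (values are illustrative only),
# then launch the wrapper script inside it.
salloc -p core -c 8 --mem=32G
bash ./run_nextflow.sh
```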
Using Slurm with Nextflow¶
I've tested the `nextflow.config` file below with the ampliseq pipeline.
// Mount host paths into the containers automatically,
// and bind /scratch into each container.
singularity.autoMounts = true
singularity.runOptions = '-B /scratch:/tmp,/scratch'

// Upper bounds on the resources any single task may request.
params.max_cpus = 96
params.max_memory = '384 GB'

// Submit tasks through Slurm on the "core" partition,
// excluding the node compute-temp-1.
process.executor = "slurm"
process.queue = "core"
process.clusterOptions = "-x compute-temp-1"
Feel free to copy the above config, change the queue, and remove or adjust the `clusterOptions` setting as appropriate. In the example above, the `-x` flag specifies nodes to exclude from the workflow, so the machine `compute-temp-1` was excluded from the run.
As another example, folks from the BPP department would change `"core"` to `"bpp"` in the `process.queue` setting.
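Note that Nextflow automatically picks up a file named `nextflow.config` in the directory you launch from. If you keep the config elsewhere, you can pass it explicitly; the path below is a placeholder:

```shell
# A nextflow.config in the launch directory is read automatically;
# -c adds an additional config file from an arbitrary (hypothetical) path.
nextflow run "$workflow" -profile singularity -c /path/to/nextflow.config
```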
When my jobs start on a machine, what determines the number of requested resources?¶
This will be determined on a per-workflow basis. Most of the nf-core pipelines have default settings that are reasonable for most people. For more information, see the advanced section.
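If you do need to adjust what a specific step requests, Nextflow configs support per-process overrides. A hedged sketch, where `FASTQC` is only an example process name (check the pipeline's documentation for the real names):

```groovy
// Hypothetical override: change the resources requested by one process.
// 'FASTQC' is an illustrative name, not guaranteed to exist in your pipeline.
process {
    withName: 'FASTQC' {
        cpus   = 4
        memory = '8 GB'
        time   = '2h'
    }
}
```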