RNA-Seq pipeline

This example shows a basic RNA-Seq analysis pipeline that performs quality control, transcript quantification, and result aggregation. The pipeline processes paired-end FASTQ files, generates quality control reports with FastQC, quantifies transcripts with Salmon, and produces a unified report with MultiQC.

// Parameter inputs
params.reads = "${baseDir}/data/ggal/ggal_gut_{1,2}.fq"
params.transcriptome = "${baseDir}/data/ggal/ggal_1_48850000_49020000.Ggal71.500bpflank.fa"
params.outdir = "results"
params.multiqc = "${baseDir}/multiqc"


// Component imports
include { RNASEQ } from './modules/rnaseq'
include { MULTIQC } from './modules/multiqc'


// Workflow block
workflow {

    log.info """\
      R N A S E Q - N F   P I P E L I N E
      ===================================
      transcriptome: ${params.transcriptome}
      reads        : ${params.reads}
      outdir       : ${params.outdir}
    """.stripIndent()

    read_pairs_ch = channel.fromFilePairs(params.reads, checkIfExists: true, flat: true)

    (fastqc_ch, quant_ch) = RNASEQ(read_pairs_ch, params.transcriptome)

    multiqc_files_ch = fastqc_ch.mix(quant_ch).collect()

    MULTIQC(multiqc_files_ch, params.multiqc)

    workflow.onComplete = {
        log.info(
            workflow.success
                ? "\nDone! Open the following report in your browser --> ${params.outdir}/multiqc_report.html\n"
                : "Oops .. something went wrong"
        )
    }
}

Synopsis

The pipeline uses two imported components:

- RNASEQ: a subworkflow that contains three processes:
    - INDEX: creates a Salmon index from the transcriptome (runs once)
    - FASTQC: analyzes each sample in parallel
    - QUANT: quantifies transcripts for each sample after indexing completes
- MULTIQC: aggregates all quality control and quantification outputs into a comprehensive HTML report
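The "runs once" versus "per sample" behavior follows from Nextflow's dataflow rules: a process called with a single value emits a value channel, which can be read any number of times, so every sample can reuse the same index. The following is a minimal, self-contained sketch of that pattern and not the actual rnaseq-nf module code. The process bodies are toy stand-ins that only print text, and the sample and file names are hypothetical.

```nextflow
// Toy stand-in for INDEX: the real process builds a Salmon index.
process INDEX {
    input:
    val transcriptome

    output:
    stdout

    script:
    """
    printf 'index of ${transcriptome}'
    """
}

// Toy stand-in for QUANT: the real process runs Salmon quantification.
process QUANT {
    input:
    val index
    tuple val(id), val(fq1), val(fq2)

    output:
    stdout

    script:
    """
    echo "quantified ${id} (${fq1}, ${fq2}) using ${index}"
    """
}

workflow {
    // Hypothetical samples, shaped like fromFilePairs(..., flat: true) output
    reads_ch = channel.of(
        ['gut', 'gut_1.fq', 'gut_2.fq'],
        ['liver', 'liver_1.fq', 'liver_2.fq']
    )

    // One input value, so INDEX runs a single task
    index_ch = INDEX('transcriptome.fa')

    // The index is reused for every sample; QUANT runs one task per tuple
    QUANT(index_ch, reads_ch).view()
}
```

Running this sketch should execute INDEX once and QUANT twice, once for each sample tuple. No QUANT task can start before INDEX finishes, because each QUANT task needs the index value as an input.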

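The workflow block uses channel.fromFilePairs to create a channel of paired-end read files. With flat: true, each item is a tuple of the sample ID followed by its two read files. The workflow passes the reads and transcriptome to the RNASEQ subworkflow, which returns the FastQC and quantification outputs. It then mixes these outputs into one channel, collects them into a single list, and passes that list to MULTIQC along with params.multiqc. When the run finishes, the workflow.onComplete handler logs the location of the MultiQC report on success or an error message on failure.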
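To illustrate the aggregation step in isolation, here is a small sketch of how mix and collect combine two channels. The string items are hypothetical placeholders for the real FastQC and Salmon outputs, and no processes are involved.

```nextflow
workflow {
    // Hypothetical per-sample outputs standing in for FastQC and Salmon results
    fastqc_ch = channel.of('fastqc_gut_logs', 'fastqc_liver_logs')
    quant_ch  = channel.of('quant_gut', 'quant_liver')

    // mix merges both channels into one stream (item order is not guaranteed);
    // collect waits for all items and emits them as a single list
    multiqc_files_ch = fastqc_ch.mix(quant_ch).collect()

    // Prints one line describing the single collected list
    multiqc_files_ch.view { files -> "MULTIQC would receive ${files.size()} items: ${files}" }
}
```

Because collect emits exactly one list, a process that consumes it, such as MULTIQC in the pipeline, runs a single task over all of the gathered outputs instead of one task per sample.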


Get started

To run this pipeline:

1. Install Nextflow (version 25.10 or later)

2. Install Docker

3. Run the pipeline directly from its GitHub repository:

nextflow run nextflow-io/rnaseq-nf -profile docker

See the rnaseq-nf GitHub repository for all of the pipeline code and the rnaseq-nf tutorial for a full pipeline description.