HT-RNAseq

Introduction

This workflow is designed to process high-throughput RNA-seq data, where every well of a microarray plate is a sample. A fasta file provided as input defines the mapping between sample barcodes and wells.

The workflow is built in a modular fashion, where most of the base functionality is provided by components from biobox supplemented by custom base components and workflow components in this package.

The full workflow is split into two major subworkflows that can be run independently:

Well-demultiplexing: Split the input (plate/pool level) fastq files per well.
Mapping, counting and QC: Run per-well mapping, counting and generate QC reports.

Each of those can be started individually, or the full workflow can be run in two ways:

Run the main workflow containing the main functionality.
Run the (opinionated) runner where a number of choices (input/output structure and location) have been made.

Input for the workflow must be fastq files (compressed or not). For bcl or other formats, please consider running demultiplex first.

Example usage

Test and example data

If you want to explore this workflow, it’s possible to use the data we use as test data: a DRUGseq dataset from the NCBI Sequence Read Archive. For the unit and integration tests, this data has been (partly) subsampled to reduce the test runtime. We used seqtk for this with a seed of 1, e.g.:

seqtk sample -s1 orig/SRR14730302/VH02001614_S8_R1_001.fastq.gz 10000 > 10k/SRR14730302/VH02001614_S8_R1_001.fastq.gz
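For paired-end data, both mates have to be subsampled with the same seed so that the read pairs stay in sync. The following is a minimal sketch of how that could be scripted; the function name `subsample_pair` and its arguments are hypothetical and not part of this package. Because seqtk writes uncompressed FASTQ to standard output, the sketch compresses the result explicitly.

```shell
# Sketch (hypothetical helper): subsample both mates with the same seed.
# Usage: subsample_pair <R1.fastq[.gz]> <R2.fastq[.gz]> <n_reads> <outdir>
subsample_pair() {
  r1="$1"; r2="$2"; n="$3"; outdir="$4"
  mkdir -p "$outdir"
  for fq in "$r1" "$r2"; do
    # Normalise the output name so it always ends in .gz
    name="$(basename "$fq" .gz).gz"
    # Same seed (-s1) for both files keeps the read pairs matched.
    # seqtk emits plain FASTQ, so pipe it through gzip.
    seqtk sample -s1 "$fq" "$n" | gzip > "$outdir/$name"
  done
}
```

A call such as `subsample_pair R1.fastq.gz R2.fastq.gz 10000 10k/` would then write two gzip-compressed files with 10000 reads each into `10k/`.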

This data is available at: gs://viash-hub-test-data/htrnaseq/v1/.
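To work with a local copy, the bucket can be inspected and partly downloaded with `gsutil`. This is only a sketch: it assumes `gsutil` is installed and that your account (or anonymous access) is allowed to read the bucket, which is not stated here. The helper name `fetch_test_data` and the default target directory are hypothetical; the `100k` subfolder is taken from the example paths used in the parameters further down.

```shell
# Sketch (hypothetical helper): list the test data and copy the 100k subset.
# Usage: fetch_test_data [target_dir]
fetch_test_data() {
  dest="${1:-testdata}"   # assumed default target directory
  mkdir -p "$dest"
  # Show what is available at the top level of the test data location
  gsutil ls "gs://viash-hub-test-data/htrnaseq/v1/"
  # Copy the 100k subset recursively, using parallel transfers (-m)
  gsutil -m cp -r "gs://viash-hub-test-data/htrnaseq/v1/100k" "$dest/"
}
```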

Run from Viash Hub

Open Viash Hub and browse to the htrnaseq component. Press the ‘Launch’ button and follow the instructions.

We will start an example run loading just one input and using a barcodes fasta file containing only 2 wells.

In the first step, we add the local profile to the list of profiles in order to limit the CPU and memory requirements of the workflow steps.

In the next step, we provide the parameters as follows:

input_r1: gs://viash-hub-test-data/htrnaseq/v1/100k/SRR14730301/VH02001612_S9_R1_001.fastq
input_r2: gs://viash-hub-test-data/htrnaseq/v1/100k/SRR14730301/VH02001612_S9_R2_001.fastq
genomeDir: gs://viash-hub-test-data/htrnaseq/v1/genomeDir/subset/Homo_sapiens/v0.0.3/
barcodesFasta: gs://viash-hub-test-data/htrnaseq/v1/2-wells-with-ids.fasta
annotation: gs://viash-hub-test-data/htrnaseq/v1/genomeDir/gencode.v41.annotation.gtf.gz

Please note that both input_r1 and input_r2 can take multiple values. This means that one has to press ENTER after pasting the input path.

Press the ‘Launch’ button at the end to get the instructions on how to run the workflow from the CLI.
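For runs from the command line, the same values can be kept in a Nextflow params file instead of being typed into the form. The sketch below writes such a file. The file name `params.yaml` is arbitrary, and writing the multi-value parameters `input_r1` and `input_r2` as YAML lists is an assumption about how the workflow parses them; the integrated help is the authoritative source for names and types.

```shell
# Sketch: write the example parameters to a params file.
# Assumption: multi-value parameters are accepted as YAML lists.
cat > params.yaml <<'EOF'
input_r1:
  - gs://viash-hub-test-data/htrnaseq/v1/100k/SRR14730301/VH02001612_S9_R1_001.fastq
input_r2:
  - gs://viash-hub-test-data/htrnaseq/v1/100k/SRR14730301/VH02001612_S9_R2_001.fastq
genomeDir: gs://viash-hub-test-data/htrnaseq/v1/genomeDir/subset/Homo_sapiens/v0.0.3/
barcodesFasta: gs://viash-hub-test-data/htrnaseq/v1/2-wells-with-ids.fasta
annotation: gs://viash-hub-test-data/htrnaseq/v1/genomeDir/gencode.v41.annotation.gtf.gz
EOF
```

Nextflow reads such a file through its `-params-file` option, e.g. `-params-file params.yaml`.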

Run using NF-Tower / Seqera Cloud

It’s possible to run the workflow directly from Seqera Cloud. The necessary schema file has been built and provided with the workflows in order to use the form-based input. However, Seqera Cloud cannot deal with multiple-value parameters when using the form-based input.

It is therefore better to use Viash Hub here as well:

First, select the option to run the workflow using Seqera Cloud. You will need to create an API token for your account. Once this token is entered in the corresponding field, you will get the option to select a ‘Workspace’ and a ‘Compute environment’.

Next, we need to fill in the parameters for the run. This is similar to before.

In the next screen, pressing the ‘Launch’ button will actually start the workflow on Seqera Cloud. A message is shown when the submission was successful.

Run from the CLI

Running from the CLI directly without using Viash Hub is possible. The easiest way to get started is to use the integrated help functionality, for instance:

 nextflow run https://packages.viash-hub.com/vsh/htrnaseq.git \
  -revision v0.3.0 \
  -main-script target/nextflow/workflows/runner/main.nf \
  --help
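Once the help output has confirmed the parameter names of the chosen entry point, an actual run uses the same command with parameters in place of `--help`. The wrapper below is a sketch under stated assumptions: the function name `run_htrnaseq` and the `PROFILES` variable are hypothetical, and only the `local` profile is mentioned in this README, so a container profile (for example `docker`) may have to be added depending on the setup. Whether the runner accepts a given parameter should be checked against its help output.

```shell
# Sketch (hypothetical wrapper): run the workflow with extra arguments.
# PROFILES defaults to the 'local' profile mentioned in this README;
# adding a container profile such as 'docker' is an assumption.
run_htrnaseq() {
  nextflow run https://packages.viash-hub.com/vsh/htrnaseq.git \
    -revision v0.3.0 \
    -main-script target/nextflow/workflows/runner/main.nf \
    -profile "${PROFILES:-local}" \
    "$@"
}
```

For example, `run_htrnaseq --help` prints the help, and `run_htrnaseq -params-file params.yaml` starts a run with the values from a params file named `params.yaml` (a hypothetical file name).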

Contributions

Developed in collaboration with Data Intuitive and Open Analytics.

Other contributions are welcome.
