# demultiplex v0.4.4

## Bug fixes

* Only add the `transfer_complete.txt` files when the exit code of the workflow is 0 (PR #58).

# demultiplex v0.4.3

## Minor changes

* The `runner` creates a `transfer_completed.txt` file when the publishing of the output has finished (PR #57).

# demultiplex v0.4.2

## Minor changes

* Provide output from `runner` workflow so it can be used as part of a larger workflow (PR #56).
* Add workflow identifier to version information during pipeline run (PR #56).

# demultiplex v0.4.1

## Minor changes

* Split off part of the workflow logic (`detect_demultiplexer`) from the main workflow to a dedicated subworkflow (PR #52).
* Add the package config (`_viash.yaml`) to every component's target dir. This makes introspection from, e.g. a `runner` workflow much more robust (PR #53).

# demultiplex v0.4.0

## Breaking changes

* Falco has been replaced with FastQC. Falco generates FastQC compatible output, but fails to run on empty FASTQ files (PR #51).
  - `runner` workflow: `falco_output` has been renamed to `output_sample_qc`.
  - `demultiplex` workflow: `output_falco` has been renamed to `output_sample_qc`.
  - The output file names from the sample QC no longer contain the input file extensions. Instead, the sample name is used
  (for example, `sample1_S1_R2_001.fastq.gz_fastqc_report.html` becomes `sample1_S1_R2_001_fastqc_report.html`).

* `demultiplex` workflow: `output_multiqc` argument has been renamed to `multiqc_output` in order to align inner workflow and runner (PR #51).
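
To illustrate the sample QC file renaming described above, here is a minimal Python sketch. It is not the workflow's implementation: the function name `qc_report_name` and the list of recognized FASTQ extensions are assumptions made for this example.

```python
from pathlib import PurePosixPath

# Assumption: the extensions stripped here are illustrative only;
# the actual workflow may recognize a different set.
FASTQ_SUFFIXES = (".fastq.gz", ".fq.gz", ".fastq", ".fq")


def qc_report_name(fastq_path: str, report_suffix: str = "_fastqc_report.html") -> str:
    """Build a QC report file name that omits the FASTQ file extension."""
    name = PurePosixPath(fastq_path).name
    for suffix in FASTQ_SUFFIXES:
        if name.endswith(suffix):
            # Drop the input extension so only the sample part remains.
            name = name[: -len(suffix)]
            break
    return name + report_suffix


print(qc_report_name("fastq/sample1_S1_R2_001.fastq.gz"))
```

With the example from the entry above, this yields `sample1_S1_R2_001_fastqc_report.html` rather than the pre-0.4.0 name that still contained `.fastq.gz`.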

# demultiplex v0.3.12

## New features

* Add support for Nextflow versions starting from 25.xx.xx (PR #50).

## Bug fixes

* Allow FASTQ files for `Undetermined` to be empty (PR #50).

# demultiplex v0.3.11

## New features

* Output demultiplexer logs and metrics (PR #41).

# demultiplex v0.3.10

## Minor changes

* Moved the test resources to their new location (PR #37).

# demultiplex v0.3.9

## Bug fixes

* Fix defaults for output arguments in the Nextflow schemas.

* Fix an issue where passing an integer to an argument with `type: double` resulted in an error (PR #44).

## Minor changes

* Bump viash to 0.9.4, which adds support for Nextflow versions starting from major version 25.01 (PR #43 and #44).

# demultiplex v0.3.8

## Bug fixes

* Provide a proper error when a FASTQ file is empty after demultiplexing (PR #40).

# demultiplex v0.3.7

## Minor updates

* Ignore lines starting with '#' when parsing run information CSV (PR #39).
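
As an illustration of the behavior above, the following Python sketch drops lines that start with `#` before handing the rest to a CSV parser. It is a sketch only: the function name `read_run_information` and the column names in the usage example are hypothetical, and the workflow's own parser may differ.

```python
import csv
import io


def read_run_information(text: str) -> list[dict[str, str]]:
    """Parse CSV text, ignoring every line that starts with '#'."""
    # Filter comment lines first so they can be neither header nor data.
    lines = (line for line in io.StringIO(text) if not line.startswith("#"))
    return list(csv.DictReader(lines))


example = "# exported by the sequencer\nSample_ID,Index\n# a comment between rows\ns1,ACGT\n"
print(read_run_information(example))
```

Filtering before parsing means a comment placed above the header row does not get mistaken for the header itself.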

# demultiplex v0.3.6

## Minor updates

* Allow letter case variants for headers when looking for sample information in run information CSV (PR #38).
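
A case-insensitive header lookup of the kind described above could look like the following Python sketch. The helper name `find_column` and the header values are assumptions for illustration, not names taken from the workflow.

```python
def find_column(header: list[str], wanted: str) -> int:
    """Return the index of the first header cell equal to `wanted`, ignoring letter case."""
    target = wanted.strip().lower()
    for index, cell in enumerate(header):
        if cell.strip().lower() == target:
            return index
    raise ValueError(f"column {wanted!r} not found in header {header!r}")


# 'SAMPLE_ID', 'Sample_ID' and 'sample_id' all resolve to the same column.
print(find_column(["Lane", "SAMPLE_ID", "index"], "Sample_ID"))
```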

# demultiplex v0.3.5

## Breaking changes

* The `demultiplex` workflow now outputs a list of directories
  for the `output_falco` argument (one for each barcode) instead of one directory
  for the complete run. The output from the `runner` workflow remains
  unchanged (PR #33).

## Minor updates

* When Illumina data is detected in the input folder, the workflow now checks for the presence of the `copyComplete.txt` file.
  This check can be disabled using `--skip_copycomplete_check` (PR #34).

# demultiplex v0.3.4

## Minor updates

* Resource labels are now automatically included during build (PR #32).

# demultiplex v0.3.3

## Breaking changes

- The `runner` defines the output differently now:

  - The last part of the `--input` path is expected to be the run ID and this run ID is used to create the output directory.
  - If the input is `file.tar.gz` instead of a directory, the `file` part is used as the run ID.

- The output structure is then as follows:

    ```
    $publish_dir/<run_id>/<date_time_stamp>_demultiplex_<version>/
    ```

    For instance:

    ```
    $publish_dir
    └── 200624_A00834_0183_BHMTFYDRXX
        └── 20241217_051404_demultiplex_v1.2
            ├── run_information.csv
            ├── fastq
            │   ├── Sample1_S1_L001_R1_001.fastq.gz
            │   ├── Sample23_S3_L001_R1_001.fastq.gz
            │   ├── SampleA_S2_L001_R1_001.fastq.gz
            │   ├── Undetermined_S0_L001_R1_001.fastq.gz
            │   └── sampletest_S4_L001_R1_001.fastq.gz
            └── qc
                ├── fastqc
                │   ├── Sample1_S1_L001_R1_001.fastq.gz_fastqc_data.txt
                │   ├── Sample1_S1_L001_R1_001.fastq.gz_fastqc_report.html
                │   ├── Sample1_S1_L001_R1_001.fastq.gz_summary.txt
                │   ├── Sample23_S3_L001_R1_001.fastq.gz_fastqc_data.txt
                │   ├── Sample23_S3_L001_R1_001.fastq.gz_fastqc_report.html
                │   ├── Sample23_S3_L001_R1_001.fastq.gz_summary.txt
                │   ├── SampleA_S2_L001_R1_001.fastq.gz_fastqc_data.txt
                │   ├── SampleA_S2_L001_R1_001.fastq.gz_fastqc_report.html
                │   ├── SampleA_S2_L001_R1_001.fastq.gz_summary.txt
                │   ├── Undetermined_S0_L001_R1_001.fastq.gz_fastqc_data.txt
                │   ├── Undetermined_S0_L001_R1_001.fastq.gz_fastqc_report.html
                │   ├── Undetermined_S0_L001_R1_001.fastq.gz_summary.txt
                │   ├── sampletest_S4_L001_R1_001.fastq.gz_fastqc_data.txt
                │   ├── sampletest_S4_L001_R1_001.fastq.gz_fastqc_report.html
                │   └── sampletest_S4_L001_R1_001.fastq.gz_summary.txt
                └── multiqc_report.html

    ```

- This logic can be avoided by providing the flag `--plain_output`.
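
The path logic above can be sketched in Python as follows. This is an illustration under stated assumptions, not the `runner` implementation: the function names are hypothetical, only a `.tar.gz` suffix is stripped, and `plain_output=True` is assumed to publish directly into `publish_dir`.

```python
from datetime import datetime
from pathlib import PurePosixPath


def run_id_from_input(input_path: str) -> str:
    """Use the last part of the input path as run ID, without a '.tar.gz' suffix."""
    name = PurePosixPath(input_path).name
    if name.endswith(".tar.gz"):
        name = name[: -len(".tar.gz")]
    return name


def output_dir(publish_dir: str, input_path: str, version: str,
               launched_at: datetime, plain_output: bool = False) -> str:
    """Compose <publish_dir>/<run_id>/<date_time_stamp>_demultiplex_<version>."""
    if plain_output:
        # Assumption: with --plain_output no extra directory levels are added.
        return publish_dir
    stamp = launched_at.strftime("%Y%m%d_%H%M%S")
    run_id = run_id_from_input(input_path)
    return f"{publish_dir}/{run_id}/{stamp}_demultiplex_{version}"


print(output_dir("/results", "/data/200624_A00834_0183_BHMTFYDRXX", "v1.2",
                 datetime(2024, 12, 17, 5, 14, 4)))
```

For the inputs shown, the composed path ends in `200624_A00834_0183_BHMTFYDRXX/20241217_051404_demultiplex_v1.2`, matching the example tree above.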

## Minor updates

* Added `output_run_information` argument that copies the run information file to the output (PR #31).

# demultiplex v0.3.2

## Bug fixes

* Ignore empty CSV entries when parsing sample information (PR #29).

# demultiplex v0.3.1

## Minor updates

* Add `--run_information` and `--demultiplexer` arguments to `runner` workflow (PR #27).

## Bug fixes

* Fix detection of sample IDs from Illumina V2 sample sheets (PR #28).

* Provide a clear error message when `--run_information` is provided but not `--demultiplexer` (PR #27).

# demultiplex v0.3.0

## Major updates

The output handling of the workflow has been refactored to be more flexible (PR #19). This is done by creating a wrapper workflow `runner` that wraps the native `demultiplex` workflow. The `runner` workflow is responsible for setting the output directory based on the input arguments.

Three arguments specify the relative locations of the three _outputs_ of the workflow:

- `fastq_output`: The directory where the demultiplexed fastq files are stored.
- `falco_output`: The directory for the `fastqc`/`falco` reports.
- `multiqc_output`: The filename for the `multiqc` report.

The target location path is determined by the following logic:

- If no `id` is provided, the output directory is set to `$publish_dir`.
- If an `id` is explicitly set using Seqera Cloud or by adding `--id <>`, the output directory is set to `$publish_dir/<id>`.

The workflow has two optional flags to be used in combination with `--id`:

- `--add_date_time`: rather than publishing the results under `$publish_dir`, this adds an additional layer `$publish_dir/<date-time-stamp>/`. This is useful when you want to keep track of multiple runs of the workflow (example: `240322_143020`).
- `--add_workflow_id`: adding this flag will add `_demultiplex_<version>` to the output directory (example: `demultiplex_v0.2.0`). When starting the workflow from a non-release, the version will be set to `version_unkonwn`.

The default structure in the output directory is:

- Two sub-directories:
  - `fastq`
  - `qc` for the reports:
    - `multiqc_report.html`
    - `fastqc/` directory containing the different fastqc (falco) reports.

The `$publish_dir` variable corresponds to the argument provided with `--publish-dir`. The `date-time-stamp` is generated by the workflow based on when it was launched and is thus guaranteed to be unique.

# demultiplex v0.2.0

## Breaking changes

* `demultiplex` workflow: renamed `sample_sheet` argument to `run_information` (PR #24).

## New features

* Add support for `bases2fastq` demultiplexer (PR #24).

## Minor updates

* Add resource labels to workflows (PR #21).

# demultiplex v0.1.1

## Minor updates

* Bump viash to 0.9.0 (PR #14).

* `demultiplex` workflow: use `v0.2.0` release instead of `main` branch for `biobox` dependencies (PR #11).

* Renamed `biobase` repository to `biobox` (PR #13 and PR #15).

# demultiplex v0.1.0

Initial release