Build branch main with version main (861319f)

Build pipeline: vsh-ci-template-8vdc6

Source commit: 861319f5dd

Source message: Merge pull request #4 from viash-hub/update-star-align-params

update param
CI
2025-02-03 16:59:21 +00:00
parent edccce830a
commit 5b1018434e
34 changed files with 5633 additions and 5991 deletions

88
README.md Normal file

@@ -0,0 +1,88 @@
# 🛝📦 playground
[![GitHub](https://img.shields.io/badge/GitHub-viash--hub%2Fplayground-blue.png)](https://github.com/viash-hub/playground)
[![Viash
version](https://img.shields.io/badge/Viash-v0.9.0--RC6-blue)](https://viash.io)
A collection of bioinformatics pipelines to illustrate the use of biobox
(and biotools).
## Quickstart
### Requirements
To run the components and workflows included in this repository, you
need to have the following software installed:
- Bash (\>= 3.2) or an equivalent shell
- Java Development Kit (\>= 12)
- Docker
- Viash (\>= 0.6.7)
- Nextflow (\>= 21.04)
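A quick way to confirm that these tools are on your `PATH` is sketched
below. It is an illustrative helper and not part of this repository: the
function name `missing_tools` is hypothetical, and it only checks that
each command exists, not that the minimum versions listed above are met.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print every command from the argument list that is
# not available on PATH. Version numbers are not checked.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Commands that correspond to the requirements list above.
missing="$(missing_tools bash java docker viash nextflow)"
if [ -n "$missing" ]; then
  echo "Missing tools:" $missing
else
  echo "All required tools were found."
fi
```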
### Cloning the repository
To clone this repository to your local machine, copy the URL of the
forked repository by clicking the green “Code” button and selecting
HTTPS or SSH. In your terminal or command prompt, navigate to the
directory where you want to clone the repository and enter the following
command:
``` bash
git clone <copied_url> playground
cd playground
```
### Test dataset
You will also need to download the test resources. From the repository
root, run:
``` bash
./test_data.sh
```
This will create the `test_data` folder and a file called
`params_file.yaml`; the latter can be used to run the workflow with the
generated test data.
### Building
Before running the workflow, the Viash components need to be built and
the Docker images generated.
``` bash
viash ns build --parallel --setup cachedbuild
```
> [!NOTE]
>
> The `--setup cachedbuild` option enables building the Docker images.
You will now see a `target` folder inside the root of the repository.
### Testing the workflow
To use the workflow with test data, use the following command (from the
root of the repository):
``` bash
nextflow run . -main-script ./target/nextflow/mapping_and_qc/main.nf \
-params-file ./params_file.yaml \
-profile docker \
-c ./target/nextflow/mapping_and_qc/nextflow.config
```
The output will be written to the folder `test_run_output`, as specified
in the `publish_dir` argument in the `params_file.yaml`.
## Support and Community
For support, questions, or to join our community:
- **Issues**: Submit questions or issues via the [GitHub issue
tracker](https://github.com/viash-hub/playground/issues).
- **Discussions**: Join our discussions via [GitHub
Discussions](https://github.com/viash-hub/playground/discussions).

84
README.qmd Normal file

@@ -0,0 +1,84 @@
---
format: gfm
---
```{r setup, include=FALSE}
project <- yaml::read_yaml("_viash.yaml")
```
# 🛝📦 `r project$name`
[![GitHub](https://img.shields.io/badge/GitHub-viash--hub%2F`r project$name`-blue)](`r project$links$repository`)
[![Viash version](https://img.shields.io/badge/Viash-v`r gsub("-", "--", project$viash_version)`-blue)](https://viash.io)
`r project$description`
## Quickstart
### Requirements
To run the components and workflows included in this repository, you need to have the following software installed:
* Bash (>= 3.2) or an equivalent shell
* Java Development Kit (>= 12)
* Docker
* Viash (>= 0.6.7)
* Nextflow (>= 21.04)
### Cloning the repository
To clone this repository to your local machine, copy the URL of the forked repository by clicking the green "Code" button and selecting HTTPS or SSH.
In your terminal or command prompt, navigate to the directory where you want to clone the repository and enter the following command:
```{bash}
#| eval: false
git clone <copied_url> playground
cd playground
```
### Test dataset
You will also need to download the test resources.
From the repository root, run:
```{bash}
#| eval: false
./test_data.sh
```
This will create the `test_data` folder and a file called `params_file.yaml`; the latter can be
used to run the workflow with the generated test data.
### Building
Before running the workflow, the Viash components need to be built and the Docker images generated.
```{bash}
#| eval: false
viash ns build --parallel --setup cachedbuild
```
::: {.callout-note}
The `--setup cachedbuild` option enables building the Docker images.
:::
You will now see a `target` folder inside the root of the repository.
### Testing the workflow
To use the workflow with test data, use the following command (from the root of the repository):
```{bash}
#| eval: false
nextflow run . -main-script ./target/nextflow/mapping_and_qc/main.nf \
-params-file ./params_file.yaml \
-profile docker \
-c ./target/nextflow/mapping_and_qc/nextflow.config
```
The output will be written to the folder `test_run_output`, as specified in the `publish_dir` argument
in the `params_file.yaml`.
## Support and Community
For support, questions, or to join our community:
- **Issues**: Submit questions or issues via the [GitHub issue tracker](`r project$links$issue_tracker`).
- **Discussions**: Join our discussions via [GitHub Discussions](`r project$links$repository`/discussions).


@@ -1 +1,21 @@
docker.fixOwnership = true
process.container = "nextflow/nextflow:21.04.3"
docker {
enabled = true
fixOwnership = true
}
process {
memory = 1.GB
cpus = 1
withLabel: singlecpu { cpus = 1 }
withLabel: lowcpu { cpus = 4 }
withLabel: midcpu { cpus = 8 }
withLabel: highcpu { cpus = 10 }
withLabel: lowmem { memory = 5.GB }
withLabel: midmem { memory = 8.GB }
withLabel: highmem { memory = 25.GB }
}


@@ -31,8 +31,6 @@ resources:
dependencies:
- name: cutadapt
repository: bb
- name: pear
repository: bb
- name: falco
repository: bb
- name: multiqc
@@ -46,8 +44,7 @@ repositories:
- name: bb
type: vsh
repo: vsh/biobox
tag: v0.1
tag: main
runners:
- type: nextflow


@@ -17,14 +17,18 @@ workflow run_wf {
},
toState: [
"output_falco": "outdir",
]
],
directives: [label: ["lowmem", "lowcpu"]]
)
| niceView()
| cutadapt.run(
fromState: {id, state ->
[
"input": state.input_r1,
"input_r2": state.input_r2,
"quality_cutoff": "20", // Could make this a parameter
"quality_cutoff": "30", // Could make this a parameter
"quality_cutoff_r2": "30", // Could make this a parameter
"minimum_length": "60:60", // Could make this a parameter
"adapter": "CTGTCTCTTATACACATCT", // Could make this a parameter
"adapter_r2": "CTGTCTCTTATACACATCT", // Could make this a parameter
"output": "*.fastq",
@@ -34,29 +38,23 @@ workflow run_wf {
def newKeys = [
"trimmed_r1": output_state["output"][0],
"trimmed_r2": output_state["output"][1],
"output_cutadapt": output_state["output"]
]
def new_state = state + newKeys
return new_state
}
)
| pear.run(
fromState: [
"forward_fastq": "trimmed_r1",
"reverse_fastq": "trimmed_r2",
],
toState: [
"output_pear": "assembled",
]
},
directives: [label: ["midmem", "midmem"]]
)
| star_align_reads.run(
fromState: [
"input": "output_pear",
"genomeDir": "reference",
"input": "trimmed_r1",
"input_r2": "trimmed_r2",
"genome_dir": "reference",
],
toState: [
"output_star": "aligned_reads",
]
],
directives: [label: ["highmem", "midcpu"]]
)
| samtools_stats.run(
fromState: [
@@ -64,7 +62,9 @@ workflow run_wf {
],
toState: [
"output_samtools_stats": "output",
]
],
directives: [label: ["midmem", "lowcpu"]]
)
| toSortedList()
| map { events ->
@@ -81,7 +81,9 @@ workflow run_wf {
],
toState: [
"multiqc_output": "output_report",
]
],
directives: [label: ["midmem", "lowcpu"]]
)
| setState(["multiqc_output", "_meta"])


@@ -1,5 +1,18 @@
name: "cutadapt"
version: "v0.1.0"
version: "main"
authors:
- name: "Toni Verbeiren"
roles:
- "author"
- "maintainer"
info:
links:
github: "tverbeiren"
linkedin: "verbeiren"
organizations:
- name: "Data Intuitive"
href: "https://www.data-intuitive.com"
role: "Data Scientist and CEO"
argument_groups:
- name: "Specify Adapters for R1"
arguments:
@@ -238,7 +251,7 @@ argument_groups:
direction: "input"
multiple: false
multiple_sep: ";"
- type: "boolean_false"
- type: "boolean_true"
name: "--no_indels"
description: "Allow only mismatches in alignments.\n"
info: null
@@ -273,7 +286,7 @@ argument_groups:
description: "Interpret IUPAC wildcards in reads.\n"
info: null
direction: "input"
- type: "boolean_false"
- type: "boolean_true"
name: "--no_match_adapter_wildcards"
description: "Do not interpret IUPAC wildcards in adapters.\n"
info: null
@@ -305,6 +318,26 @@ argument_groups:
\ If match is on reverse-complemented version,\noutput that one.\n"
info: null
direction: "input"
- name: "Demultiplexing options"
arguments:
- type: "string"
name: "--demultiplex_mode"
description: "Enable demultiplexing and set the mode for it.\nWith mode 'unique_dual',\
\ adapters from the first and second read are used,\nand the indexes from the\
\ reads are only used in pairs. This implies\n--pair_adapters.\nEnabling mode\
\ 'combinatorial_dual' allows all combinations of the sets of indexes\non R1\
\ and R2. It is necessary to write each read pair to an output\nfile depending\
\ on the adapters found on both R1 and R2.\nMode 'single', uses indexes or barcodes\
\ located at the 5'\nend of the R1 read (single). \n"
info: null
required: false
choices:
- "single"
- "unique_dual"
- "combinatorial_dual"
direction: "input"
multiple: false
multiple_sep: ";"
- name: "Read modifications"
arguments:
- type: "integer"
@@ -685,7 +718,7 @@ engines:
id: "docker"
image: "python:3.12"
target_registry: "images.viash-hub.com"
target_tag: "v0.1.0"
target_tag: "main"
namespace_separator: "/"
setup:
- type: "python"
@@ -706,22 +739,23 @@ build_info:
engine: "docker|native"
output: "target/nextflow/cutadapt"
executable: "target/nextflow/cutadapt/main.nf"
viash_version: "0.9.0-RC6"
git_commit: "b84b29747d0635f2ac83ea63b496be9a9edb6724"
git_remote: "https://github.com/viash-hub/biobox"
viash_version: "0.9.0"
git_commit: "952ff0843093b538cbfd6fefdecf2e7a0bc9e70b"
git_remote: "https://x-access-token:ghs_EwAUAMYJ0K4VBHlAEMs4ZP2OyQYqJM0PSfEO@github.com/viash-hub/biobox"
git_tag: "v0.2.0-27-g952ff08"
package_config:
name: "biobox"
version: "v0.1.0"
version: "main"
description: "A collection of bioinformatics tools for working with sequence data.\n"
info: null
viash_version: "0.9.0-RC6"
viash_version: "0.9.0"
source: "src"
target: "target"
config_mods:
- ".requirements.commands := ['ps']\n"
- ".engines += { type: \"native\" }"
- ".engines[.type == 'docker'].target_registry := 'images.viash-hub.com'"
- ".engines[.type == 'docker'].target_tag := 'v0.1.0'"
- ".engines[.type == 'docker'].target_tag := 'main'"
keywords:
- "bioinformatics"
- "modules"


@@ -1,13 +1,16 @@
// cutadapt v0.1.0
// cutadapt main
//
// This wrapper script is auto-generated by viash 0.9.0-RC6 and is thus a
// derivative work thereof. This software comes with ABSOLUTELY NO WARRANTY from
// Data Intuitive.
// This wrapper script is auto-generated by viash 0.9.0 and is thus a derivative
// work thereof. This software comes with ABSOLUTELY NO WARRANTY from Data
// Intuitive.
//
// The component may contain files which fall under a different license. The
// authors of this component should specify the license in the header of such
// files, or include a separate license file detailing the licenses of all included
// files.
//
// Component authors:
// * Toni Verbeiren (author, maintainer)
////////////////////////////
// VDSL3 helper functions //
@@ -760,8 +763,11 @@ def runEach(Map args) {
def fromState_ = args.fromState
def toState_ = args.toState
def filter_ = args.filter
def runIf_ = args.runIf
def id_ = args.id
assert !runIf_ || runIf_ instanceof Closure: "runEach: must pass a Closure to runIf."
workflow runEachWf {
take: input_ch
main:
@@ -783,7 +789,20 @@ def runEach(Map args) {
[new_id] + tup.drop(1)
}
: filter_ch
def data_ch = id_ch | map{tup ->
def chPassthrough = null
def chRun = null
if (runIf_) {
def idRunIfBranch = id_ch.branch{ tup ->
run: runIf_(tup[0], tup[1], comp_)
passthrough: true
}
chPassthrough = idRunIfBranch.passthrough
chRun = idRunIfBranch.run
} else {
chRun = id_ch
chPassthrough = Channel.empty()
}
def data_ch = chRun | map{tup ->
def new_data = tup[1]
if (fromState_ instanceof Map) {
new_data = fromState_.collectEntries{ key0, key1 ->
@@ -821,8 +840,11 @@ def runEach(Map args) {
[tup[0], new_state] + tup.drop(3)
}
: out_ch
def return_ch = post_ch
| concat(chPassthrough)
post_ch
return_ch
}
// mix all results
@@ -1598,8 +1620,8 @@ def findStates(Map params, Map config) {
// construct renameMap
if (args.rename_keys) {
def renameMap = args.rename_keys.collectEntries{renameString ->
def split = renameString.split(";")
assert split.size() == 2: "Argument 'rename_keys' should be of the form 'newKey:oldKey,newKey:oldKey'"
def split = renameString.split(":")
assert split.size() == 2: "Argument 'rename_keys' should be of the form 'newKey:oldKey', or 'newKey:oldKey;newKey:oldKey' in case of multiple values"
split
}
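
The `rename_keys` format named in the new assertion message can be illustrated with a standalone shell sketch. This is not the generated Groovy code: the function name `parse_rename_keys` and the example key names are hypothetical. It assumes that multiple values are separated by `;` and that each value is a single `newKey:oldKey` pair.

```bash
#!/usr/bin/env bash
# Hypothetical illustration: split a rename_keys value such as
# 'newKey:oldKey;newKey:oldKey' into pairs, then split each pair on ':'.
parse_rename_keys() {
  local pairs pair new_key old_key extra
  IFS=';' read -r -a pairs <<< "$1"
  for pair in "${pairs[@]}"; do
    IFS=':' read -r new_key old_key extra <<< "$pair"
    # Exactly two non-empty parts are expected per pair.
    if [ -z "$new_key" ] || [ -z "$old_key" ] || [ -n "$extra" ]; then
      echo "rename_keys entries should be of the form 'newKey:oldKey'" >&2
      return 1
    fi
    printf '%s <- %s\n' "$new_key" "$old_key"
  done
}

parse_rename_keys 'output:output_star;report:multiqc_output'
```

Splitting each pair on `;` instead of `:`, as the earlier line did, would leave a pair such as `output:output_star` in one piece and trip the size check.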
@@ -1709,7 +1731,9 @@ def publishStates(Map args) {
def yamlFilename = yamlTemplate_
.replaceAll('\\$id', id_)
.replaceAll('\\$\\{id\\}', id_)
.replaceAll('\\$key', key_)
.replaceAll('\\$\\{key\\}', key_)
// TODO: do the pathnames in state_ match up with the outputFilenames_?
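
The added `replaceAll` calls mean that both the `$id` and the `${id}` spelling of a placeholder are expanded (and likewise for `key`). A minimal shell sketch of the same substitution follows; the helper name `render_template` and the sample values are hypothetical and only serve to show the behaviour.

```bash
#!/usr/bin/env bash
# Hypothetical helper: expand the $id / ${id} and $key / ${key} placeholders
# in a filename template, mirroring the chained replaceAll calls above.
render_template() {
  local template="$1" id="$2" key="$3"
  # Keep the placeholder spellings in variables so they stay literal.
  local plain_id='$id' brace_id='${id}' plain_key='$key' brace_key='${key}'
  template=${template//"$plain_id"/"$id"}
  template=${template//"$brace_id"/"$id"}
  template=${template//"$plain_key"/"$key"}
  template=${template//"$brace_key"/"$key"}
  printf '%s\n' "$template"
}

render_template '$id.$key.state.yaml' sample1 cutadapt
render_template '${id}/${key}.state.yaml' sample1 cutadapt
```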
@@ -1780,7 +1804,9 @@ def publishStatesByConfig(Map args) {
def yamlTemplate = params.containsKey("output_state") ? params.output_state : '$id.$key.state.yaml'
def yamlFilename = yamlTemplate
.replaceAll('\\$id', id_)
.replaceAll('\\$\\{id\\}', id_)
.replaceAll('\\$key', key_)
.replaceAll('\\$\\{key\\}', key_)
def yamlDir = java.nio.file.Paths.get(yamlFilename).getParent()
// the processed state is a list of [key, value, inputPath, outputFilename] tuples, where
@@ -1822,7 +1848,9 @@ def publishStatesByConfig(Map args) {
// instantiate the template
def filename = filenameTemplate
.replaceAll('\\$id', id_)
.replaceAll('\\$\\{id\\}', id_)
.replaceAll('\\$key', key_)
.replaceAll('\\$\\{key\\}', key_)
if (par.multiple) {
// if the parameter is multiple: true, the filename
// should contain a wildcard '*' that is replaced with
@@ -2626,30 +2654,31 @@ def workflowFactory(Map args, Map defaultWfArgs, Map meta) {
tuple
}
def chModifiedFiltered = workflowArgs.filter ?
chModified | filter{workflowArgs.filter(it)} :
chModified
def chRun = null
def chPassthrough = null
if (workflowArgs.runIf) {
def runIfBranch = chModifiedFiltered.branch{ tup ->
def runIfBranch = chModified.branch{ tup ->
run: workflowArgs.runIf(tup[0], tup[1])
passthrough: true
}
chRun = runIfBranch.run
chPassthrough = runIfBranch.passthrough
} else {
chRun = chModifiedFiltered
chRun = chModified
chPassthrough = Channel.empty()
}
def chRunFiltered = workflowArgs.filter ?
chRun | filter{workflowArgs.filter(it)} :
chRun
def chArgs = workflowArgs.fromState ?
chRun | map{
chRunFiltered | map{
def new_data = workflowArgs.fromState(it.take(2))
[it[0], new_data]
} :
chRun | map {tup -> tup.take(2)}
chRunFiltered | map {tup -> tup.take(2)}
// fill in defaults
def chArgsWithDefaults = chArgs
@@ -2720,7 +2749,7 @@ def workflowFactory(Map args, Map defaultWfArgs, Map meta) {
// | view{"chInitialOutput: ${it.take(3)}"}
// join the output [prev_id, new_id, output] with the previous state [prev_id, state, ...]
def chNewState = safeJoin(chInitialOutput, chModifiedFiltered, key_)
def chNewState = safeJoin(chInitialOutput, chRunFiltered, key_)
// input tuple format: [join_id, id, output, prev_state, ...]
// output tuple format: [join_id, id, new_state, ...]
| map{ tup ->
@@ -2779,7 +2808,29 @@ meta = [
"resources_dir": moduleDir.toRealPath().normalize(),
"config": processConfig(readJsonBlob('''{
"name" : "cutadapt",
"version" : "v0.1.0",
"version" : "main",
"authors" : [
{
"name" : "Toni Verbeiren",
"roles" : [
"author",
"maintainer"
],
"info" : {
"links" : {
"github" : "tverbeiren",
"linkedin" : "verbeiren"
},
"organizations" : [
{
"name" : "Data Intuitive",
"href" : "https://www.data-intuitive.com",
"role" : "Data Scientist and CEO"
}
]
}
}
],
"argument_groups" : [
{
"name" : "Specify Adapters for R1",
@@ -3012,7 +3063,7 @@ meta = [
"multiple_sep" : ";"
},
{
"type" : "boolean_false",
"type" : "boolean_true",
"name" : "--no_indels",
"description" : "Allow only mismatches in alignments.\n",
"direction" : "input"
@@ -3054,7 +3105,7 @@ meta = [
"direction" : "input"
},
{
"type" : "boolean_false",
"type" : "boolean_true",
"name" : "--no_match_adapter_wildcards",
"description" : "Do not interpret IUPAC wildcards in adapters.\n",
"direction" : "input"
@@ -3089,6 +3140,25 @@ meta = [
}
]
},
{
"name" : "Demultiplexing options",
"arguments" : [
{
"type" : "string",
"name" : "--demultiplex_mode",
"description" : "Enable demultiplexing and set the mode for it.\nWith mode 'unique_dual', adapters from the first and second read are used,\nand the indexes from the reads are only used in pairs. This implies\n--pair_adapters.\nEnabling mode 'combinatorial_dual' allows all combinations of the sets of indexes\non R1 and R2. It is necessary to write each read pair to an output\nfile depending on the adapters found on both R1 and R2.\nMode 'single', uses indexes or barcodes located at the 5'\nend of the R1 read (single). \n",
"required" : false,
"choices" : [
"single",
"unique_dual",
"combinatorial_dual"
],
"direction" : "input",
"multiple" : false,
"multiple_sep" : ";"
}
]
},
{
"name" : "Read modifications",
"arguments" : [
@@ -3519,7 +3589,7 @@ meta = [
"id" : "docker",
"image" : "python:3.12",
"target_registry" : "images.viash-hub.com",
"target_tag" : "v0.1.0",
"target_tag" : "main",
"namespace_separator" : "/",
"setup" : [
{
@@ -3548,22 +3618,23 @@ meta = [
"runner" : "nextflow",
"engine" : "docker|native",
"output" : "target/nextflow/cutadapt",
"viash_version" : "0.9.0-RC6",
"git_commit" : "b84b29747d0635f2ac83ea63b496be9a9edb6724",
"git_remote" : "https://github.com/viash-hub/biobox"
"viash_version" : "0.9.0",
"git_commit" : "952ff0843093b538cbfd6fefdecf2e7a0bc9e70b",
"git_remote" : "https://x-access-token:ghs_EwAUAMYJ0K4VBHlAEMs4ZP2OyQYqJM0PSfEO@github.com/viash-hub/biobox",
"git_tag" : "v0.2.0-27-g952ff08"
},
"package_config" : {
"name" : "biobox",
"version" : "v0.1.0",
"version" : "main",
"description" : "A collection of bioinformatics tools for working with sequence data.\n",
"viash_version" : "0.9.0-RC6",
"viash_version" : "0.9.0",
"source" : "src",
"target" : "target",
"config_mods" : [
".requirements.commands := ['ps']\n",
".engines += { type: \\"native\\" }",
".engines[.type == 'docker'].target_registry := 'images.viash-hub.com'",
".engines[.type == 'docker'].target_tag := 'v0.1.0'"
".engines[.type == 'docker'].target_tag := 'main'"
],
"keywords" : [
"bioinformatics",
@@ -3618,6 +3689,7 @@ $( if [ ! -z ${VIASH_PAR_MATCH_READ_WILDCARDS+x} ]; then echo "${VIASH_PAR_MATCH
$( if [ ! -z ${VIASH_PAR_NO_MATCH_ADAPTER_WILDCARDS+x} ]; then echo "${VIASH_PAR_NO_MATCH_ADAPTER_WILDCARDS}" | sed "s#'#'\\"'\\"'#g;s#.*#par_no_match_adapter_wildcards='&'#" ; else echo "# par_no_match_adapter_wildcards="; fi )
$( if [ ! -z ${VIASH_PAR_ACTION+x} ]; then echo "${VIASH_PAR_ACTION}" | sed "s#'#'\\"'\\"'#g;s#.*#par_action='&'#" ; else echo "# par_action="; fi )
$( if [ ! -z ${VIASH_PAR_REVCOMP+x} ]; then echo "${VIASH_PAR_REVCOMP}" | sed "s#'#'\\"'\\"'#g;s#.*#par_revcomp='&'#" ; else echo "# par_revcomp="; fi )
$( if [ ! -z ${VIASH_PAR_DEMULTIPLEX_MODE+x} ]; then echo "${VIASH_PAR_DEMULTIPLEX_MODE}" | sed "s#'#'\\"'\\"'#g;s#.*#par_demultiplex_mode='&'#" ; else echo "# par_demultiplex_mode="; fi )
$( if [ ! -z ${VIASH_PAR_CUT+x} ]; then echo "${VIASH_PAR_CUT}" | sed "s#'#'\\"'\\"'#g;s#.*#par_cut='&'#" ; else echo "# par_cut="; fi )
$( if [ ! -z ${VIASH_PAR_CUT_R2+x} ]; then echo "${VIASH_PAR_CUT_R2}" | sed "s#'#'\\"'\\"'#g;s#.*#par_cut_r2='&'#" ; else echo "# par_cut_r2="; fi )
$( if [ ! -z ${VIASH_PAR_NEXTSEQ_TRIM+x} ]; then echo "${VIASH_PAR_NEXTSEQ_TRIM}" | sed "s#'#'\\"'\\"'#g;s#.*#par_nextseq_trim='&'#" ; else echo "# par_nextseq_trim="; fi )
@@ -3754,9 +3826,9 @@ debug
# Input arguments
###########################################################
echo ">> Parsing input arguments"
[[ "\\$par_no_indels" == "true" ]] && unset par_no_indels
[[ "\\$par_no_indels" == "false" ]] && unset par_no_indels
[[ "\\$par_match_read_wildcards" == "false" ]] && unset par_match_read_wildcards
[[ "\\$par_no_match_adapter_wildcards" == "true" ]] && unset par_no_match_adapter_wildcards
[[ "\\$par_no_match_adapter_wildcards" == "false" ]] && unset par_no_match_adapter_wildcards
[[ "\\$par_revcomp" == "false" ]] && unset par_revcomp
input_args=\\$(echo \\\\
@@ -3766,7 +3838,7 @@ input_args=\\$(echo \\\\
\\${par_overlap:+--overlap "\\${par_overlap}"} \\\\
\\${par_match_read_wildcards:+--match-read-wildcards} \\\\
\\${par_no_match_adapter_wildcards:+--no-match-adapter-wildcards} \\\\
\\${par_action:+--action "\\${par_action}"} \\\\
\\${par_action:+--action="\\${par_action}"} \\\\
\\${par_revcomp:+--revcomp} \\\\
)
debug "Arguments to cutadapt:"
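
The corrected `unset` lines rely on a common Bash idiom: a `boolean_true` argument arrives as the string `true` or `false`, the variable is unset when it is `false`, and `${var:+--flag}` then expands to the flag only if the variable is still set. The standalone sketch below shows the idiom outside the generated wrapper; the function name `build_flags` is hypothetical, and the escaping used inside `main.nf` is omitted.

```bash
#!/usr/bin/env bash
# Hypothetical illustration of the boolean_true handling shown above.
# $1: value of par_no_match_adapter_wildcards ("true" or "false")
# $2: value of par_revcomp ("true" or "false")
build_flags() {
  local par_no_match_adapter_wildcards="$1" par_revcomp="$2"
  # A "false" value must not produce a flag, so drop the variable.
  [[ "$par_no_match_adapter_wildcards" == "false" ]] && unset par_no_match_adapter_wildcards
  [[ "$par_revcomp" == "false" ]] && unset par_revcomp
  # ${var:+word} expands to word only when var is set and non-empty.
  echo ${par_no_match_adapter_wildcards:+--no-match-adapter-wildcards} \
       ${par_revcomp:+--revcomp}
}

build_flags true false
build_flags true true
```

With the previous `boolean_false` handling, the variable was unset on `true`, so passing the argument as `true` would have dropped the flag rather than added it.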
@@ -3785,7 +3857,7 @@ mod_args=\\$(echo \\\\
\\${par_cut_r2:+--cut_r2 "\\${par_cut_r2}"} \\\\
\\${par_nextseq_trim:+--nextseq-trim "\\${par_nextseq_trim}"} \\\\
\\${par_quality_cutoff:+--quality-cutoff "\\${par_quality_cutoff}"} \\\\
\\${par_quality_cutoff_r2:+--quality-cutoff_r2 "\\${par_quality_cutoff_r2}"} \\\\
\\${par_quality_cutoff_r2:+-Q "\\${par_quality_cutoff_r2}"} \\\\
\\${par_quality_base:+--quality-base "\\${par_quality_base}"} \\\\
\\${par_poly_a:+--poly-a} \\\\
\\${par_length:+--length "\\${par_length}"} \\\\
@@ -3854,14 +3926,35 @@ else
ext="fasta"
fi
if [ \\$mode = "se" ]; then
demultiplex_mode="\\$par_demultiplex_mode"
if [[ \\$mode == "se" ]]; then
if [[ "\\$demultiplex_mode" == "unique_dual" ]] || [[ "\\$demultiplex_mode" == "combinatorial_dual" ]]; then
echo "Demultiplexing dual indexes is not possible with single-end data."
exit 1
fi
prefix="trimmed_"
if [[ ! -z "\\$demultiplex_mode" ]]; then
prefix="{name}_"
fi
output_args=\\$(echo \\\\
--output "\\$output_dir/{name}_001.\\$ext" \\\\
--output "\\$output_dir/\\${prefix}001.\\$ext" \\\\
)
else
demultiplex_indicator_r1='{name}_'
demultiplex_indicator_r2=\\$demultiplex_indicator_r1
if [[ "\\$demultiplex_mode" == "combinatorial_dual" ]]; then
demultiplex_indicator_r1='{name1}_{name2}_'
demultiplex_indicator_r2='{name1}_{name2}_'
fi
prefix_r1="trimmed_"
prefix_r2="trimmed_"
if [[ ! -z "\\$demultiplex_mode" ]]; then
prefix_r1=\\$demultiplex_indicator_r1
prefix_r2=\\$demultiplex_indicator_r2
fi
output_args=\\$(echo \\\\
--output "\\$output_dir/{name}_R1_001.\\$ext" \\\\
--paired-output "\\$output_dir/{name}_R2_001.\\$ext" \\\\
--output "\\$output_dir/\\${prefix_r1}R1_001.\\$ext" \\\\
--paired-output "\\$output_dir/\\${prefix_r2}R2_001.\\$ext" \\\\
)
fi
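
The output naming introduced in this hunk can be summarised as one small function. The sketch below is a condensed restatement of the branches above, not the generated script: the function name `output_prefix` is hypothetical, and `mode` is assumed to be `se` for single-end data and any other value for paired-end data.

```bash
#!/usr/bin/env bash
# Hypothetical summary of the prefix selection shown above.
# $1: mode ("se" for single-end, anything else for paired-end)
# $2: demultiplex mode ("", "single", "unique_dual" or "combinatorial_dual")
output_prefix() {
  local mode="$1" demultiplex_mode="$2"
  if [[ "$mode" == "se" ]]; then
    if [[ "$demultiplex_mode" == "unique_dual" || "$demultiplex_mode" == "combinatorial_dual" ]]; then
      echo "Demultiplexing dual indexes is not possible with single-end data." >&2
      return 1
    fi
  fi
  if [[ -z "$demultiplex_mode" ]]; then
    # No demultiplexing: fixed prefix.
    echo "trimmed_"
  elif [[ "$mode" != "se" && "$demultiplex_mode" == "combinatorial_dual" ]]; then
    # Paired-end combinatorial mode: one file per R1/R2 adapter combination.
    echo "{name1}_{name2}_"
  else
    # Otherwise one file per adapter name.
    echo "{name}_"
  fi
}

output_prefix pe ""
output_prefix pe combinatorial_dual
```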
@@ -3973,7 +4066,11 @@ def vdsl3WorkflowFactory(Map args, Map meta, String rawScript) {
val = val.join(par.multiple_sep)
}
if (par.direction == "output" && par.type == "file") {
val = val.replaceAll('\\$id', id).replaceAll('\\$key', key)
val = val
.replaceAll('\\$id', id)
.replaceAll('\\$\\{id\\}', id)
.replaceAll('\\$key', key)
.replaceAll('\\$\\{key\\}', key)
}
[parName, val]
}
@@ -4104,7 +4201,8 @@ def _vdsl3ProcessFactory(Map workflowArgs, Map meta, String rawScript) {
def createParentStr = meta.config.allArguments
.findAll { it.type == "file" && it.direction == "output" && it.create_parent }
.collect { par ->
"\${ args.containsKey(\"${par.plainName}\") ? \"mkdir_parent \\\"\" + (args[\"${par.plainName}\"] instanceof String ? args[\"${par.plainName}\"] : args[\"${par.plainName}\"].join('\" \"')) + \"\\\"\" : \"\" }"
def contents = "args[\"${par.plainName}\"] instanceof List ? args[\"${par.plainName}\"].join('\" \"') : args[\"${par.plainName}\"]"
"\${ args.containsKey(\"${par.plainName}\") ? \"mkdir_parent '\" + escapeText(${contents}) + \"'\" : \"\" }"
}
.join("\n")
@@ -4112,8 +4210,8 @@ def _vdsl3ProcessFactory(Map workflowArgs, Map meta, String rawScript) {
def inputFileExports = meta.config.allArguments
.findAll { it.type == "file" && it.direction.toLowerCase() == "input" }
.collect { par ->
def viash_par_contents = "(viash_par_${par.plainName} instanceof List ? viash_par_${par.plainName}.join(\"${par.multiple_sep}\") : viash_par_${par.plainName})"
"\n\${viash_par_${par.plainName}.empty ? \"\" : \"export VIASH_PAR_${par.plainName.toUpperCase()}=\\\"\" + ${viash_par_contents} + \"\\\"\"}"
def contents = "viash_par_${par.plainName} instanceof List ? viash_par_${par.plainName}.join(\"${par.multiple_sep}\") : viash_par_${par.plainName}"
"\n\${viash_par_${par.plainName}.empty ? \"\" : \"export VIASH_PAR_${par.plainName.toUpperCase()}='\" + escapeText(${contents}) + \"'\"}"
}
// NOTE: if using docker, use /tmp instead of tmpDir!
@@ -4150,6 +4248,7 @@ def _vdsl3ProcessFactory(Map workflowArgs, Map meta, String rawScript) {
def procStr =
"""nextflow.enable.dsl=2
|
|def escapeText = { s -> s.toString().replaceAll("'", "'\\\"'\\\"'") }
|process $procKey {$drctvStrs
|input:
| tuple val(id)$inputPaths, val(args), path(resourcesDir, stageAs: ".viash_meta_resources")
@@ -4161,10 +4260,9 @@ def _vdsl3ProcessFactory(Map workflowArgs, Map meta, String rawScript) {
|$stub
|\"\"\"
|script:$assertStr
|def escapeText = { s -> s.toString().replaceAll('([`"])', '\\\\\\\\\$1') }
|def parInject = args
| .findAll{key, value -> value != null}
| .collect{key, value -> "export VIASH_PAR_\${key.toUpperCase()}=\\\"\${escapeText(value)}\\\""}
| .collect{key, value -> "export VIASH_PAR_\${key.toUpperCase()}='\${escapeText(value)}'"}
| .join("\\n")
|\"\"\"
|# meta exports
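
Switching the exports from double quotes to single quotes means a value is passed through literally, and only embedded single quotes need escaping, by closing the quote, emitting `"'"`, and reopening it. The standalone sketch below demonstrates that round trip in plain Bash; the helper name `escape_single_quotes` and the variable `VIASH_PAR_EXAMPLE` are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: make a value safe to place between single quotes by
# replacing each ' with '"'"' (close quote, quoted ', reopen quote).
escape_single_quotes() {
  local value="$1" quote="'" escaped="'\"'\"'" result
  result=${value//"$quote"/"$escaped"}
  printf '%s' "$result"
}

# A value containing a single quote, a dollar sign and double quotes.
value="it's \$HOME and \"quoted\" text"

# Build the export line the same way the wrapper would, then evaluate it.
eval "export VIASH_PAR_EXAMPLE='$(escape_single_quotes "$value")'"

# The exported variable matches the original value character for character.
[ "$VIASH_PAR_EXAMPLE" = "$value" ] && echo "round trip ok"
```

Inside single quotes the shell performs no expansion, so `$HOME` and the double quotes survive unchanged without any further escaping.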
@@ -4249,7 +4347,7 @@ meta["defaults"] = [
"container" : {
"registry" : "images.viash-hub.com",
"image" : "vsh/biobox/cutadapt",
"tag" : "v0.1.0"
"tag" : "main"
},
"tag" : "$id"
}'''),


@@ -2,8 +2,9 @@ manifest {
name = 'cutadapt'
mainScript = 'main.nf'
nextflowVersion = '!>=20.12.1-edge'
version = 'v0.1.0'
version = 'main'
description = 'Cutadapt removes adapter sequences from high-throughput sequencing reads.\n'
author = 'Toni Verbeiren'
}
process.container = 'nextflow/bash:latest'


@@ -17,8 +17,8 @@
"adapter": {
"type":
"string",
"description": "Type: List of `string`, multiple_sep: `\":\"`. Sequence of an adapter ligated to the 3\u0027 end (paired data:\nof the first read)",
"help_text": "Type: List of `string`, multiple_sep: `\":\"`. Sequence of an adapter ligated to the 3\u0027 end (paired data:\nof the first read). The adapter and subsequent bases are\ntrimmed. If a \u0027$\u0027 character is appended (\u0027anchoring\u0027), the\nadapter is only found if it is a suffix of the read.\n"
"description": "Type: List of `string`, multiple_sep: `\";\"`. Sequence of an adapter ligated to the 3\u0027 end (paired data:\nof the first read)",
"help_text": "Type: List of `string`, multiple_sep: `\";\"`. Sequence of an adapter ligated to the 3\u0027 end (paired data:\nof the first read). The adapter and subsequent bases are\ntrimmed. If a \u0027$\u0027 character is appended (\u0027anchoring\u0027), the\nadapter is only found if it is a suffix of the read.\n"
}
@@ -27,8 +27,8 @@
"front": {
"type":
"string",
"description": "Type: List of `string`, multiple_sep: `\":\"`. Sequence of an adapter ligated to the 5\u0027 end (paired data:\nof the first read)",
"help_text": "Type: List of `string`, multiple_sep: `\":\"`. Sequence of an adapter ligated to the 5\u0027 end (paired data:\nof the first read). The adapter and any preceding bases\nare trimmed. Partial matches at the 5\u0027 end are allowed. If\na \u0027^\u0027 character is prepended (\u0027anchoring\u0027), the adapter is\nonly found if it is a prefix of the read.\n"
"description": "Type: List of `string`, multiple_sep: `\";\"`. Sequence of an adapter ligated to the 5\u0027 end (paired data:\nof the first read)",
"help_text": "Type: List of `string`, multiple_sep: `\";\"`. Sequence of an adapter ligated to the 5\u0027 end (paired data:\nof the first read). The adapter and any preceding bases\nare trimmed. Partial matches at the 5\u0027 end are allowed. If\na \u0027^\u0027 character is prepended (\u0027anchoring\u0027), the adapter is\nonly found if it is a prefix of the read.\n"
}
@@ -37,8 +37,8 @@
"anywhere": {
"type":
"string",
"description": "Type: List of `string`, multiple_sep: `\":\"`. Sequence of an adapter that may be ligated to the 5\u0027 or 3\u0027\nend (paired data: of the first read)",
"help_text": "Type: List of `string`, multiple_sep: `\":\"`. Sequence of an adapter that may be ligated to the 5\u0027 or 3\u0027\nend (paired data: of the first read). Both types of\nmatches as described under -a and -g are allowed. If the\nfirst base of the read is part of the match, the behavior\nis as with -g, otherwise as with -a. This option is mostly\nfor rescuing failed library preparations - do not use if\nyou know which end your adapter was ligated to!\n"
"description": "Type: List of `string`, multiple_sep: `\";\"`. Sequence of an adapter that may be ligated to the 5\u0027 or 3\u0027\nend (paired data: of the first read)",
"help_text": "Type: List of `string`, multiple_sep: `\";\"`. Sequence of an adapter that may be ligated to the 5\u0027 or 3\u0027\nend (paired data: of the first read). Both types of\nmatches as described under -a and -g are allowed. If the\nfirst base of the read is part of the match, the behavior\nis as with -g, otherwise as with -a. This option is mostly\nfor rescuing failed library preparations - do not use if\nyou know which end your adapter was ligated to!\n"
}
@@ -57,8 +57,8 @@
"adapter_fasta": {
"type":
"string",
"description": "Type: List of `file`, multiple_sep: `\":\"`. Fasta file containing sequences of an adapter ligated to the 3\u0027 end (paired data:\nof the first read)",
"help_text": "Type: List of `file`, multiple_sep: `\":\"`. Fasta file containing sequences of an adapter ligated to the 3\u0027 end (paired data:\nof the first read). The adapter and subsequent bases are\ntrimmed. If a \u0027$\u0027 character is appended (\u0027anchoring\u0027), the\nadapter is only found if it is a suffix of the read.\n"
"description": "Type: List of `file`, multiple_sep: `\";\"`. Fasta file containing sequences of an adapter ligated to the 3\u0027 end (paired data:\nof the first read)",
"help_text": "Type: List of `file`, multiple_sep: `\";\"`. Fasta file containing sequences of an adapter ligated to the 3\u0027 end (paired data:\nof the first read). The adapter and subsequent bases are\ntrimmed. If a \u0027$\u0027 character is appended (\u0027anchoring\u0027), the\nadapter is only found if it is a suffix of the read.\n"
}
@@ -97,8 +97,8 @@
"adapter_r2": {
"type":
"string",
"description": "Type: List of `string`, multiple_sep: `\":\"`. Sequence of an adapter ligated to the 3\u0027 end (paired data:\nof the first read)",
"help_text": "Type: List of `string`, multiple_sep: `\":\"`. Sequence of an adapter ligated to the 3\u0027 end (paired data:\nof the first read). The adapter and subsequent bases are\ntrimmed. If a \u0027$\u0027 character is appended (\u0027anchoring\u0027), the\nadapter is only found if it is a suffix of the read.\n"
"description": "Type: List of `string`, multiple_sep: `\";\"`. Sequence of an adapter ligated to the 3\u0027 end (paired data:\nof the first read)",
"help_text": "Type: List of `string`, multiple_sep: `\";\"`. Sequence of an adapter ligated to the 3\u0027 end (paired data:\nof the first read). The adapter and subsequent bases are\ntrimmed. If a \u0027$\u0027 character is appended (\u0027anchoring\u0027), the\nadapter is only found if it is a suffix of the read.\n"
}
@@ -107,8 +107,8 @@
"front_r2": {
"type":
"string",
"description": "Type: List of `string`, multiple_sep: `\":\"`. Sequence of an adapter ligated to the 5\u0027 end (paired data:\nof the first read)",
"help_text": "Type: List of `string`, multiple_sep: `\":\"`. Sequence of an adapter ligated to the 5\u0027 end (paired data:\nof the first read). The adapter and any preceding bases\nare trimmed. Partial matches at the 5\u0027 end are allowed. If\na \u0027^\u0027 character is prepended (\u0027anchoring\u0027), the adapter is\nonly found if it is a prefix of the read.\n"
"description": "Type: List of `string`, multiple_sep: `\";\"`. Sequence of an adapter ligated to the 5\u0027 end (paired data:\nof the first read)",
"help_text": "Type: List of `string`, multiple_sep: `\";\"`. Sequence of an adapter ligated to the 5\u0027 end (paired data:\nof the first read). The adapter and any preceding bases\nare trimmed. Partial matches at the 5\u0027 end are allowed. If\na \u0027^\u0027 character is prepended (\u0027anchoring\u0027), the adapter is\nonly found if it is a prefix of the read.\n"
}
@@ -117,8 +117,8 @@
"anywhere_r2": {
"type":
"string",
"description": "Type: List of `string`, multiple_sep: `\":\"`. Sequence of an adapter that may be ligated to the 5\u0027 or 3\u0027\nend (paired data: of the first read)",
"help_text": "Type: List of `string`, multiple_sep: `\":\"`. Sequence of an adapter that may be ligated to the 5\u0027 or 3\u0027\nend (paired data: of the first read). Both types of\nmatches as described under -a and -g are allowed. If the\nfirst base of the read is part of the match, the behavior\nis as with -g, otherwise as with -a. This option is mostly\nfor rescuing failed library preparations - do not use if\nyou know which end your adapter was ligated to!\n"
"description": "Type: List of `string`, multiple_sep: `\";\"`. Sequence of an adapter that may be ligated to the 5\u0027 or 3\u0027\nend (paired data: of the first read)",
"help_text": "Type: List of `string`, multiple_sep: `\";\"`. Sequence of an adapter that may be ligated to the 5\u0027 or 3\u0027\nend (paired data: of the first read). Both types of\nmatches as described under -a and -g are allowed. If the\nfirst base of the read is part of the match, the behavior\nis as with -g, otherwise as with -a. This option is mostly\nfor rescuing failed library preparations - do not use if\nyou know which end your adapter was ligated to!\n"
}
@@ -180,7 +180,7 @@
"description": "Type: `boolean_true`, default: `false`. Treat adapters given with -a/-A etc",
"help_text": "Type: `boolean_true`, default: `false`. Treat adapters given with -a/-A etc. as pairs. Either both\nor none are removed from each read pair.\n"
,
"default": "False"
"default":false
}
@@ -203,7 +203,7 @@
"description": "Type: `boolean_true`, default: `false`. Read and/or write interleaved paired-end reads",
"help_text": "Type: `boolean_true`, default: `false`. Read and/or write interleaved paired-end reads.\n"
,
"default": "False"
"default":false
}
@@ -251,10 +251,10 @@
"no_indels": {
"type":
"boolean",
"description": "Type: `boolean_false`, default: `true`. Allow only mismatches in alignments",
"help_text": "Type: `boolean_false`, default: `true`. Allow only mismatches in alignments.\n"
"description": "Type: `boolean_true`, default: `false`. Allow only mismatches in alignments",
"help_text": "Type: `boolean_true`, default: `false`. Allow only mismatches in alignments.\n"
,
"default": "True"
"default":false
}
@@ -285,7 +285,7 @@
"description": "Type: `boolean_true`, default: `false`. Interpret IUPAC wildcards in reads",
"help_text": "Type: `boolean_true`, default: `false`. Interpret IUPAC wildcards in reads.\n"
,
"default": "False"
"default":false
}
@@ -293,10 +293,10 @@
"no_match_adapter_wildcards": {
"type":
"boolean",
"description": "Type: `boolean_false`, default: `true`. Do not interpret IUPAC wildcards in adapters",
"help_text": "Type: `boolean_false`, default: `true`. Do not interpret IUPAC wildcards in adapters.\n"
"description": "Type: `boolean_true`, default: `false`. Do not interpret IUPAC wildcards in adapters",
"help_text": "Type: `boolean_true`, default: `false`. Do not interpret IUPAC wildcards in adapters.\n"
,
"default": "True"
"default":false
}
@@ -319,7 +319,29 @@
"description": "Type: `boolean_true`, default: `false`. Check both the read and its reverse complement for adapter\nmatches",
"help_text": "Type: `boolean_true`, default: `false`. Check both the read and its reverse complement for adapter\nmatches. If match is on reverse-complemented version,\noutput that one.\n"
,
"default": "False"
"default":false
}
}
},
"demultiplexing options" : {
"title": "Demultiplexing options",
"type": "object",
"description": "No description",
"properties": {
"demultiplex_mode": {
"type":
"string",
"description": "Type: `string`, choices: ``single`, `unique_dual`, `combinatorial_dual``. Enable demultiplexing and set the mode for it",
"help_text": "Type: `string`, choices: ``single`, `unique_dual`, `combinatorial_dual``. Enable demultiplexing and set the mode for it.\nWith mode \u0027unique_dual\u0027, adapters from the first and second read are used,\nand the indexes from the reads are only used in pairs. This implies\n--pair_adapters.\nEnabling mode \u0027combinatorial_dual\u0027 allows all combinations of the sets of indexes\non R1 and R2. It is necessary to write each read pair to an output\nfile depending on the adapters found on both R1 and R2.\nMode \u0027single\u0027, uses indexes or barcodes located at the 5\u0027\nend of the R1 read (single). \n",
"enum": ["single", "unique_dual", "combinatorial_dual"]
}
@@ -337,8 +359,8 @@
"cut": {
"type":
"string",
"description": "Type: List of `integer`, multiple_sep: `\":\"`. Remove LEN bases from each read (or R1 if paired; use --cut_r2\noption for R2)",
"help_text": "Type: List of `integer`, multiple_sep: `\":\"`. Remove LEN bases from each read (or R1 if paired; use --cut_r2\noption for R2). If LEN is positive, remove bases from the\nbeginning. If LEN is negative, remove bases from the end.\nCan be used twice if LENs have different signs. Applied\n*before* adapter trimming.\n"
"description": "Type: List of `integer`, multiple_sep: `\";\"`. Remove LEN bases from each read (or R1 if paired; use --cut_r2\noption for R2)",
"help_text": "Type: List of `integer`, multiple_sep: `\";\"`. Remove LEN bases from each read (or R1 if paired; use --cut_r2\noption for R2). If LEN is positive, remove bases from the\nbeginning. If LEN is negative, remove bases from the end.\nCan be used twice if LENs have different signs. Applied\n*before* adapter trimming.\n"
}
@@ -347,8 +369,8 @@
"cut_r2": {
"type":
"string",
"description": "Type: List of `integer`, multiple_sep: `\":\"`. Remove LEN bases from each read (for R2)",
"help_text": "Type: List of `integer`, multiple_sep: `\":\"`. Remove LEN bases from each read (for R2). If LEN is positive, remove bases from the\nbeginning. If LEN is negative, remove bases from the end.\nCan be used twice if LENs have different signs. Applied\n*before* adapter trimming.\n"
"description": "Type: List of `integer`, multiple_sep: `\";\"`. Remove LEN bases from each read (for R2)",
"help_text": "Type: List of `integer`, multiple_sep: `\";\"`. Remove LEN bases from each read (for R2). If LEN is positive, remove bases from the\nbeginning. If LEN is negative, remove bases from the end.\nCan be used twice if LENs have different signs. Applied\n*before* adapter trimming.\n"
}
@@ -400,7 +422,7 @@
"description": "Type: `boolean_true`, default: `false`. Trim poly-A tails",
"help_text": "Type: `boolean_true`, default: `false`. Trim poly-A tails"
,
"default": "False"
"default":false
}
@@ -421,7 +443,7 @@
"description": "Type: `boolean_true`, default: `false`. Trim N\u0027s on ends of reads",
"help_text": "Type: `boolean_true`, default: `false`. Trim N\u0027s on ends of reads."
,
"default": "False"
"default":false
}
@@ -482,7 +504,7 @@
"description": "Type: `boolean_true`, default: `false`. Change negative quality values to zero",
"help_text": "Type: `boolean_true`, default: `false`. Change negative quality values to zero."
,
"default": "False"
"default":false
}
@@ -553,7 +575,7 @@
"description": "Type: `boolean_true`, default: `false`. Discard reads that contain an adapter",
"help_text": "Type: `boolean_true`, default: `false`. Discard reads that contain an adapter. Use also -O to\navoid discarding too many randomly matching reads.\n"
,
"default": "False"
"default":false
}
@@ -564,7 +586,7 @@
"description": "Type: `boolean_true`, default: `false`. Discard reads that do not contain an adapter",
"help_text": "Type: `boolean_true`, default: `false`. Discard reads that do not contain an adapter.\n"
,
"default": "False"
"default":false
}
@@ -575,7 +597,7 @@
"description": "Type: `boolean_true`, default: `false`. Discard reads that did not pass CASAVA filtering (header\nhas :Y:)",
"help_text": "Type: `boolean_true`, default: `false`. Discard reads that did not pass CASAVA filtering (header\nhas :Y:).\n"
,
"default": "False"
"default":false
}
@@ -608,7 +630,7 @@
"description": "Type: `boolean_true`, default: `false`. Write report in JSON format to this file",
"help_text": "Type: `boolean_true`, default: `false`. Write report in JSON format to this file.\n"
,
"default": "False"
"default":false
}
@@ -616,10 +638,10 @@
"output": {
"type":
"string",
"description": "Type: List of `file`, required, default: `$id.$key.output_*.fast[a,q]`, example: `fastq/*_001.fast[a,q]`, multiple_sep: `\":\"`. Glob pattern for matching the expected output files",
"help_text": "Type: List of `file`, required, default: `$id.$key.output_*.fast[a,q]`, example: `fastq/*_001.fast[a,q]`, multiple_sep: `\":\"`. Glob pattern for matching the expected output files.\nShould include `$output_dir`.\n"
"description": "Type: List of `file`, required, default: `$id.$key.output_*.fast[a,q]`, example: `fastq/*_001.fast[a,q]`, multiple_sep: `\";\"`. Glob pattern for matching the expected output files",
"help_text": "Type: List of `file`, required, default: `$id.$key.output_*.fast[a,q]`, example: `fastq/*_001.fast[a,q]`, multiple_sep: `\";\"`. Glob pattern for matching the expected output files.\nShould include `$output_dir`.\n"
,
"default": "$id.$key.output_*.fast[a,q]"
"default":"$id.$key.output_*.fast[a,q]"
}
@@ -630,7 +652,7 @@
"description": "Type: `boolean_true`, default: `false`. Output FASTA to standard output even on FASTQ input",
"help_text": "Type: `boolean_true`, default: `false`. Output FASTA to standard output even on FASTQ input.\n"
,
"default": "False"
"default":false
}
@@ -641,7 +663,7 @@
"description": "Type: `boolean_true`, default: `false`. Write information about each read and its adapter matches\ninto info",
"help_text": "Type: `boolean_true`, default: `false`. Write information about each read and its adapter matches\ninto info.txt in the output directory.\nSee the documentation for the file format.\n"
,
"default": "False"
"default":false
}
@@ -662,7 +684,7 @@
"description": "Type: `boolean_true`, default: `false`. Print debug information",
"help_text": "Type: `boolean_true`, default: `false`. Print debug information"
,
"default": "False"
"default":false
}
@@ -726,6 +748,10 @@
"$ref": "#/definitions/input parameters"
},
{
"$ref": "#/definitions/demultiplexing options"
},
{
"$ref": "#/definitions/read modifications"
},


@@ -1,5 +1,18 @@
name: "falco"
version: "v0.1.0"
version: "main"
authors:
- name: "Toni Verbeiren"
roles:
- "author"
- "maintainer"
info:
links:
github: "tverbeiren"
linkedin: "verbeiren"
organizations:
- name: "Data Intuitive"
href: "https://www.data-intuitive.com"
role: "Data Scientist and CEO"
argument_groups:
- name: "Input arguments"
arguments:
@@ -88,7 +101,7 @@ argument_groups:
info: null
direction: "input"
- type: "boolean_true"
name: "--reverse_complliment"
name: "--reverse_complement"
alternatives:
- "-r"
description: "[Falco only] The input is a \nreverse-complement. All modules will\
@@ -274,7 +287,7 @@ engines:
id: "docker"
image: "debian:trixie-slim"
target_registry: "images.viash-hub.com"
target_tag: "v0.1.0"
target_tag: "main"
namespace_separator: "/"
setup:
- type: "apt"
@@ -303,22 +316,23 @@ build_info:
engine: "docker|native"
output: "target/nextflow/falco"
executable: "target/nextflow/falco/main.nf"
viash_version: "0.9.0-RC6"
git_commit: "b84b29747d0635f2ac83ea63b496be9a9edb6724"
git_remote: "https://github.com/viash-hub/biobox"
viash_version: "0.9.0"
git_commit: "952ff0843093b538cbfd6fefdecf2e7a0bc9e70b"
git_remote: "https://x-access-token:ghs_EwAUAMYJ0K4VBHlAEMs4ZP2OyQYqJM0PSfEO@github.com/viash-hub/biobox"
git_tag: "v0.2.0-27-g952ff08"
package_config:
name: "biobox"
version: "v0.1.0"
version: "main"
description: "A collection of bioinformatics tools for working with sequence data.\n"
info: null
viash_version: "0.9.0-RC6"
viash_version: "0.9.0"
source: "src"
target: "target"
config_mods:
- ".requirements.commands := ['ps']\n"
- ".engines += { type: \"native\" }"
- ".engines[.type == 'docker'].target_registry := 'images.viash-hub.com'"
- ".engines[.type == 'docker'].target_tag := 'v0.1.0'"
- ".engines[.type == 'docker'].target_tag := 'main'"
keywords:
- "bioinformatics"
- "modules"


@@ -1,13 +1,16 @@
// falco v0.1.0
// falco main
//
// This wrapper script is auto-generated by viash 0.9.0-RC6 and is thus a
// derivative work thereof. This software comes with ABSOLUTELY NO WARRANTY from
// Data Intuitive.
// This wrapper script is auto-generated by viash 0.9.0 and is thus a derivative
// work thereof. This software comes with ABSOLUTELY NO WARRANTY from Data
// Intuitive.
//
// The component may contain files which fall under a different license. The
// authors of this component should specify the license in the header of such
// files, or include a separate license file detailing the licenses of all included
// files.
//
// Component authors:
// * Toni Verbeiren (author, maintainer)
////////////////////////////
// VDSL3 helper functions //
@@ -760,8 +763,11 @@ def runEach(Map args) {
def fromState_ = args.fromState
def toState_ = args.toState
def filter_ = args.filter
def runIf_ = args.runIf
def id_ = args.id
assert !runIf_ || runIf_ instanceof Closure: "runEach: must pass a Closure to runIf."
workflow runEachWf {
take: input_ch
main:
@@ -783,7 +789,20 @@ def runEach(Map args) {
[new_id] + tup.drop(1)
}
: filter_ch
def data_ch = id_ch | map{tup ->
def chPassthrough = null
def chRun = null
if (runIf_) {
def idRunIfBranch = id_ch.branch{ tup ->
run: runIf_(tup[0], tup[1], comp_)
passthrough: true
}
chPassthrough = idRunIfBranch.passthrough
chRun = idRunIfBranch.run
} else {
chRun = id_ch
chPassthrough = Channel.empty()
}
def data_ch = chRun | map{tup ->
def new_data = tup[1]
if (fromState_ instanceof Map) {
new_data = fromState_.collectEntries{ key0, key1 ->
@@ -821,8 +840,11 @@ def runEach(Map args) {
[tup[0], new_state] + tup.drop(3)
}
: out_ch
def return_ch = post_ch
| concat(chPassthrough)
post_ch
return_ch
}
// mix all results
@@ -1598,8 +1620,8 @@ def findStates(Map params, Map config) {
// construct renameMap
if (args.rename_keys) {
def renameMap = args.rename_keys.collectEntries{renameString ->
def split = renameString.split(";")
assert split.size() == 2: "Argument 'rename_keys' should be of the form 'newKey:oldKey,newKey:oldKey'"
def split = renameString.split(":")
assert split.size() == 2: "Argument 'rename_keys' should be of the form 'newKey:oldKey', or 'newKey:oldKey;newKey:oldKey' in case of multiple values"
split
}
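The hunk above changes how each `rename_keys` entry is parsed: a single entry is split on `:` into `newKey:oldKey`, and `;` separates multiple entries. A minimal bash sketch of that parsing, assuming a hypothetical helper `parse_rename_keys` that is not part of the generated script:

```shell
#!/usr/bin/env bash
# Hypothetical helper: parse 'newKey:oldKey;newKey:oldKey' and print one
# 'newKey <- oldKey' line per entry. Returns 1 on a malformed entry.
parse_rename_keys() {
  local pairs parts pair
  # Entries are separated by ";".
  IFS=';' read -ra pairs <<< "$1"
  for pair in "${pairs[@]}"; do
    # Each entry must split on ":" into exactly two parts.
    IFS=':' read -ra parts <<< "$pair"
    if [ "${#parts[@]}" -ne 2 ]; then
      echo "rename_keys entries should be of the form newKey:oldKey, got: $pair" >&2
      return 1
    fi
    printf '%s <- %s\n' "${parts[0]}" "${parts[1]}"
  done
}

parse_rename_keys 'reads:output;report:html'
```

The key names in the call are invented for illustration; only the `newKey:oldKey` shape comes from the assertion message in the diff.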
@@ -1709,7 +1731,9 @@ def publishStates(Map args) {
def yamlFilename = yamlTemplate_
.replaceAll('\\$id', id_)
.replaceAll('\\$\\{id\\}', id_)
.replaceAll('\\$key', key_)
.replaceAll('\\$\\{key\\}', key_)
// TODO: do the pathnames in state_ match up with the outputFilenames_?
@@ -1780,7 +1804,9 @@ def publishStatesByConfig(Map args) {
def yamlTemplate = params.containsKey("output_state") ? params.output_state : '$id.$key.state.yaml'
def yamlFilename = yamlTemplate
.replaceAll('\\$id', id_)
.replaceAll('\\$\\{id\\}', id_)
.replaceAll('\\$key', key_)
.replaceAll('\\$\\{key\\}', key_)
def yamlDir = java.nio.file.Paths.get(yamlFilename).getParent()
// the processed state is a list of [key, value, inputPath, outputFilename] tuples, where
@@ -1822,7 +1848,9 @@ def publishStatesByConfig(Map args) {
// instantiate the template
def filename = filenameTemplate
.replaceAll('\\$id', id_)
.replaceAll('\\$\\{id\\}', id_)
.replaceAll('\\$key', key_)
.replaceAll('\\$\\{key\\}', key_)
if (par.multiple) {
// if the parameter is multiple: true, the filename
// should contain a wildcard '*' that is replaced with
@@ -2626,30 +2654,31 @@ def workflowFactory(Map args, Map defaultWfArgs, Map meta) {
tuple
}
def chModifiedFiltered = workflowArgs.filter ?
chModified | filter{workflowArgs.filter(it)} :
chModified
def chRun = null
def chPassthrough = null
if (workflowArgs.runIf) {
def runIfBranch = chModifiedFiltered.branch{ tup ->
def runIfBranch = chModified.branch{ tup ->
run: workflowArgs.runIf(tup[0], tup[1])
passthrough: true
}
chRun = runIfBranch.run
chPassthrough = runIfBranch.passthrough
} else {
chRun = chModifiedFiltered
chRun = chModified
chPassthrough = Channel.empty()
}
def chRunFiltered = workflowArgs.filter ?
chRun | filter{workflowArgs.filter(it)} :
chRun
def chArgs = workflowArgs.fromState ?
chRun | map{
chRunFiltered | map{
def new_data = workflowArgs.fromState(it.take(2))
[it[0], new_data]
} :
chRun | map {tup -> tup.take(2)}
chRunFiltered | map {tup -> tup.take(2)}
// fill in defaults
def chArgsWithDefaults = chArgs
@@ -2720,7 +2749,7 @@ def workflowFactory(Map args, Map defaultWfArgs, Map meta) {
// | view{"chInitialOutput: ${it.take(3)}"}
// join the output [prev_id, new_id, output] with the previous state [prev_id, state, ...]
def chNewState = safeJoin(chInitialOutput, chModifiedFiltered, key_)
def chNewState = safeJoin(chInitialOutput, chRunFiltered, key_)
// input tuple format: [join_id, id, output, prev_state, ...]
// output tuple format: [join_id, id, new_state, ...]
| map{ tup ->
@@ -2779,7 +2808,29 @@ meta = [
"resources_dir": moduleDir.toRealPath().normalize(),
"config": processConfig(readJsonBlob('''{
"name" : "falco",
"version" : "v0.1.0",
"version" : "main",
"authors" : [
{
"name" : "Toni Verbeiren",
"roles" : [
"author",
"maintainer"
],
"info" : {
"links" : {
"github" : "tverbeiren",
"linkedin" : "verbeiren"
},
"organizations" : [
{
"name" : "Data Intuitive",
"href" : "https://www.data-intuitive.com",
"role" : "Data Scientist and CEO"
}
]
}
}
],
"argument_groups" : [
{
"name" : "Input arguments",
@@ -2868,7 +2919,7 @@ meta = [
},
{
"type" : "boolean_true",
"name" : "--reverse_complliment",
"name" : "--reverse_complement",
"alternatives" : [
"-r"
],
@@ -3080,7 +3131,7 @@ meta = [
"id" : "docker",
"image" : "debian:trixie-slim",
"target_registry" : "images.viash-hub.com",
"target_tag" : "v0.1.0",
"target_tag" : "main",
"namespace_separator" : "/",
"setup" : [
{
@@ -3118,22 +3169,23 @@ meta = [
"runner" : "nextflow",
"engine" : "docker|native",
"output" : "target/nextflow/falco",
"viash_version" : "0.9.0-RC6",
"git_commit" : "b84b29747d0635f2ac83ea63b496be9a9edb6724",
"git_remote" : "https://github.com/viash-hub/biobox"
"viash_version" : "0.9.0",
"git_commit" : "952ff0843093b538cbfd6fefdecf2e7a0bc9e70b",
"git_remote" : "https://x-access-token:ghs_EwAUAMYJ0K4VBHlAEMs4ZP2OyQYqJM0PSfEO@github.com/viash-hub/biobox",
"git_tag" : "v0.2.0-27-g952ff08"
},
"package_config" : {
"name" : "biobox",
"version" : "v0.1.0",
"version" : "main",
"description" : "A collection of bioinformatics tools for working with sequence data.\n",
"viash_version" : "0.9.0-RC6",
"viash_version" : "0.9.0",
"source" : "src",
"target" : "target",
"config_mods" : [
".requirements.commands := ['ps']\n",
".engines += { type: \\"native\\" }",
".engines[.type == 'docker'].target_registry := 'images.viash-hub.com'",
".engines[.type == 'docker'].target_tag := 'v0.1.0'"
".engines[.type == 'docker'].target_tag := 'main'"
],
"keywords" : [
"bioinformatics",
@@ -3168,7 +3220,7 @@ $( if [ ! -z ${VIASH_PAR_ADAPTERS+x} ]; then echo "${VIASH_PAR_ADAPTERS}" | sed
$( if [ ! -z ${VIASH_PAR_LIMITS+x} ]; then echo "${VIASH_PAR_LIMITS}" | sed "s#'#'\\"'\\"'#g;s#.*#par_limits='&'#" ; else echo "# par_limits="; fi )
$( if [ ! -z ${VIASH_PAR_SUBSAMPLE+x} ]; then echo "${VIASH_PAR_SUBSAMPLE}" | sed "s#'#'\\"'\\"'#g;s#.*#par_subsample='&'#" ; else echo "# par_subsample="; fi )
$( if [ ! -z ${VIASH_PAR_BISULFITE+x} ]; then echo "${VIASH_PAR_BISULFITE}" | sed "s#'#'\\"'\\"'#g;s#.*#par_bisulfite='&'#" ; else echo "# par_bisulfite="; fi )
$( if [ ! -z ${VIASH_PAR_REVERSE_COMPLLIMENT+x} ]; then echo "${VIASH_PAR_REVERSE_COMPLLIMENT}" | sed "s#'#'\\"'\\"'#g;s#.*#par_reverse_complliment='&'#" ; else echo "# par_reverse_complliment="; fi )
$( if [ ! -z ${VIASH_PAR_REVERSE_COMPLEMENT+x} ]; then echo "${VIASH_PAR_REVERSE_COMPLEMENT}" | sed "s#'#'\\"'\\"'#g;s#.*#par_reverse_complement='&'#" ; else echo "# par_reverse_complement="; fi )
$( if [ ! -z ${VIASH_PAR_OUTDIR+x} ]; then echo "${VIASH_PAR_OUTDIR}" | sed "s#'#'\\"'\\"'#g;s#.*#par_outdir='&'#" ; else echo "# par_outdir="; fi )
$( if [ ! -z ${VIASH_PAR_FORMAT+x} ]; then echo "${VIASH_PAR_FORMAT}" | sed "s#'#'\\"'\\"'#g;s#.*#par_format='&'#" ; else echo "# par_format="; fi )
$( if [ ! -z ${VIASH_PAR_DATA_FILENAME+x} ]; then echo "${VIASH_PAR_DATA_FILENAME}" | sed "s#'#'\\"'\\"'#g;s#.*#par_data_filename='&'#" ; else echo "# par_data_filename="; fi )
@@ -3200,7 +3252,7 @@ set -eo pipefail
[[ "\\$par_nogroup" == "false" ]] && unset par_nogroup
[[ "\\$par_bisulfite" == "false" ]] && unset par_bisulfite
[[ "\\$par_reverse_compliment" == "false" ]] && unset par_reverse_compliment
[[ "\\$par_reverse_complement" == "false" ]] && unset par_reverse_complement
IFS=";" read -ra input <<< \\$par_input
@@ -3211,7 +3263,7 @@ IFS=";" read -ra input <<< \\$par_input
\\${par_limits:+--limits "\\$par_limits"} \\\\
\\${par_subsample:+-subsample \\$par_subsample} \\\\
\\${par_bisulfite:+-bisulfite} \\\\
\\${par_reverse_compliment:+-reverse-compliment} \\\\
\\${par_reverse_complement:+-reverse-complement} \\\\
\\${par_outdir:+--outdir "\\$par_outdir"} \\\\
\\${par_format:+--format "\\$par_format"} \\\\
\\${par_data_filename:+-data-filename "\\$par_data_filename"} \\\\
@@ -3298,7 +3350,11 @@ def vdsl3WorkflowFactory(Map args, Map meta, String rawScript) {
val = val.join(par.multiple_sep)
}
if (par.direction == "output" && par.type == "file") {
val = val.replaceAll('\\$id', id).replaceAll('\\$key', key)
val = val
.replaceAll('\\$id', id)
.replaceAll('\\$\\{id\\}', id)
.replaceAll('\\$key', key)
.replaceAll('\\$\\{key\\}', key)
}
[parName, val]
}
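The hunk above extends output filename templates so that both `$id`/`$key` and the braced forms `${id}`/`${key}` are substituted. A rough bash equivalent, assuming a hypothetical function `render_template` (the generated code does this in Groovy with `replaceAll`):

```shell
#!/usr/bin/env bash
# Hypothetical helper: fill in $id, ${id}, $key and ${key} in a template.
render_template() {
  local out="$1" id="$2" key="$3"
  # Keep the placeholders in single-quoted variables so they stay literal.
  local plain_id='$id' braced_id='${id}'
  local plain_key='$key' braced_key='${key}'
  # Quoting the pattern makes bash match it literally, not as a glob.
  out=${out//"$braced_id"/"$id"}
  out=${out//"$plain_id"/"$id"}
  out=${out//"$braced_key"/"$key"}
  out=${out//"$plain_key"/"$key"}
  printf '%s\n' "$out"
}

render_template '$id.$key.state.yaml' sample1 falco
render_template '${id}/${key}.state.yaml' sample1 falco
```

The first template is the default shown in `publishStatesByConfig`; the second is an invented example of the braced form, and `sample1` is a made-up id.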
@@ -3429,7 +3485,8 @@ def _vdsl3ProcessFactory(Map workflowArgs, Map meta, String rawScript) {
def createParentStr = meta.config.allArguments
.findAll { it.type == "file" && it.direction == "output" && it.create_parent }
.collect { par ->
"\${ args.containsKey(\"${par.plainName}\") ? \"mkdir_parent \\\"\" + (args[\"${par.plainName}\"] instanceof String ? args[\"${par.plainName}\"] : args[\"${par.plainName}\"].join('\" \"')) + \"\\\"\" : \"\" }"
def contents = "args[\"${par.plainName}\"] instanceof List ? args[\"${par.plainName}\"].join('\" \"') : args[\"${par.plainName}\"]"
"\${ args.containsKey(\"${par.plainName}\") ? \"mkdir_parent '\" + escapeText(${contents}) + \"'\" : \"\" }"
}
.join("\n")
@@ -3437,8 +3494,8 @@ def _vdsl3ProcessFactory(Map workflowArgs, Map meta, String rawScript) {
def inputFileExports = meta.config.allArguments
.findAll { it.type == "file" && it.direction.toLowerCase() == "input" }
.collect { par ->
def viash_par_contents = "(viash_par_${par.plainName} instanceof List ? viash_par_${par.plainName}.join(\"${par.multiple_sep}\") : viash_par_${par.plainName})"
"\n\${viash_par_${par.plainName}.empty ? \"\" : \"export VIASH_PAR_${par.plainName.toUpperCase()}=\\\"\" + ${viash_par_contents} + \"\\\"\"}"
def contents = "viash_par_${par.plainName} instanceof List ? viash_par_${par.plainName}.join(\"${par.multiple_sep}\") : viash_par_${par.plainName}"
"\n\${viash_par_${par.plainName}.empty ? \"\" : \"export VIASH_PAR_${par.plainName.toUpperCase()}='\" + escapeText(${contents}) + \"'\"}"
}
// NOTE: if using docker, use /tmp instead of tmpDir!
@@ -3475,6 +3532,7 @@ def _vdsl3ProcessFactory(Map workflowArgs, Map meta, String rawScript) {
def procStr =
"""nextflow.enable.dsl=2
|
|def escapeText = { s -> s.toString().replaceAll("'", "'\\\"'\\\"'") }
|process $procKey {$drctvStrs
|input:
| tuple val(id)$inputPaths, val(args), path(resourcesDir, stageAs: ".viash_meta_resources")
@@ -3486,10 +3544,9 @@ def _vdsl3ProcessFactory(Map workflowArgs, Map meta, String rawScript) {
|$stub
|\"\"\"
|script:$assertStr
|def escapeText = { s -> s.toString().replaceAll('([`"])', '\\\\\\\\\$1') }
|def parInject = args
| .findAll{key, value -> value != null}
| .collect{key, value -> "export VIASH_PAR_\${key.toUpperCase()}=\\\"\${escapeText(value)}\\\""}
| .collect{key, value -> "export VIASH_PAR_\${key.toUpperCase()}='\${escapeText(value)}'"}
| .join("\\n")
|\"\"\"
|# meta exports
@@ -3574,7 +3631,7 @@ meta["defaults"] = [
"container" : {
"registry" : "images.viash-hub.com",
"image" : "vsh/biobox/falco",
"tag" : "v0.1.0"
"tag" : "main"
},
"tag" : "$id"
}'''),
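In `_vdsl3ProcessFactory`, the diff above switches the generated `export VIASH_PAR_...` lines from double quotes to single quotes and rewrites `escapeText` to replace each `'` with `'"'"'`. The sketch below shows why that round-trips arbitrary text through the shell; `escape_single_quotes` and `VIASH_PAR_EXAMPLE` are hypothetical names used only for this illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the new escapeText closure: every single
# quote becomes '"'"' (close the quote, emit a quoted ', reopen the quote).
escape_single_quotes() {
  printf '%s' "$1" | sed "s#'#'\"'\"'#g"
}

# A value containing a single quote, a dollar sign and backticks.
value="it's \$HOME and \`date\`"
escaped="$(escape_single_quotes "$value")"

# Build the export line the way the generated script does, then evaluate it.
eval "export VIASH_PAR_EXAMPLE='${escaped}'"

# The variable holds the original text; nothing was expanded or executed.
printf '%s\n' "$VIASH_PAR_EXAMPLE"
```

Inside single quotes the shell performs no expansion, so the single quote itself is the only character that needs escaping. The removed double-quoted variant escaped only backticks and double quotes.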


@@ -2,8 +2,9 @@ manifest {
name = 'falco'
mainScript = 'main.nf'
nextflowVersion = '!>=20.12.1-edge'
version = 'v0.1.0'
version = 'main'
description = 'A C++ drop-in replacement of FastQC to assess the quality of sequence read data'
author = 'Toni Verbeiren'
}
process.container = 'nextflow/bash:latest'


@@ -17,8 +17,8 @@
"input": {
"type":
"string",
"description": "Type: List of `file`, required, example: `input1.fastq;input2.fastq`, multiple_sep: `\":\"`. input fastq files",
"help_text": "Type: List of `file`, required, example: `input1.fastq;input2.fastq`, multiple_sep: `\":\"`. input fastq files"
"description": "Type: List of `file`, required, example: `input1.fastq;input2.fastq`, multiple_sep: `\";\"`. input fastq files",
"help_text": "Type: List of `file`, required, example: `input1.fastq;input2.fastq`, multiple_sep: `\";\"`. input fastq files"
}
@@ -40,7 +40,7 @@
"description": "Type: `boolean_true`, default: `false`. Disable grouping of bases for reads \u003e50bp",
"help_text": "Type: `boolean_true`, default: `false`. Disable grouping of bases for reads \u003e50bp. \nAll reports will show data for every base in \nthe read. WARNING: When using this option, \nyour plots may end up a ridiculous size. You \nhave been warned!\n"
,
"default": "False"
"default":false
}
@@ -91,18 +91,18 @@
"description": "Type: `boolean_true`, default: `false`. [Falco only] reads are whole genome \nbisulfite sequencing, and more Ts and fewer \nCs are therefore expected and will be \naccounted for in base content",
"help_text": "Type: `boolean_true`, default: `false`. [Falco only] reads are whole genome \nbisulfite sequencing, and more Ts and fewer \nCs are therefore expected and will be \naccounted for in base content.\n"
,
"default": "False"
"default":false
}
,
"reverse_complliment": {
"reverse_complement": {
"type":
"boolean",
"description": "Type: `boolean_true`, default: `false`. [Falco only] The input is a \nreverse-complement",
"help_text": "Type: `boolean_true`, default: `false`. [Falco only] The input is a \nreverse-complement. All modules will be \ntested by swapping A/T and C/G\n"
,
"default": "False"
"default":false
}
@@ -123,7 +123,7 @@
"description": "Type: `file`, required, default: `$id.$key.outdir.outdir`, example: `output`. Create all output files in the specified \noutput directory",
"help_text": "Type: `file`, required, default: `$id.$key.outdir.outdir`, example: `output`. Create all output files in the specified \noutput directory. FALCO-SPECIFIC: If the \ndirectory does not exists, the program will \ncreate it.\n"
,
"default": "$id.$key.outdir.outdir"
"default":"$id.$key.outdir.outdir"
}
@@ -146,7 +146,7 @@
"description": "Type: `file`, default: `$id.$key.data_filename.data_filename`. [Falco only] Specify filename for FastQC \ndata output (TXT)",
"help_text": "Type: `file`, default: `$id.$key.data_filename.data_filename`. [Falco only] Specify filename for FastQC \ndata output (TXT). If not specified, it will \nbe called fastq_data.txt in either the input \nfile\u0027s directory or the one specified in the \n--output flag. Only available when running \nfalco with a single input.\n"
,
"default": "$id.$key.data_filename.data_filename"
"default":"$id.$key.data_filename.data_filename"
}
@@ -157,7 +157,7 @@
"description": "Type: `file`, default: `$id.$key.report_filename.report_filename`. [Falco only] Specify filename for FastQC \nreport output (HTML)",
"help_text": "Type: `file`, default: `$id.$key.report_filename.report_filename`. [Falco only] Specify filename for FastQC \nreport output (HTML). If not specified, it \nwill be called fastq_report.html in either \nthe input file\u0027s directory or the one \nspecified in the --output flag. Only \navailable when running falco with a single \ninput.\n"
,
"default": "$id.$key.report_filename.report_filename"
"default":"$id.$key.report_filename.report_filename"
}
@@ -168,7 +168,7 @@
"description": "Type: `file`, default: `$id.$key.summary_filename.summary_filename`. [Falco only] Specify filename for the short \nsummary output (TXT)",
"help_text": "Type: `file`, default: `$id.$key.summary_filename.summary_filename`. [Falco only] Specify filename for the short \nsummary output (TXT). If not specified, it \nwill be called fastq_report.html in either \nthe input file\u0027s directory or the one \nspecified in the --output flag. Only \navailable when running falco with a single \ninput.\n"
,
"default": "$id.$key.summary_filename.summary_filename"
"default":"$id.$key.summary_filename.summary_filename"
}


@@ -1,5 +1,19 @@
name: "multiqc"
version: "v0.1.0"
version: "main"
authors:
- name: "Dorien Roosen"
roles:
- "author"
- "maintainer"
info:
links:
email: "dorien@data-intuitive.com"
github: "dorien-er"
linkedin: "dorien-roosen"
organizations:
- name: "Data Intuitive"
href: "https://www.data-intuitive.com"
role: "Data Scientist"
argument_groups:
- name: "Input"
arguments:
@@ -63,39 +77,43 @@ argument_groups:
description: "Use only these module"
info: null
example:
- "fastqc,cutadapt"
- "fastqc"
- "cutadapt"
required: false
direction: "input"
multiple: true
multiple_sep: ","
multiple_sep: ";"
- type: "string"
name: "--exclude_modules"
description: "Do not use only these modules"
info: null
example:
- "fastqc,cutadapt"
- "fastqc"
- "cutadapt"
required: false
direction: "input"
multiple: true
multiple_sep: ","
multiple_sep: ";"
- type: "string"
name: "--ignore_analysis"
info: null
example:
- "run_one/*,run_two/*"
- "run_one/*"
- "run_two/*"
required: false
direction: "input"
multiple: true
multiple_sep: ","
multiple_sep: ";"
- type: "string"
name: "--ignore_samples"
info: null
example:
- "sample_1*,sample_3*"
- "sample_1*"
- "sample_3*"
required: false
direction: "input"
multiple: true
multiple_sep: ","
multiple_sep: ";"
- type: "boolean_true"
name: "--ignore_symlinks"
description: "Ignore symlinked directories and files"
@@ -415,7 +433,7 @@ engines:
id: "docker"
image: "quay.io/biocontainers/multiqc:1.21--pyhdfd78af_0"
target_registry: "images.viash-hub.com"
target_tag: "v0.1.0"
target_tag: "main"
namespace_separator: "/"
setup:
- type: "docker"
@@ -437,22 +455,23 @@ build_info:
engine: "docker|native"
output: "target/nextflow/multiqc"
executable: "target/nextflow/multiqc/main.nf"
viash_version: "0.9.0-RC6"
git_commit: "b84b29747d0635f2ac83ea63b496be9a9edb6724"
git_remote: "https://github.com/viash-hub/biobox"
viash_version: "0.9.0"
git_commit: "952ff0843093b538cbfd6fefdecf2e7a0bc9e70b"
git_remote: "https://x-access-token:ghs_EwAUAMYJ0K4VBHlAEMs4ZP2OyQYqJM0PSfEO@github.com/viash-hub/biobox"
git_tag: "v0.2.0-27-g952ff08"
package_config:
name: "biobox"
version: "v0.1.0"
version: "main"
description: "A collection of bioinformatics tools for working with sequence data.\n"
info: null
viash_version: "0.9.0-RC6"
viash_version: "0.9.0"
source: "src"
target: "target"
config_mods:
- ".requirements.commands := ['ps']\n"
- ".engines += { type: \"native\" }"
- ".engines[.type == 'docker'].target_registry := 'images.viash-hub.com'"
- ".engines[.type == 'docker'].target_tag := 'v0.1.0'"
- ".engines[.type == 'docker'].target_tag := 'main'"
keywords:
- "bioinformatics"
- "modules"
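
The config above changes `multiple_sep` from `,` to `;` for the list-valued arguments and splits the examples into one item per line. A minimal bash sketch of how a `;`-joined value can be split back into items, using the same `IFS=";" read -ra` idiom as the falco wrapper script; the variable name `par_exclude_modules` is an assumption based on Viash's `par_<name>` convention:

```shell
#!/usr/bin/env bash
# Assumed value for a `multiple: true` argument, joined with the new
# separator ";" (module names taken from the example in the config).
par_exclude_modules='fastqc;cutadapt'

# Split on ";" into a bash array, one element per list item.
IFS=';' read -ra modules <<< "$par_exclude_modules"

for module in "${modules[@]}"; do
  printf 'module: %s\n' "$module"
done
```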


@@ -1,13 +1,16 @@
// multiqc v0.1.0
// multiqc main
//
// This wrapper script is auto-generated by viash 0.9.0-RC6 and is thus a
// derivative work thereof. This software comes with ABSOLUTELY NO WARRANTY from
// Data Intuitive.
// This wrapper script is auto-generated by viash 0.9.0 and is thus a derivative
// work thereof. This software comes with ABSOLUTELY NO WARRANTY from Data
// Intuitive.
//
// The component may contain files which fall under a different license. The
// authors of this component should specify the license in the header of such
// files, or include a separate license file detailing the licenses of all included
// files.
//
// Component authors:
// * Dorien Roosen (author, maintainer)
////////////////////////////
// VDSL3 helper functions //
@@ -760,8 +763,11 @@ def runEach(Map args) {
def fromState_ = args.fromState
def toState_ = args.toState
def filter_ = args.filter
def runIf_ = args.runIf
def id_ = args.id
assert !runIf_ || runIf_ instanceof Closure: "runEach: must pass a Closure to runIf."
workflow runEachWf {
take: input_ch
main:
@@ -783,7 +789,20 @@ def runEach(Map args) {
[new_id] + tup.drop(1)
}
: filter_ch
def data_ch = id_ch | map{tup ->
def chPassthrough = null
def chRun = null
if (runIf_) {
def idRunIfBranch = id_ch.branch{ tup ->
run: runIf_(tup[0], tup[1], comp_)
passthrough: true
}
chPassthrough = idRunIfBranch.passthrough
chRun = idRunIfBranch.run
} else {
chRun = id_ch
chPassthrough = Channel.empty()
}
def data_ch = chRun | map{tup ->
def new_data = tup[1]
if (fromState_ instanceof Map) {
new_data = fromState_.collectEntries{ key0, key1 ->
@@ -821,8 +840,11 @@ def runEach(Map args) {
[tup[0], new_state] + tup.drop(3)
}
: out_ch
def return_ch = post_ch
| concat(chPassthrough)
post_ch
return_ch
}
// mix all results
@@ -1598,8 +1620,8 @@ def findStates(Map params, Map config) {
// construct renameMap
if (args.rename_keys) {
def renameMap = args.rename_keys.collectEntries{renameString ->
def split = renameString.split(";")
assert split.size() == 2: "Argument 'rename_keys' should be of the form 'newKey:oldKey,newKey:oldKey'"
def split = renameString.split(":")
assert split.size() == 2: "Argument 'rename_keys' should be of the form 'newKey:oldKey', or 'newKey:oldKey;newKey:oldKey' in case of multiple values"
split
}
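The corrected parsing in this hunk can be illustrated outside Groovy. The following is a minimal bash sketch, not the viash implementation: it assumes a hypothetical `rename_keys` value in which pairs are separated by `;` and each pair has the form `newKey:oldKey`, as the updated assertion message describes. The key names are invented for illustration.

```bash
#!/bin/bash
# Hypothetical input: two rename pairs separated by ";".
rename_keys="input:output_bam;reference:genome_fasta"

new_keys=()
old_keys=()

# First split the list of pairs on ";" ...
IFS=";" read -ra pairs <<< "$rename_keys"
for pair in "${pairs[@]}"; do
  # ... then split each pair on ":" and require exactly two parts.
  IFS=":" read -ra parts <<< "$pair"
  if [[ ${#parts[@]} -ne 2 ]]; then
    echo "rename_keys entries should be of the form 'newKey:oldKey'" >&2
    exit 1
  fi
  new_keys+=("${parts[0]}")
  old_keys+=("${parts[1]}")
done

for i in "${!new_keys[@]}"; do
  echo "${old_keys[$i]} -> ${new_keys[$i]}"
done
```

Splitting a single pair on `;`, as the previous code did, can never yield the two parts the assertion expects, which is presumably why the separator was changed to `:`.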
@@ -1709,7 +1731,9 @@ def publishStates(Map args) {
def yamlFilename = yamlTemplate_
.replaceAll('\\$id', id_)
.replaceAll('\\$\\{id\\}', id_)
.replaceAll('\\$key', key_)
.replaceAll('\\$\\{key\\}', key_)
// TODO: do the pathnames in state_ match up with the outputFilenames_?
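The added `replaceAll` calls make the templates accept `${id}` and `${key}` in addition to `$id` and `$key`. A rough bash equivalent, assuming hypothetical values for `id` and `key` and a made-up template string, could look like this:

```bash
#!/bin/bash
# Hypothetical values; in the workflow these come from the channel tuple.
id="sample_1"
key="multiqc"

# Single quotes keep "$id" and "${key}" literal in the template.
template='$id.${key}.state.yaml'

# Replace the bare and the braced placeholder forms, in the same order
# as the Groovy code. The quoted patterns are matched literally.
filename=$template
filename=${filename//'$id'/$id}
filename=${filename//'${id}'/$id}
filename=${filename//'$key'/$key}
filename=${filename//'${key}'/$key}

echo "$filename"
```

With only the bare-form replacements, a template written with braces would pass through unchanged, so the braced form appears to be a convenience for users who write templates in shell or Groovy style.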
@@ -1780,7 +1804,9 @@ def publishStatesByConfig(Map args) {
def yamlTemplate = params.containsKey("output_state") ? params.output_state : '$id.$key.state.yaml'
def yamlFilename = yamlTemplate
.replaceAll('\\$id', id_)
.replaceAll('\\$\\{id\\}', id_)
.replaceAll('\\$key', key_)
.replaceAll('\\$\\{key\\}', key_)
def yamlDir = java.nio.file.Paths.get(yamlFilename).getParent()
// the processed state is a list of [key, value, inputPath, outputFilename] tuples, where
@@ -1822,7 +1848,9 @@ def publishStatesByConfig(Map args) {
// instantiate the template
def filename = filenameTemplate
.replaceAll('\\$id', id_)
.replaceAll('\\$\\{id\\}', id_)
.replaceAll('\\$key', key_)
.replaceAll('\\$\\{key\\}', key_)
if (par.multiple) {
// if the parameter is multiple: true, the filename
// should contain a wildcard '*' that is replaced with
@@ -2626,30 +2654,31 @@ def workflowFactory(Map args, Map defaultWfArgs, Map meta) {
tuple
}
def chModifiedFiltered = workflowArgs.filter ?
chModified | filter{workflowArgs.filter(it)} :
chModified
def chRun = null
def chPassthrough = null
if (workflowArgs.runIf) {
def runIfBranch = chModifiedFiltered.branch{ tup ->
def runIfBranch = chModified.branch{ tup ->
run: workflowArgs.runIf(tup[0], tup[1])
passthrough: true
}
chRun = runIfBranch.run
chPassthrough = runIfBranch.passthrough
} else {
chRun = chModifiedFiltered
chRun = chModified
chPassthrough = Channel.empty()
}
def chRunFiltered = workflowArgs.filter ?
chRun | filter{workflowArgs.filter(it)} :
chRun
def chArgs = workflowArgs.fromState ?
chRun | map{
chRunFiltered | map{
def new_data = workflowArgs.fromState(it.take(2))
[it[0], new_data]
} :
chRun | map {tup -> tup.take(2)}
chRunFiltered | map {tup -> tup.take(2)}
// fill in defaults
def chArgsWithDefaults = chArgs
@@ -2720,7 +2749,7 @@ def workflowFactory(Map args, Map defaultWfArgs, Map meta) {
// | view{"chInitialOutput: ${it.take(3)}"}
// join the output [prev_id, new_id, output] with the previous state [prev_id, state, ...]
def chNewState = safeJoin(chInitialOutput, chModifiedFiltered, key_)
def chNewState = safeJoin(chInitialOutput, chRunFiltered, key_)
// input tuple format: [join_id, id, output, prev_state, ...]
// output tuple format: [join_id, id, new_state, ...]
| map{ tup ->
@@ -2779,7 +2808,30 @@ meta = [
"resources_dir": moduleDir.toRealPath().normalize(),
"config": processConfig(readJsonBlob('''{
"name" : "multiqc",
"version" : "v0.1.0",
"version" : "main",
"authors" : [
{
"name" : "Dorien Roosen",
"roles" : [
"author",
"maintainer"
],
"info" : {
"links" : {
"email" : "dorien@data-intuitive.com",
"github" : "dorien-er",
"linkedin" : "dorien-roosen"
},
"organizations" : [
{
"name" : "Data Intuitive",
"href" : "https://www.data-intuitive.com",
"role" : "Data Scientist"
}
]
}
}
],
"argument_groups" : [
{
"name" : "Input",
@@ -2855,46 +2907,50 @@ meta = [
"name" : "--include_modules",
"description" : "Use only these modules",
"example" : [
"fastqc,cutadapt"
"fastqc",
"cutadapt"
],
"required" : false,
"direction" : "input",
"multiple" : true,
"multiple_sep" : ","
"multiple_sep" : ";"
},
{
"type" : "string",
"name" : "--exclude_modules",
"description" : "Do not use these modules",
"example" : [
"fastqc,cutadapt"
"fastqc",
"cutadapt"
],
"required" : false,
"direction" : "input",
"multiple" : true,
"multiple_sep" : ","
"multiple_sep" : ";"
},
{
"type" : "string",
"name" : "--ignore_analysis",
"example" : [
"run_one/*,run_two/*"
"run_one/*",
"run_two/*"
],
"required" : false,
"direction" : "input",
"multiple" : true,
"multiple_sep" : ","
"multiple_sep" : ";"
},
{
"type" : "string",
"name" : "--ignore_samples",
"example" : [
"sample_1*,sample_3*"
"sample_1*",
"sample_3*"
],
"required" : false,
"direction" : "input",
"multiple" : true,
"multiple_sep" : ","
"multiple_sep" : ";"
},
{
"type" : "boolean_true",
@@ -3279,7 +3335,7 @@ meta = [
"id" : "docker",
"image" : "quay.io/biocontainers/multiqc:1.21--pyhdfd78af_0",
"target_registry" : "images.viash-hub.com",
"target_tag" : "v0.1.0",
"target_tag" : "main",
"namespace_separator" : "/",
"setup" : [
{
@@ -3309,22 +3365,23 @@ meta = [
"runner" : "nextflow",
"engine" : "docker|native",
"output" : "target/nextflow/multiqc",
"viash_version" : "0.9.0-RC6",
"git_commit" : "b84b29747d0635f2ac83ea63b496be9a9edb6724",
"git_remote" : "https://github.com/viash-hub/biobox"
"viash_version" : "0.9.0",
"git_commit" : "952ff0843093b538cbfd6fefdecf2e7a0bc9e70b",
"git_remote" : "https://x-access-token:ghs_EwAUAMYJ0K4VBHlAEMs4ZP2OyQYqJM0PSfEO@github.com/viash-hub/biobox",
"git_tag" : "v0.2.0-27-g952ff08"
},
"package_config" : {
"name" : "biobox",
"version" : "v0.1.0",
"version" : "main",
"description" : "A collection of bioinformatics tools for working with sequence data.\n",
"viash_version" : "0.9.0-RC6",
"viash_version" : "0.9.0",
"source" : "src",
"target" : "target",
"config_mods" : [
".requirements.commands := ['ps']\n",
".engines += { type: \\"native\\" }",
".engines[.type == 'docker'].target_registry := 'images.viash-hub.com'",
".engines[.type == 'docker'].target_tag := 'v0.1.0'"
".engines[.type == 'docker'].target_tag := 'main'"
],
"keywords" : [
"bioinformatics",
@@ -3411,26 +3468,32 @@ $( if [ ! -z ${VIASH_META_MEMORY_PIB+x} ]; then echo "${VIASH_META_MEMORY_PIB}"
#!/bin/bash
# disable flags
[[ "\\$par_ignore_symlinks" == "false" ]] && unset par_ignore_symlinks
[[ "\\$par_dirs" == "false" ]] && unset par_dirs
[[ "\\$par_full_names" == "false" ]] && unset par_full_names
[[ "\\$par_fn_as_s_name" == "false" ]] && unset par_fn_as_s_name
[[ "\\$par_profile_runtime" == "false" ]] && unset par_profile_runtime
[[ "\\$par_verbose" == "false" ]] && unset par_verbose
[[ "\\$par_quiet" == "false" ]] && unset par_quiet
[[ "\\$par_strict" == "false" ]] && unset par_strict
[[ "\\$par_development" == "false" ]] && unset par_development
[[ "\\$par_require_logs" == "false" ]] && unset par_require_logs
[[ "\\$par_no_megaqc_upload" == "false" ]] && unset par_no_megaqc_upload
[[ "\\$par_no_ansi" == "false" ]] && unset par_no_ansi
[[ "\\$par_flat" == "false" ]] && unset par_flat
[[ "\\$par_interactive" == "false" ]] && unset par_interactive
[[ "\\$par_static_plot_export" == "false" ]] && unset par_static_plot_export
[[ "\\$par_data_dir" == "false" ]] && unset par_data_dir
[[ "\\$par_no_data_dir" == "false" ]] && unset par_no_data_dir
[[ "\\$par_zip_data_dir" == "false" ]] && unset par_zip_data_dir
[[ "\\$par_pdf" == "false" ]] && unset par_pdf
unset_if_false=(
par_ignore_symlinks
par_dirs
par_full_names
par_fn_as_s_name
par_profile_runtime
par_verbose
par_quiet
par_strict
par_development
par_require_logs
par_no_megaqc_upload
par_no_ansi
par_flat
par_interactive
par_static_plot_export
par_data_dir
par_no_data_dir
par_zip_data_dir
par_pdf
)
for par in \\${unset_if_false[@]}; do
test_val="\\${!par}"
[[ "\\$test_val" == "false" ]] && unset \\$par
done
# handle inputs
out_dir=\\$(dirname "\\$par_output_report")
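The loop that replaces the long list of `unset` one-liners relies on bash indirect expansion. Stripped of the Groovy string escaping (`\\$`), the idea can be sketched as follows; the two flag variables and their values are hypothetical stand-ins for the `par_*` variables that viash exports.

```bash
#!/bin/bash
# Hypothetical flag values, as boolean_true arguments might be exported.
par_verbose="false"
par_quiet="true"

unset_if_false=(par_verbose par_quiet)

for par in "${unset_if_false[@]}"; do
  # ${!par} is indirect expansion: the value of the variable named by $par.
  test_val="${!par}"
  if [[ "$test_val" == "false" ]]; then
    unset "$par"
  fi
done

# ${var+yes} expands to "yes" only when var is still set.
echo "par_verbose set: ${par_verbose+yes}"
echo "par_quiet set: ${par_quiet+yes}"
```

Unsetting the variable, rather than leaving it as the string `false`, lets later code use forms such as `${par_verbose:+--verbose}` to emit a flag only when it was requested.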
@@ -3448,7 +3511,7 @@ IFS=";" read -ra inputs <<< \\$par_input
if [[ -n "\\$par_include_modules" ]]; then
include_modules=""
IFS="," read -ra incl_modules <<< \\$par_include_modules
IFS=";" read -ra incl_modules <<< \\$par_include_modules
for i in "\\${incl_modules[@]}"; do
include_modules+="--include \\$i "
done
@@ -3457,7 +3520,7 @@ fi
if [[ -n "\\$par_exclude_modules" ]]; then
exclude_modules=""
IFS="," read -ra excl_modules <<< \\$par_exclude_modules
IFS=";" read -ra excl_modules <<< \\$par_exclude_modules
for i in "\\${excl_modules[@]}"; do
exclude_modules+="--exclude \\$i "
done
@@ -3466,7 +3529,7 @@ fi
if [[ -n "\\$par_ignore_analysis" ]]; then
ignore=""
IFS="," read -ra ignore_analysis <<< \\$par_ignore_analysis
IFS=";" read -ra ignore_analysis <<< \\$par_ignore_analysis
for i in "\\${ignore_analysis[@]}"; do
ignore+="--ignore \\$i "
done
@@ -3475,7 +3538,7 @@ fi
if [[ -n "\\$par_ignore_samples" ]]; then
ignore_samples=""
IFS="," read -ra ign_samples <<< \\$par_ignore_samples
IFS=";" read -ra ign_samples <<< \\$par_ignore_samples
for i in "\\${ign_samples[@]}"; do
ignore_samples+="--ignore-samples \\$i "
done
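The four hunks above change the list separator from `,` to `;`, in line with the new `multiple_sep` value. As a minimal sketch of the same splitting step in plain bash (without the `\\$` escaping needed inside the Groovy string), assuming a hypothetical value for `par_include_modules`:

```bash
#!/bin/bash
# Hypothetical value, as passed for a `multiple: true` argument
# whose multiple_sep is ";".
par_include_modules="fastqc;cutadapt"

include_modules=()
if [[ -n "$par_include_modules" ]]; then
  # IFS=";" applies only to this read, so the global IFS is untouched.
  IFS=";" read -ra incl_modules <<< "$par_include_modules"
  for i in "${incl_modules[@]}"; do
    include_modules+=(--include "$i")
  done
fi

printf '%s\n' "${include_modules[*]}"
```

This sketch collects the flags in an array instead of a concatenated string. That is a design choice of the example, not of the generated script: an array keeps each flag and value as a separate word and avoids having to manage spaces between repeated flags.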
@@ -3618,7 +3681,11 @@ def vdsl3WorkflowFactory(Map args, Map meta, String rawScript) {
val = val.join(par.multiple_sep)
}
if (par.direction == "output" && par.type == "file") {
val = val.replaceAll('\\$id', id).replaceAll('\\$key', key)
val = val
.replaceAll('\\$id', id)
.replaceAll('\\$\\{id\\}', id)
.replaceAll('\\$key', key)
.replaceAll('\\$\\{key\\}', key)
}
[parName, val]
}
@@ -3749,7 +3816,8 @@ def _vdsl3ProcessFactory(Map workflowArgs, Map meta, String rawScript) {
def createParentStr = meta.config.allArguments
.findAll { it.type == "file" && it.direction == "output" && it.create_parent }
.collect { par ->
"\${ args.containsKey(\"${par.plainName}\") ? \"mkdir_parent \\\"\" + (args[\"${par.plainName}\"] instanceof String ? args[\"${par.plainName}\"] : args[\"${par.plainName}\"].join('\" \"')) + \"\\\"\" : \"\" }"
def contents = "args[\"${par.plainName}\"] instanceof List ? args[\"${par.plainName}\"].join('\" \"') : args[\"${par.plainName}\"]"
"\${ args.containsKey(\"${par.plainName}\") ? \"mkdir_parent '\" + escapeText(${contents}) + \"'\" : \"\" }"
}
.join("\n")
@@ -3757,8 +3825,8 @@ def _vdsl3ProcessFactory(Map workflowArgs, Map meta, String rawScript) {
def inputFileExports = meta.config.allArguments
.findAll { it.type == "file" && it.direction.toLowerCase() == "input" }
.collect { par ->
def viash_par_contents = "(viash_par_${par.plainName} instanceof List ? viash_par_${par.plainName}.join(\"${par.multiple_sep}\") : viash_par_${par.plainName})"
"\n\${viash_par_${par.plainName}.empty ? \"\" : \"export VIASH_PAR_${par.plainName.toUpperCase()}=\\\"\" + ${viash_par_contents} + \"\\\"\"}"
def contents = "viash_par_${par.plainName} instanceof List ? viash_par_${par.plainName}.join(\"${par.multiple_sep}\") : viash_par_${par.plainName}"
"\n\${viash_par_${par.plainName}.empty ? \"\" : \"export VIASH_PAR_${par.plainName.toUpperCase()}='\" + escapeText(${contents}) + \"'\"}"
}
// NOTE: if using docker, use /tmp instead of tmpDir!
@@ -3795,6 +3863,7 @@ def _vdsl3ProcessFactory(Map workflowArgs, Map meta, String rawScript) {
def procStr =
"""nextflow.enable.dsl=2
|
|def escapeText = { s -> s.toString().replaceAll("'", "'\\\"'\\\"'") }
|process $procKey {$drctvStrs
|input:
| tuple val(id)$inputPaths, val(args), path(resourcesDir, stageAs: ".viash_meta_resources")
@@ -3806,10 +3875,9 @@ def _vdsl3ProcessFactory(Map workflowArgs, Map meta, String rawScript) {
|$stub
|\"\"\"
|script:$assertStr
|def escapeText = { s -> s.toString().replaceAll('([`"])', '\\\\\\\\\$1') }
|def parInject = args
| .findAll{key, value -> value != null}
| .collect{key, value -> "export VIASH_PAR_\${key.toUpperCase()}=\\\"\${escapeText(value)}\\\""}
| .collect{key, value -> "export VIASH_PAR_\${key.toUpperCase()}='\${escapeText(value)}'"}
| .join("\\n")
|\"\"\"
|# meta exports
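The hunks above switch the generated `export` lines from double quotes to single quotes and change `escapeText` so that it only rewrites `'` into `'"'"'`. The sketch below shows why that is sufficient in bash. `escape_single_quotes` and `VIASH_PAR_EXAMPLE` are hypothetical names used for illustration and are not part of viash.

```bash
#!/bin/bash
# Hypothetical helper mirroring the idea of the new escapeText:
# replace every ' with '"'"' so the result can sit inside '...'.
escape_single_quotes() {
  local s=$1
  local q="'"
  local repl="'\"'\"'"
  printf '%s' "${s//$q/$repl}"
}

# A value containing every character that is awkward inside double quotes.
value='it'"'"'s "quoted" and has $dollar and `backticks`'
line="export VIASH_PAR_EXAMPLE='$(escape_single_quotes "$value")'"

# Evaluating the generated line should give back the original value.
eval "$line"
if [[ "$VIASH_PAR_EXAMPLE" == "$value" ]]; then
  echo "round-trip ok"
fi
```

Inside single quotes the shell performs no expansion at all, so the only character that needs special handling is the single quote itself. With double quotes, `$`, backticks, `"` and `\` all need escaping, which is easier to get wrong.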
@@ -3894,7 +3962,7 @@ meta["defaults"] = [
"container" : {
"registry" : "images.viash-hub.com",
"image" : "vsh/biobox/multiqc",
"tag" : "v0.1.0"
"tag" : "main"
},
"tag" : "$id"
}'''),

@@ -2,8 +2,9 @@ manifest {
name = 'multiqc'
mainScript = 'main.nf'
nextflowVersion = '!>=20.12.1-edge'
version = 'v0.1.0'
version = 'main'
description = 'MultiQC aggregates results from bioinformatics analyses across many samples into a single report.\nIt searches a given directory for analysis logs and compiles a HTML report. It\'s a general use tool, perfect for summarising the output from numerous bioinformatics tools.\n'
author = 'Dorien Roosen'
}
process.container = 'nextflow/bash:latest'

@@ -17,8 +17,8 @@
"input": {
"type":
"string",
"description": "Type: List of `file`, required, example: `data/results/`, multiple_sep: `\":\"`. File paths to be searched for analysis results to be included in the report",
"help_text": "Type: List of `file`, required, example: `data/results/`, multiple_sep: `\":\"`. File paths to be searched for analysis results to be included in the report.\n"
"description": "Type: List of `file`, required, example: `data/results`, multiple_sep: `\";\"`. File paths to be searched for analysis results to be included in the report",
"help_text": "Type: List of `file`, required, example: `data/results`, multiple_sep: `\";\"`. File paths to be searched for analysis results to be included in the report.\n"
}
@@ -40,7 +40,7 @@
"description": "Type: `file`, default: `$id.$key.output_report.html`, example: `multiqc_report.html`. Filepath of the generated report",
"help_text": "Type: `file`, default: `$id.$key.output_report.html`, example: `multiqc_report.html`. Filepath of the generated report.\n"
,
"default": "$id.$key.output_report.html"
"default":"$id.$key.output_report.html"
}
@@ -51,7 +51,7 @@
"description": "Type: `file`, default: `$id.$key.output_data.output_data`, example: `multiqc_data`. Output directory for parsed data files",
"help_text": "Type: `file`, default: `$id.$key.output_data.output_data`, example: `multiqc_data`. Output directory for parsed data files. If not provided, parsed data will not be published.\n"
,
"default": "$id.$key.output_data.output_data"
"default":"$id.$key.output_data.output_data"
}
@@ -62,7 +62,7 @@
"description": "Type: `file`, default: `$id.$key.output_plots.output_plots`, example: `multiqc_plots`. Output directory for generated plots",
"help_text": "Type: `file`, default: `$id.$key.output_plots.output_plots`, example: `multiqc_plots`. Output directory for generated plots. If not provided, plots will not be published.\n"
,
"default": "$id.$key.output_plots.output_plots"
"default":"$id.$key.output_plots.output_plots"
}
@@ -80,8 +80,8 @@
"include_modules": {
"type":
"string",
"description": "Type: List of `string`, example: `fastqc,cutadapt`, multiple_sep: `\",\"`. Use only these modules",
"help_text": "Type: List of `string`, example: `fastqc,cutadapt`, multiple_sep: `\",\"`. Use only these modules"
"description": "Type: List of `string`, example: `fastqc;cutadapt`, multiple_sep: `\";\"`. Use only these modules",
"help_text": "Type: List of `string`, example: `fastqc;cutadapt`, multiple_sep: `\";\"`. Use only these modules"
}
@@ -90,8 +90,8 @@
"exclude_modules": {
"type":
"string",
"description": "Type: List of `string`, example: `fastqc,cutadapt`, multiple_sep: `\",\"`. Do not use these modules",
"help_text": "Type: List of `string`, example: `fastqc,cutadapt`, multiple_sep: `\",\"`. Do not use these modules"
"description": "Type: List of `string`, example: `fastqc;cutadapt`, multiple_sep: `\";\"`. Do not use these modules",
"help_text": "Type: List of `string`, example: `fastqc;cutadapt`, multiple_sep: `\";\"`. Do not use these modules"
}
@@ -100,8 +100,8 @@
"ignore_analysis": {
"type":
"string",
"description": "Type: List of `string`, example: `run_one/*,run_two/*`, multiple_sep: `\",\"`. ",
"help_text": "Type: List of `string`, example: `run_one/*,run_two/*`, multiple_sep: `\",\"`. "
"description": "Type: List of `string`, example: `run_one/*;run_two/*`, multiple_sep: `\";\"`. ",
"help_text": "Type: List of `string`, example: `run_one/*;run_two/*`, multiple_sep: `\";\"`. "
}
@@ -110,8 +110,8 @@
"ignore_samples": {
"type":
"string",
"description": "Type: List of `string`, example: `sample_1*,sample_3*`, multiple_sep: `\",\"`. ",
"help_text": "Type: List of `string`, example: `sample_1*,sample_3*`, multiple_sep: `\",\"`. "
"description": "Type: List of `string`, example: `sample_1*;sample_3*`, multiple_sep: `\";\"`. ",
"help_text": "Type: List of `string`, example: `sample_1*;sample_3*`, multiple_sep: `\";\"`. "
}
@@ -123,7 +123,7 @@
"description": "Type: `boolean_true`, default: `false`. Ignore symlinked directories and files",
"help_text": "Type: `boolean_true`, default: `false`. Ignore symlinked directories and files"
,
"default": "False"
"default":false
}
@@ -144,7 +144,7 @@
"description": "Type: `boolean_true`, default: `false`. Prepend directory to sample names to avoid clashing filenames",
"help_text": "Type: `boolean_true`, default: `false`. Prepend directory to sample names to avoid clashing filenames"
,
"default": "False"
"default":false
}
@@ -165,7 +165,7 @@
"description": "Type: `boolean_true`, default: `false`. Do not clean the sample names (leave as full file name)",
"help_text": "Type: `boolean_true`, default: `false`. Do not clean the sample names (leave as full file name)"
,
"default": "False"
"default":false
}
@@ -176,7 +176,7 @@
"description": "Type: `boolean_true`, default: `false`. Use the log filename as the sample name",
"help_text": "Type: `boolean_true`, default: `false`. Use the log filename as the sample name"
,
"default": "False"
"default":false
}
@@ -269,7 +269,7 @@
"description": "Type: `boolean_true`, default: `false`. Add analysis of how long MultiQC takes to run to the report\n",
"help_text": "Type: `boolean_true`, default: `false`. Add analysis of how long MultiQC takes to run to the report\n"
,
"default": "False"
"default":false
}
@@ -290,7 +290,7 @@
"description": "Type: `boolean_true`, default: `false`. Increase output verbosity",
"help_text": "Type: `boolean_true`, default: `false`. Increase output verbosity.\n"
,
"default": "False"
"default":false
}
@@ -301,7 +301,7 @@
"description": "Type: `boolean_true`, default: `false`. Only show log warnings\n",
"help_text": "Type: `boolean_true`, default: `false`. Only show log warnings\n"
,
"default": "False"
"default":false
}
@@ -312,7 +312,7 @@
"description": "Type: `boolean_true`, default: `false`. Don\u0027t catch exceptions, run additional code checks to help development",
"help_text": "Type: `boolean_true`, default: `false`. Don\u0027t catch exceptions, run additional code checks to help development.\n"
,
"default": "False"
"default":false
}
@@ -323,7 +323,7 @@
"description": "Type: `boolean_true`, default: `false`. Development mode",
"help_text": "Type: `boolean_true`, default: `false`. Development mode. Do not compress and minimise JS, export uncompressed plot data.\n"
,
"default": "False"
"default":false
}
@@ -334,7 +334,7 @@
"description": "Type: `boolean_true`, default: `false`. Require all explicitly requested modules to have log files",
"help_text": "Type: `boolean_true`, default: `false`. Require all explicitly requested modules to have log files. If not, MultiQC will exit with an error.\n"
,
"default": "False"
"default":false
}
@@ -345,7 +345,7 @@
"description": "Type: `boolean_true`, default: `false`. Don\u0027t upload generated report to MegaQC, even if MegaQC options are found",
"help_text": "Type: `boolean_true`, default: `false`. Don\u0027t upload generated report to MegaQC, even if MegaQC options are found.\n"
,
"default": "False"
"default":false
}
@@ -356,7 +356,7 @@
"description": "Type: `boolean_true`, default: `false`. Disable coloured log output",
"help_text": "Type: `boolean_true`, default: `false`. Disable coloured log output.\n"
,
"default": "False"
"default":false
}
@@ -387,7 +387,7 @@
"description": "Type: `boolean_true`, default: `false`. Use only flat plots (static images)",
"help_text": "Type: `boolean_true`, default: `false`. Use only flat plots (static images).\n"
,
"default": "False"
"default":false
}
@@ -398,7 +398,7 @@
"description": "Type: `boolean_true`, default: `false`. Use only interactive plots (in-browser Javascript)",
"help_text": "Type: `boolean_true`, default: `false`. Use only interactive plots (in-browser Javascript).\n"
,
"default": "False"
"default":false
}
@@ -409,7 +409,7 @@
"description": "Type: `boolean_true`, default: `false`. Force the parsed data directory to be created",
"help_text": "Type: `boolean_true`, default: `false`. Force the parsed data directory to be created.\n"
,
"default": "False"
"default":false
}
@@ -420,7 +420,7 @@
"description": "Type: `boolean_true`, default: `false`. Prevent the parsed data directory from being created",
"help_text": "Type: `boolean_true`, default: `false`. Prevent the parsed data directory from being created.\n"
,
"default": "False"
"default":false
}
@@ -431,7 +431,7 @@
"description": "Type: `boolean_true`, default: `false`. Compress the data directory",
"help_text": "Type: `boolean_true`, default: `false`. Compress the data directory.\n"
,
"default": "False"
"default":false
}
@@ -454,7 +454,7 @@
"description": "Type: `boolean_true`, default: `false`. Creates PDF report with the \u0027simple\u0027 template",
"help_text": "Type: `boolean_true`, default: `false`. Creates PDF report with the \u0027simple\u0027 template. Requires Pandoc to be installed.\n"
,
"default": "False"
"default":false
}

@@ -1,6 +1,20 @@
name: "samtools_stats"
namespace: "samtools"
version: "v0.1.0"
version: "main"
authors:
- name: "Emma Rousseau"
roles:
- "author"
- "maintainer"
info:
links:
email: "emma@data-intuitive.com"
github: "emmarousseau"
linkedin: "emmarousseau1"
organizations:
- name: "Data Intuitive"
href: "https://www.data-intuitive.com"
role: "Bioinformatician"
argument_groups:
- name: "Inputs"
arguments:
@@ -38,12 +52,16 @@ argument_groups:
name: "--coverage"
alternatives:
- "-c"
description: "Coverage distribution min,max,step [1,1000,1].\n"
description: "Coverage distribution min;max;step. Default: [1, 1000, 1].\n"
info: null
example:
- 1
- 1000
- 1
required: false
direction: "input"
multiple: true
multiple_sep: ","
multiple_sep: ";"
- type: "boolean_true"
name: "--remove_dups"
alternatives:
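With `multiple_sep` now set to `;`, the component receives `--coverage` as `min;max;step`, while `samtools stats` itself expects `-c MIN,MAX,STEP`. How the component script bridges the two is not shown in this diff. The following is only a sketch of one way to do it, with a hypothetical input value:

```bash
#!/bin/bash
# Hypothetical value as it would arrive with multiple_sep ";".
par_coverage="1;1000;1"

# Replace every ";" with "," to obtain the MIN,MAX,STEP form.
coverage_arg="${par_coverage//;/,}"

# Build the option only when a value was given.
coverage_opt=()
if [[ -n "$coverage_arg" ]]; then
  coverage_opt=(--coverage "$coverage_arg")
fi

echo "${coverage_opt[*]}"
```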
@@ -62,9 +80,10 @@ argument_groups:
name: "--required_flag"
alternatives:
- "-f"
description: "Required flag, 0 for unset. See also `samtools flags`.\n"
description: "Required flag, 0 for unset. See also `samtools flags`. Default:\
\ `\"0\"`.\n"
info: null
default:
example:
- "0"
required: false
direction: "input"
@@ -74,9 +93,10 @@ argument_groups:
name: "--filtering_flag"
alternatives:
- "-F"
description: "Filtering flag, 0 for unset. See also `samtools flags`.\n"
description: "Filtering flag, 0 for unset. See also `samtools flags`. Default:\
\ `0`.\n"
info: null
default:
example:
- "0"
required: false
direction: "input"
@@ -85,9 +105,9 @@ argument_groups:
- type: "double"
name: "--GC_depth"
description: "The size of GC-depth bins (decreasing bin size increases memory\
\ requirement).\n"
\ requirement). Default: `20000`.\n"
info: null
default:
example:
- 20000.0
required: false
direction: "input"
@@ -97,9 +117,9 @@ argument_groups:
name: "--insert_size"
alternatives:
- "-i"
description: "Maximum insert size.\n"
description: "Maximum insert size. Default: `8000`.\n"
info: null
default:
example:
- 8000
required: false
direction: "input"
@@ -119,9 +139,10 @@ argument_groups:
name: "--read_length"
alternatives:
- "-l"
description: "Include in the statistics only reads with the given read length.\n"
description: "Include in the statistics only reads with the given read length.\
\ Default: `-1`.\n"
info: null
default:
example:
- -1
required: false
direction: "input"
@@ -131,9 +152,9 @@ argument_groups:
name: "--most_inserts"
alternatives:
- "-m"
description: "Report only the main part of inserts.\n"
description: "Report only the main part of inserts. Default: `0.99`.\n"
info: null
default:
example:
- 0.99
required: false
direction: "input"
@@ -154,9 +175,9 @@ argument_groups:
name: "--trim_quality"
alternatives:
- "-q"
description: "The BWA trimming parameter.\n"
description: "The BWA trimming parameter. Default: `0`.\n"
info: null
default:
example:
- 0
required: false
direction: "input"
@@ -218,9 +239,9 @@ argument_groups:
alternatives:
- "-g"
description: "Only bases with coverage above this value will be included in the\
\ target percentage computation.\n"
\ target percentage computation. Default: `0`.\n"
info: null
default:
example:
- 0
required: false
direction: "input"
@@ -253,7 +274,7 @@ argument_groups:
- "-o"
description: "Output file.\n"
info: null
default:
example:
- "out.txt"
must_exist: true
create_parent: true
@@ -362,7 +383,7 @@ engines:
id: "docker"
image: "quay.io/biocontainers/samtools:1.19.2--h50ea8bc_1"
target_registry: "images.viash-hub.com"
target_tag: "v0.1.0"
target_tag: "main"
namespace_separator: "/"
setup:
- type: "docker"
@@ -379,22 +400,23 @@ build_info:
engine: "docker|native"
output: "target/nextflow/samtools/samtools_stats"
executable: "target/nextflow/samtools/samtools_stats/main.nf"
viash_version: "0.9.0-RC6"
git_commit: "b84b29747d0635f2ac83ea63b496be9a9edb6724"
git_remote: "https://github.com/viash-hub/biobox"
viash_version: "0.9.0"
git_commit: "952ff0843093b538cbfd6fefdecf2e7a0bc9e70b"
git_remote: "https://x-access-token:ghs_EwAUAMYJ0K4VBHlAEMs4ZP2OyQYqJM0PSfEO@github.com/viash-hub/biobox"
git_tag: "v0.2.0-27-g952ff08"
package_config:
name: "biobox"
version: "v0.1.0"
version: "main"
description: "A collection of bioinformatics tools for working with sequence data.\n"
info: null
viash_version: "0.9.0-RC6"
viash_version: "0.9.0"
source: "src"
target: "target"
config_mods:
- ".requirements.commands := ['ps']\n"
- ".engines += { type: \"native\" }"
- ".engines[.type == 'docker'].target_registry := 'images.viash-hub.com'"
- ".engines[.type == 'docker'].target_tag := 'v0.1.0'"
- ".engines[.type == 'docker'].target_tag := 'main'"
keywords:
- "bioinformatics"
- "modules"

@@ -1,13 +1,16 @@
// samtools_stats v0.1.0
// samtools_stats main
//
// This wrapper script is auto-generated by viash 0.9.0-RC6 and is thus a
// derivative work thereof. This software comes with ABSOLUTELY NO WARRANTY from
// Data Intuitive.
// This wrapper script is auto-generated by viash 0.9.0 and is thus a derivative
// work thereof. This software comes with ABSOLUTELY NO WARRANTY from Data
// Intuitive.
//
// The component may contain files which fall under a different license. The
// authors of this component should specify the license in the header of such
// files, or include a separate license file detailing the licenses of all included
// files.
//
// Component authors:
// * Emma Rousseau (author, maintainer)
////////////////////////////
// VDSL3 helper functions //
@@ -760,8 +763,11 @@ def runEach(Map args) {
def fromState_ = args.fromState
def toState_ = args.toState
def filter_ = args.filter
def runIf_ = args.runIf
def id_ = args.id
assert !runIf_ || runIf_ instanceof Closure: "runEach: must pass a Closure to runIf."
workflow runEachWf {
take: input_ch
main:
@@ -783,7 +789,20 @@ def runEach(Map args) {
[new_id] + tup.drop(1)
}
: filter_ch
def data_ch = id_ch | map{tup ->
def chPassthrough = null
def chRun = null
if (runIf_) {
def idRunIfBranch = id_ch.branch{ tup ->
run: runIf_(tup[0], tup[1], comp_)
passthrough: true
}
chPassthrough = idRunIfBranch.passthrough
chRun = idRunIfBranch.run
} else {
chRun = id_ch
chPassthrough = Channel.empty()
}
def data_ch = chRun | map{tup ->
def new_data = tup[1]
if (fromState_ instanceof Map) {
new_data = fromState_.collectEntries{ key0, key1 ->
@@ -821,8 +840,11 @@ def runEach(Map args) {
[tup[0], new_state] + tup.drop(3)
}
: out_ch
def return_ch = post_ch
| concat(chPassthrough)
post_ch
return_ch
}
// mix all results
@@ -1598,8 +1620,8 @@ def findStates(Map params, Map config) {
// construct renameMap
if (args.rename_keys) {
def renameMap = args.rename_keys.collectEntries{renameString ->
def split = renameString.split(";")
assert split.size() == 2: "Argument 'rename_keys' should be of the form 'newKey:oldKey,newKey:oldKey'"
def split = renameString.split(":")
assert split.size() == 2: "Argument 'rename_keys' should be of the form 'newKey:oldKey', or 'newKey:oldKey;newKey:oldKey' in case of multiple values"
split
}
@@ -1709,7 +1731,9 @@ def publishStates(Map args) {
def yamlFilename = yamlTemplate_
.replaceAll('\\$id', id_)
.replaceAll('\\$\\{id\\}', id_)
.replaceAll('\\$key', key_)
.replaceAll('\\$\\{key\\}', key_)
// TODO: do the pathnames in state_ match up with the outputFilenames_?
@@ -1780,7 +1804,9 @@ def publishStatesByConfig(Map args) {
def yamlTemplate = params.containsKey("output_state") ? params.output_state : '$id.$key.state.yaml'
def yamlFilename = yamlTemplate
.replaceAll('\\$id', id_)
.replaceAll('\\$\\{id\\}', id_)
.replaceAll('\\$key', key_)
.replaceAll('\\$\\{key\\}', key_)
def yamlDir = java.nio.file.Paths.get(yamlFilename).getParent()
// the processed state is a list of [key, value, inputPath, outputFilename] tuples, where
@@ -1822,7 +1848,9 @@ def publishStatesByConfig(Map args) {
// instantiate the template
def filename = filenameTemplate
.replaceAll('\\$id', id_)
.replaceAll('\\$\\{id\\}', id_)
.replaceAll('\\$key', key_)
.replaceAll('\\$\\{key\\}', key_)
if (par.multiple) {
// if the parameter is multiple: true, the filename
// should contain a wildcard '*' that is replaced with
@@ -2626,30 +2654,31 @@ def workflowFactory(Map args, Map defaultWfArgs, Map meta) {
tuple
}
def chModifiedFiltered = workflowArgs.filter ?
chModified | filter{workflowArgs.filter(it)} :
chModified
def chRun = null
def chPassthrough = null
if (workflowArgs.runIf) {
def runIfBranch = chModifiedFiltered.branch{ tup ->
def runIfBranch = chModified.branch{ tup ->
run: workflowArgs.runIf(tup[0], tup[1])
passthrough: true
}
chRun = runIfBranch.run
chPassthrough = runIfBranch.passthrough
} else {
chRun = chModifiedFiltered
chRun = chModified
chPassthrough = Channel.empty()
}
def chRunFiltered = workflowArgs.filter ?
chRun | filter{workflowArgs.filter(it)} :
chRun
def chArgs = workflowArgs.fromState ?
chRun | map{
chRunFiltered | map{
def new_data = workflowArgs.fromState(it.take(2))
[it[0], new_data]
} :
chRun | map {tup -> tup.take(2)}
chRunFiltered | map {tup -> tup.take(2)}
// fill in defaults
def chArgsWithDefaults = chArgs
@@ -2720,7 +2749,7 @@ def workflowFactory(Map args, Map defaultWfArgs, Map meta) {
// | view{"chInitialOutput: ${it.take(3)}"}
// join the output [prev_id, new_id, output] with the previous state [prev_id, state, ...]
def chNewState = safeJoin(chInitialOutput, chModifiedFiltered, key_)
def chNewState = safeJoin(chInitialOutput, chRunFiltered, key_)
// input tuple format: [join_id, id, output, prev_state, ...]
// output tuple format: [join_id, id, new_state, ...]
| map{ tup ->
@@ -2780,7 +2809,30 @@ meta = [
"config": processConfig(readJsonBlob('''{
"name" : "samtools_stats",
"namespace" : "samtools",
"version" : "v0.1.0",
"version" : "main",
"authors" : [
{
"name" : "Emma Rousseau",
"roles" : [
"author",
"maintainer"
],
"info" : {
"links" : {
"email" : "emma@data-intuitive.com",
"github" : "emmarousseau",
"linkedin" : "emmarousseau1"
},
"organizations" : [
{
"name" : "Data Intuitive",
"href" : "https://www.data-intuitive.com",
"role" : "Bioinformatician"
}
]
}
}
],
"argument_groups" : [
{
"name" : "Inputs",
@@ -2824,11 +2876,16 @@ meta = [
"alternatives" : [
"-c"
],
"description" : "Coverage distribution min,max,step [1,1000,1].\n",
"description" : "Coverage distribution min;max;step. Default: [1, 1000, 1].\n",
"example" : [
1,
1000,
1
],
"required" : false,
"direction" : "input",
"multiple" : true,
"multiple_sep" : ","
"multiple_sep" : ";"
},
{
"type" : "boolean_true",
@@ -2854,8 +2911,8 @@ meta = [
"alternatives" : [
"-f"
],
"description" : "Required flag, 0 for unset. See also `samtools flags`.\n",
"default" : [
"description" : "Required flag, 0 for unset. See also `samtools flags`. Default: `\\"0\\"`.\n",
"example" : [
"0"
],
"required" : false,
@@ -2869,8 +2926,8 @@ meta = [
"alternatives" : [
"-F"
],
"description" : "Filtering flag, 0 for unset. See also `samtools flags`.\n",
"default" : [
"description" : "Filtering flag, 0 for unset. See also `samtools flags`. Default: `0`.\n",
"example" : [
"0"
],
"required" : false,
@@ -2881,8 +2938,8 @@ meta = [
{
"type" : "double",
"name" : "--GC_depth",
"description" : "The size of GC-depth bins (decreasing bin size increases memory requirement).\n",
"default" : [
"description" : "The size of GC-depth bins (decreasing bin size increases memory requirement). Default: `20000`.\n",
"example" : [
20000.0
],
"required" : false,
@@ -2896,8 +2953,8 @@ meta = [
"alternatives" : [
"-i"
],
"description" : "Maximum insert size.\n",
"default" : [
"description" : "Maximum insert size. Default: `8000`.\n",
"example" : [
8000
],
"required" : false,
@@ -2923,8 +2980,8 @@ meta = [
"alternatives" : [
"-l"
],
"description" : "Include in the statistics only reads with the given read length.\n",
"default" : [
"description" : "Include in the statistics only reads with the given read length. Default: `-1`.\n",
"example" : [
-1
],
"required" : false,
@@ -2938,8 +2995,8 @@ meta = [
"alternatives" : [
"-m"
],
"description" : "Report only the main part of inserts.\n",
"default" : [
"description" : "Report only the main part of inserts. Default: `0.99`.\n",
"example" : [
0.99
],
"required" : false,
@@ -2965,8 +3022,8 @@ meta = [
"alternatives" : [
"-q"
],
"description" : "The BWA trimming parameter.\n",
"default" : [
"description" : "The BWA trimming parameter. Default: `0`.\n",
"example" : [
0
],
"required" : false,
@@ -3038,8 +3095,8 @@ meta = [
"alternatives" : [
"-g"
],
"description" : "Only bases with coverage above this value will be included in the target percentage computation.\n",
"default" : [
"description" : "Only bases with coverage above this value will be included in the target percentage computation. Default: `0`.\n",
"example" : [
0
],
"required" : false,
@@ -3079,7 +3136,7 @@ meta = [
"-o"
],
"description" : "Output file.\n",
"default" : [
"example" : [
"out.txt"
],
"must_exist" : true,
@@ -3216,7 +3273,7 @@ meta = [
"id" : "docker",
"image" : "quay.io/biocontainers/samtools:1.19.2--h50ea8bc_1",
"target_registry" : "images.viash-hub.com",
"target_tag" : "v0.1.0",
"target_tag" : "main",
"namespace_separator" : "/",
"setup" : [
{
@@ -3237,22 +3294,23 @@ meta = [
"runner" : "nextflow",
"engine" : "docker|native",
"output" : "target/nextflow/samtools/samtools_stats",
"viash_version" : "0.9.0-RC6",
"git_commit" : "b84b29747d0635f2ac83ea63b496be9a9edb6724",
"git_remote" : "https://github.com/viash-hub/biobox"
"viash_version" : "0.9.0",
"git_commit" : "952ff0843093b538cbfd6fefdecf2e7a0bc9e70b",
"git_remote" : "https://x-access-token:ghs_EwAUAMYJ0K4VBHlAEMs4ZP2OyQYqJM0PSfEO@github.com/viash-hub/biobox",
"git_tag" : "v0.2.0-27-g952ff08"
},
"package_config" : {
"name" : "biobox",
"version" : "v0.1.0",
"version" : "main",
"description" : "A collection of bioinformatics tools for working with sequence data.\n",
"viash_version" : "0.9.0-RC6",
"viash_version" : "0.9.0",
"source" : "src",
"target" : "target",
"config_mods" : [
".requirements.commands := ['ps']\n",
".engines += { type: \\"native\\" }",
".engines[.type == 'docker'].target_registry := 'images.viash-hub.com'",
".engines[.type == 'docker'].target_tag := 'v0.1.0'"
".engines[.type == 'docker'].target_tag := 'main'"
],
"keywords" : [
"bioinformatics",
@@ -3334,6 +3392,9 @@ set -e
[[ "\\$par_sparse" == "false" ]] && unset par_sparse
[[ "\\$par_remove_overlaps" == "false" ]] && unset par_remove_overlaps
# change the coverage input from X;X;X to X,X,X
par_coverage=\\$(echo "\\$par_coverage" | tr ';' ',')
samtools stats \\\\
\\${par_coverage:+-c "\\$par_coverage"} \\\\
\\${par_remove_dups:+-d} \\\\
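The script change above converts the `;`-separated value that Viash passes for `--coverage` back into the `min,max,step` form given to `samtools stats -c`. A small Python sketch of the same conversion, including the "omit the flag when unset" behaviour of `${par_coverage:+...}`, is shown below; the function names are hypothetical and only illustrate the logic.

```python
from typing import List, Optional


def coverage_to_samtools(par_coverage: str) -> str:
    """Convert a ';'-separated list into the comma-separated form."""
    return par_coverage.replace(";", ",")


def coverage_args(par_coverage: Optional[str]) -> List[str]:
    """Return the -c arguments, or an empty list when the parameter is unset or empty."""
    if not par_coverage:
        return []
    return ["-c", coverage_to_samtools(par_coverage)]
```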
@@ -3438,7 +3499,11 @@ def vdsl3WorkflowFactory(Map args, Map meta, String rawScript) {
val = val.join(par.multiple_sep)
}
if (par.direction == "output" && par.type == "file") {
val = val.replaceAll('\\$id', id).replaceAll('\\$key', key)
val = val
.replaceAll('\\$id', id)
.replaceAll('\\$\\{id\\}', id)
.replaceAll('\\$key', key)
.replaceAll('\\$\\{key\\}', key)
}
[parName, val]
}
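The templating hunks in this diff add the braced forms `${id}` and `${key}` next to the existing `$id` and `$key` placeholders. A minimal Python sketch of the same four-step substitution is given below; `instantiate_template` is a hypothetical name, and the sketch assumes id and key are plain strings.

```python
import re


def instantiate_template(template: str, id_: str, key: str) -> str:
    """Substitute $id, ${id}, $key and ${key} in a filename template.

    The substitutions run in the same order as the replaceAll calls in
    the diff. The replacement is passed as a function so that id_ and
    key are inserted literally (an assumption of this sketch; Java's
    replaceAll would interpret '$' and backslashes in the replacement).
    """
    out = re.sub(r"\$id", lambda _: id_, template)
    out = re.sub(r"\$\{id\}", lambda _: id_, out)
    out = re.sub(r"\$key", lambda _: key, out)
    out = re.sub(r"\$\{key\}", lambda _: key, out)
    return out
```

With this, both `$id.$key.state.yaml` and `${id}/${key}.txt` style templates resolve to concrete filenames.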
@@ -3569,7 +3634,8 @@ def _vdsl3ProcessFactory(Map workflowArgs, Map meta, String rawScript) {
def createParentStr = meta.config.allArguments
.findAll { it.type == "file" && it.direction == "output" && it.create_parent }
.collect { par ->
"\${ args.containsKey(\"${par.plainName}\") ? \"mkdir_parent \\\"\" + (args[\"${par.plainName}\"] instanceof String ? args[\"${par.plainName}\"] : args[\"${par.plainName}\"].join('\" \"')) + \"\\\"\" : \"\" }"
def contents = "args[\"${par.plainName}\"] instanceof List ? args[\"${par.plainName}\"].join('\" \"') : args[\"${par.plainName}\"]"
"\${ args.containsKey(\"${par.plainName}\") ? \"mkdir_parent '\" + escapeText(${contents}) + \"'\" : \"\" }"
}
.join("\n")
@@ -3577,8 +3643,8 @@ def _vdsl3ProcessFactory(Map workflowArgs, Map meta, String rawScript) {
def inputFileExports = meta.config.allArguments
.findAll { it.type == "file" && it.direction.toLowerCase() == "input" }
.collect { par ->
def viash_par_contents = "(viash_par_${par.plainName} instanceof List ? viash_par_${par.plainName}.join(\"${par.multiple_sep}\") : viash_par_${par.plainName})"
"\n\${viash_par_${par.plainName}.empty ? \"\" : \"export VIASH_PAR_${par.plainName.toUpperCase()}=\\\"\" + ${viash_par_contents} + \"\\\"\"}"
def contents = "viash_par_${par.plainName} instanceof List ? viash_par_${par.plainName}.join(\"${par.multiple_sep}\") : viash_par_${par.plainName}"
"\n\${viash_par_${par.plainName}.empty ? \"\" : \"export VIASH_PAR_${par.plainName.toUpperCase()}='\" + escapeText(${contents}) + \"'\"}"
}
// NOTE: if using docker, use /tmp instead of tmpDir!
@@ -3615,6 +3681,7 @@ def _vdsl3ProcessFactory(Map workflowArgs, Map meta, String rawScript) {
def procStr =
"""nextflow.enable.dsl=2
|
|def escapeText = { s -> s.toString().replaceAll("'", "'\\\"'\\\"'") }
|process $procKey {$drctvStrs
|input:
| tuple val(id)$inputPaths, val(args), path(resourcesDir, stageAs: ".viash_meta_resources")
@@ -3626,10 +3693,9 @@ def _vdsl3ProcessFactory(Map workflowArgs, Map meta, String rawScript) {
|$stub
|\"\"\"
|script:$assertStr
|def escapeText = { s -> s.toString().replaceAll('([`"])', '\\\\\\\\\$1') }
|def parInject = args
| .findAll{key, value -> value != null}
| .collect{key, value -> "export VIASH_PAR_\${key.toUpperCase()}=\\\"\${escapeText(value)}\\\""}
| .collect{key, value -> "export VIASH_PAR_\${key.toUpperCase()}='\${escapeText(value)}'"}
| .join("\\n")
|\"\"\"
|# meta exports
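The hunks above switch the generated `export` statements from double-quoted to single-quoted values and change `escapeText` to escape only single quotes, using the close-quote, quoted-quote, reopen-quote idiom (`'"'"'`). The Python sketch below illustrates that idiom and checks it with `shlex`, which applies POSIX quoting rules; the helper names are hypothetical and this is not the Viash implementation.

```python
import shlex


def escape_text(value) -> str:
    """Escape single quotes so the text can sit inside '...' in a shell script."""
    # Close the quote, emit a double-quoted single quote, reopen the quote.
    return str(value).replace("'", "'\"'\"'")


def export_line(key: str, value) -> str:
    """Build an export statement in the style of the parInject closure."""
    return f"export VIASH_PAR_{key.upper()}='{escape_text(value)}'"


# Parsing the line back with POSIX quoting rules recovers the raw value,
# including the single quote and the unexpanded '$HOME'.
parsed = shlex.split(export_line("output", "it's $HOME"))
```

Because everything inside single quotes is literal to the shell, characters such as `$`, backticks and double quotes need no escaping of their own under this scheme.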
@@ -3714,7 +3780,7 @@ meta["defaults"] = [
"container" : {
"registry" : "images.viash-hub.com",
"image" : "vsh/biobox/samtools/samtools_stats",
"tag" : "v0.1.0"
"tag" : "main"
},
"tag" : "$id"
}'''),


@@ -2,8 +2,9 @@ manifest {
name = 'samtools/samtools_stats'
mainScript = 'main.nf'
nextflowVersion = '!>=20.12.1-edge'
version = 'v0.1.0'
version = 'main'
description = 'Reports alignment summary statistics for a BAM file.'
author = 'Emma Rousseau'
}
process.container = 'nextflow/bash:latest'


@@ -47,8 +47,8 @@
"coverage": {
"type":
"string",
"description": "Type: List of `integer`, multiple_sep: `\",\"`. Coverage distribution min,max,step [1,1000,1]",
"help_text": "Type: List of `integer`, multiple_sep: `\",\"`. Coverage distribution min,max,step [1,1000,1].\n"
"description": "Type: List of `integer`, example: `1;1000;1`, multiple_sep: `\";\"`. Coverage distribution min;max;step",
"help_text": "Type: List of `integer`, example: `1;1000;1`, multiple_sep: `\";\"`. Coverage distribution min;max;step. Default: [1, 1000, 1].\n"
}
@@ -60,7 +60,7 @@
"description": "Type: `boolean_true`, default: `false`. Exclude from statistics reads marked as duplicates",
"help_text": "Type: `boolean_true`, default: `false`. Exclude from statistics reads marked as duplicates.\n"
,
"default": "False"
"default":false
}
@@ -71,7 +71,7 @@
"description": "Type: `boolean_true`, default: `false`. Use a customized index file",
"help_text": "Type: `boolean_true`, default: `false`. Use a customized index file.\n"
,
"default": "False"
"default":false
}
@@ -79,10 +79,9 @@
"required_flag": {
"type":
"string",
"description": "Type: `string`, default: `0`. Required flag, 0 for unset",
"help_text": "Type: `string`, default: `0`. Required flag, 0 for unset. See also `samtools flags`.\n"
,
"default": "0"
"description": "Type: `string`, example: `0`. Required flag, 0 for unset",
"help_text": "Type: `string`, example: `0`. Required flag, 0 for unset. See also `samtools flags`. Default: `\"0\"`.\n"
}
@@ -90,10 +89,9 @@
"filtering_flag": {
"type":
"string",
"description": "Type: `string`, default: `0`. Filtering flag, 0 for unset",
"help_text": "Type: `string`, default: `0`. Filtering flag, 0 for unset. See also `samtools flags`.\n"
,
"default": "0"
"description": "Type: `string`, example: `0`. Filtering flag, 0 for unset",
"help_text": "Type: `string`, example: `0`. Filtering flag, 0 for unset. See also `samtools flags`. Default: `0`.\n"
}
@@ -101,10 +99,9 @@
"GC_depth": {
"type":
"number",
"description": "Type: `double`, default: `20000.0`. The size of GC-depth bins (decreasing bin size increases memory requirement)",
"help_text": "Type: `double`, default: `20000.0`. The size of GC-depth bins (decreasing bin size increases memory requirement).\n"
,
"default": "20000.0"
"description": "Type: `double`, example: `20000.0`. The size of GC-depth bins (decreasing bin size increases memory requirement)",
"help_text": "Type: `double`, example: `20000.0`. The size of GC-depth bins (decreasing bin size increases memory requirement). Default: `20000`.\n"
}
@@ -112,10 +109,9 @@
"insert_size": {
"type":
"integer",
"description": "Type: `integer`, default: `8000`. Maximum insert size",
"help_text": "Type: `integer`, default: `8000`. Maximum insert size.\n"
,
"default": "8000"
"description": "Type: `integer`, example: `8000`. Maximum insert size",
"help_text": "Type: `integer`, example: `8000`. Maximum insert size. Default: `8000`.\n"
}
@@ -133,10 +129,9 @@
"read_length": {
"type":
"integer",
"description": "Type: `integer`, default: `-1`. Include in the statistics only reads with the given read length",
"help_text": "Type: `integer`, default: `-1`. Include in the statistics only reads with the given read length.\n"
,
"default": "-1"
"description": "Type: `integer`, example: `-1`. Include in the statistics only reads with the given read length",
"help_text": "Type: `integer`, example: `-1`. Include in the statistics only reads with the given read length. Default: `-1`.\n"
}
@@ -144,10 +139,9 @@
"most_inserts": {
"type":
"number",
"description": "Type: `double`, default: `0.99`. Report only the main part of inserts",
"help_text": "Type: `double`, default: `0.99`. Report only the main part of inserts.\n"
,
"default": "0.99"
"description": "Type: `double`, example: `0.99`. Report only the main part of inserts",
"help_text": "Type: `double`, example: `0.99`. Report only the main part of inserts. Default: `0.99`.\n"
}
@@ -165,10 +159,9 @@
"trim_quality": {
"type":
"integer",
"description": "Type: `integer`, default: `0`. The BWA trimming parameter",
"help_text": "Type: `integer`, default: `0`. The BWA trimming parameter.\n"
,
"default": "0"
"description": "Type: `integer`, example: `0`. The BWA trimming parameter",
"help_text": "Type: `integer`, example: `0`. The BWA trimming parameter. Default: `0`.\n"
}
@@ -209,7 +202,7 @@
"description": "Type: `boolean_true`, default: `false`. Suppress outputting IS rows where there are no insertions",
"help_text": "Type: `boolean_true`, default: `false`. Suppress outputting IS rows where there are no insertions.\n"
,
"default": "False"
"default":false
}
@@ -220,7 +213,7 @@
"description": "Type: `boolean_true`, default: `false`. Remove overlaps of paired-end reads from coverage and base count computations",
"help_text": "Type: `boolean_true`, default: `false`. Remove overlaps of paired-end reads from coverage and base count computations.\n"
,
"default": "False"
"default":false
}
@@ -228,10 +221,9 @@
"cov_threshold": {
"type":
"integer",
"description": "Type: `integer`, default: `0`. Only bases with coverage above this value will be included in the target percentage computation",
"help_text": "Type: `integer`, default: `0`. Only bases with coverage above this value will be included in the target percentage computation.\n"
,
"default": "0"
"description": "Type: `integer`, example: `0`. Only bases with coverage above this value will be included in the target percentage computation",
"help_text": "Type: `integer`, example: `0`. Only bases with coverage above this value will be included in the target percentage computation. Default: `0`.\n"
}
@@ -269,10 +261,10 @@
"output": {
"type":
"string",
"description": "Type: `file`, required, default: `$id.$key.output.txt`. Output file",
"help_text": "Type: `file`, required, default: `$id.$key.output.txt`. Output file.\n"
"description": "Type: `file`, required, default: `$id.$key.output.txt`, example: `out.txt`. Output file",
"help_text": "Type: `file`, required, default: `$id.$key.output.txt`, example: `out.txt`. Output file.\n"
,
"default": "$id.$key.output.txt"
"default":"$id.$key.output.txt"
}


@@ -2,8 +2,9 @@ manifest {
name = 'star/star_align_reads'
mainScript = 'main.nf'
nextflowVersion = '!>=20.12.1-edge'
version = 'v0.1.0'
version = 'main'
description = 'Aligns reads to a reference genome using STAR.\n'
author = 'Angela Oliveira Pisco, Robrecht Cannoodt'
}
process.container = 'nextflow/bash:latest'


@@ -1,406 +0,0 @@
name: "pear"
version: "v0.1.0"
argument_groups:
- name: "Inputs"
arguments:
- type: "file"
name: "--forward_fastq"
alternatives:
- "-f"
description: "Forward paired-end FASTQ file"
info: null
example:
- "forward.fastq"
must_exist: true
create_parent: true
required: true
direction: "input"
multiple: false
multiple_sep: ";"
- type: "file"
name: "--reverse_fastq"
alternatives:
- "-r"
description: "Reverse paired-end FASTQ file"
info: null
example:
- "reverse.fastq"
must_exist: true
create_parent: true
required: true
direction: "input"
multiple: false
multiple_sep: ";"
- name: "Outputs"
arguments:
- type: "file"
name: "--assembled"
description: "The output file containing assembled reads. Can be compressed with\
\ gzip."
info: null
must_exist: true
create_parent: true
required: true
direction: "output"
multiple: false
multiple_sep: ";"
- type: "file"
name: "--unassembled_forward"
description: "The output file containing forward reads that could not be assembled.\
\ Can be compressed with gzip."
info: null
must_exist: true
create_parent: true
required: true
direction: "output"
multiple: false
multiple_sep: ";"
- type: "file"
name: "--unassembled_reverse"
description: "The output file containing reverse reads that could not be assembled.\
\ Can be compressed with gzip."
info: null
must_exist: true
create_parent: true
required: true
direction: "output"
multiple: false
multiple_sep: ";"
- type: "file"
name: "--discarded"
description: "The output file containing reads that were discarded due to too\
\ low quality or too many uncalled bases. Can be compressed with gzip."
info: null
must_exist: true
create_parent: true
required: true
direction: "output"
multiple: false
multiple_sep: ";"
- name: "Arguments"
arguments:
- type: "double"
name: "--p_value"
alternatives:
- "-p"
description: "Specify a p-value for the statistical test. If the computed p-value\
\ of a possible assembly exceeds the specified p-value then paired-end read\
\ will not be assembled. Valid options are: 0.0001, 0.001, 0.01, 0.05 and 1.0.\
\ Setting 1.0 disables the test.\n"
info: null
example:
- 0.01
required: false
direction: "input"
multiple: false
multiple_sep: ";"
- type: "integer"
name: "--min_overlap"
alternatives:
- "-v"
description: "Specify the minimum overlap size. The minimum overlap may be set\
\ to 1 when the statistical test is used. However, further restricting the minimum\
\ overlap size to a proper value may reduce false-positive assembles.\n"
info: null
example:
- 10
required: false
direction: "input"
multiple: false
multiple_sep: ";"
- type: "integer"
name: "--max_assembly_length"
alternatives:
- "-m"
description: "Specify the maximum possible length of the assembled sequences.\
\ Setting this value to 0 disables the restriction and assembled sequences may\
\ be arbitrary long.\n"
info: null
example:
- 0
required: false
direction: "input"
multiple: false
multiple_sep: ";"
- type: "integer"
name: "--min_assembly_length"
alternatives:
- "-n"
description: "Specify the minimum possible length of the assembled sequences.\
\ Setting this value to 0 disables the restriction and assembled sequences may\
\ be arbitrary short.\n"
info: null
example:
- 0
required: false
direction: "input"
multiple: false
multiple_sep: ";"
- type: "integer"
name: "--min_trim_length"
alternatives:
- "-t"
description: "Specify the minimum length of reads after trimming the low quality\
\ part (see option -q)\n"
info: null
example:
- 1
required: false
direction: "input"
multiple: false
multiple_sep: ";"
- type: "integer"
name: "--quality_threshold"
alternatives:
- "-q"
description: "Specify the quality threshold for trimming the low quality part\
\ of a read. If the quality scores of two consecutive bases are strictly less\
\ than the specified threshold, the rest of the read will be trimmed.\n"
info: null
example:
- 0
required: false
direction: "input"
multiple: false
multiple_sep: ";"
- type: "double"
name: "--max_uncalled_base"
alternatives:
- "-u"
description: "Specify the maximal proportion of uncalled bases in a read. Setting\
\ this value to 0 will cause PEAR to discard all reads containing uncalled bases.\
\ The other extreme setting is 1 which causes PEAR to process all reads independent\
\ on the number of uncalled bases.\n"
info: null
example:
- 1.0
required: false
direction: "input"
multiple: false
multiple_sep: ";"
- type: "integer"
name: "--test_method"
alternatives:
- "-g"
description: "Specify the type of statistical test. Two options are available.\
\ 1: Given the minimum allowed overlap, test using the highest OES. Note that\
\ due to its discrete nature, this test usually yields a lower p-value for the\
\ assembled read than the cut- off (specified by -p). For example, setting the\
\ cut-off to 0.05 using this test, the assembled reads might have an actual\
\ p-value of 0.02.\n2. Use the acceptance probability (m.a.p). This test methods\
\ computes the same probability as test method 1. However, it assumes that the\
\ minimal overlap is the observed overlap with the highest OES, instead of the\
\ one specified by -v. Therefore, this is not a valid statistical test and the\
\ 'p-value' is in fact the maximal probability for accepting the assembly. Nevertheless,\
\ we observed in practice that for the case the actual overlap sizes are relatively\
\ small, test 2 can correctly assemble more reads with only slightly higher\
\ false-positive rate.\n"
info: null
example:
- 1
required: false
direction: "input"
multiple: false
multiple_sep: ";"
- type: "boolean_true"
name: "--emperical_freqs"
alternatives:
- "-e"
description: "Disable empirical base frequencies.\n"
info: null
direction: "input"
- type: "integer"
name: "--score_method"
alternatives:
- "-s"
description: "Specify the scoring method. 1. OES with +1 for match and -1 for\
\ mismatch. 2: Assembly score (AS). Use +1 for match and -1 for mismatch multiplied\
\ by base quality scores. 3: Ignore quality scores and use +1 for a match and\
\ -1 for a mismatch.\n"
info: null
example:
- 2
required: false
direction: "input"
multiple: false
multiple_sep: ";"
- type: "integer"
name: "--phred_base"
alternatives:
- "-b"
description: "Base PHRED quality score.\n"
info: null
example:
- 33
required: false
direction: "input"
multiple: false
multiple_sep: ";"
- type: "integer"
name: "--cap"
alternatives:
- "-c"
description: "Specify the upper bound for the resulting quality score. If set\
\ to zero, capping is disabled.\n"
info: null
example:
- 40
required: false
direction: "input"
multiple: false
multiple_sep: ";"
- type: "boolean_true"
name: "--nbase"
alternatives:
- "-z"
description: "When merging a base-pair that consists of two non-equal bases out\
\ of which none is degenerate, set the merged base to N and use the highest\
\ quality score of the two bases\n"
info: null
direction: "input"
resources:
- type: "bash_script"
path: "script.sh"
is_executable: true
description: "PEAR is an ultrafast, memory-efficient and highly accurate pair-end\
\ read merger. It is fully parallelized and can run with as low as just a few kilobytes\
\ of memory.\n\nPEAR evaluates all possible paired-end read overlaps and without\
\ requiring the target fragment size as input. In addition, it implements a statistical\
\ test for minimizing false-positive results. Together with a highly optimized implementation,\
\ it can merge millions of paired end reads within a couple of minutes on a standard\
\ desktop computer.\n"
test_resources:
- type: "bash_script"
path: "test.sh"
is_executable: true
- type: "file"
path: "test_data"
info: null
status: "enabled"
requirements:
commands:
- "ps"
keywords:
- "pair-end"
- "read"
- "merge"
license: "CC-BY-NC-SA-3.0"
references:
doi:
- "10.1093/bioinformatics/btt593"
links:
repository: "https://github.com/tseemann/PEAR"
homepage: "https://cme.h-its.org/exelixis/web/software/pear"
documentation: "https://cme.h-its.org/exelixis/web/software/pear/doc.html"
runners:
- type: "executable"
id: "executable"
docker_setup_strategy: "ifneedbepullelsecachedbuild"
- type: "nextflow"
id: "nextflow"
directives:
tag: "$id"
auto:
simplifyInput: true
simplifyOutput: false
transcript: false
publish: false
config:
labels:
mem1gb: "memory = 1000000000.B"
mem2gb: "memory = 2000000000.B"
mem5gb: "memory = 5000000000.B"
mem10gb: "memory = 10000000000.B"
mem20gb: "memory = 20000000000.B"
mem50gb: "memory = 50000000000.B"
mem100gb: "memory = 100000000000.B"
mem200gb: "memory = 200000000000.B"
mem500gb: "memory = 500000000000.B"
mem1tb: "memory = 1000000000000.B"
mem2tb: "memory = 2000000000000.B"
mem5tb: "memory = 5000000000000.B"
mem10tb: "memory = 10000000000000.B"
mem20tb: "memory = 20000000000000.B"
mem50tb: "memory = 50000000000000.B"
mem100tb: "memory = 100000000000000.B"
mem200tb: "memory = 200000000000000.B"
mem500tb: "memory = 500000000000000.B"
mem1gib: "memory = 1073741824.B"
mem2gib: "memory = 2147483648.B"
mem4gib: "memory = 4294967296.B"
mem8gib: "memory = 8589934592.B"
mem16gib: "memory = 17179869184.B"
mem32gib: "memory = 34359738368.B"
mem64gib: "memory = 68719476736.B"
mem128gib: "memory = 137438953472.B"
mem256gib: "memory = 274877906944.B"
mem512gib: "memory = 549755813888.B"
mem1tib: "memory = 1099511627776.B"
mem2tib: "memory = 2199023255552.B"
mem4tib: "memory = 4398046511104.B"
mem8tib: "memory = 8796093022208.B"
mem16tib: "memory = 17592186044416.B"
mem32tib: "memory = 35184372088832.B"
mem64tib: "memory = 70368744177664.B"
mem128tib: "memory = 140737488355328.B"
mem256tib: "memory = 281474976710656.B"
mem512tib: "memory = 562949953421312.B"
cpu1: "cpus = 1"
cpu2: "cpus = 2"
cpu5: "cpus = 5"
cpu10: "cpus = 10"
cpu20: "cpus = 20"
cpu50: "cpus = 50"
cpu100: "cpus = 100"
cpu200: "cpus = 200"
cpu500: "cpus = 500"
cpu1000: "cpus = 1000"
debug: false
container: "docker"
engines:
- type: "docker"
id: "docker"
image: "quay.io/biocontainers/pear:0.9.6--h9d449c0_10"
target_registry: "images.viash-hub.com"
target_tag: "v0.1.0"
namespace_separator: "/"
setup:
- type: "docker"
run:
- "version=$(pear -h | grep 'PEAR v' | sed 's/PEAR v//' | sed 's/ .*//') && \\\
\necho \"pear: $version\" > /var/software_versions.txt\n"
entrypoint: []
cmd: null
- type: "native"
id: "native"
build_info:
config: "src/pear/config.vsh.yaml"
runner: "nextflow"
engine: "docker|native"
output: "target/nextflow/pear"
executable: "target/nextflow/pear/main.nf"
viash_version: "0.9.0-RC6"
git_commit: "b84b29747d0635f2ac83ea63b496be9a9edb6724"
git_remote: "https://github.com/viash-hub/biobox"
package_config:
name: "biobox"
version: "v0.1.0"
description: "A collection of bioinformatics tools for working with sequence data.\n"
info: null
viash_version: "0.9.0-RC6"
source: "src"
target: "target"
config_mods:
- ".requirements.commands := ['ps']\n"
- ".engines += { type: \"native\" }"
- ".engines[.type == 'docker'].target_registry := 'images.viash-hub.com'"
- ".engines[.type == 'docker'].target_tag := 'v0.1.0'"
keywords:
- "bioinformatics"
- "modules"
- "sequencing"
license: "MIT"
organization: "vsh"
links:
repository: "https://github.com/viash-hub/biobox"
issue_tracker: "https://github.com/viash-hub/biobox/issues"

File diff suppressed because it is too large.


@@ -1,125 +0,0 @@
manifest {
name = 'pear'
mainScript = 'main.nf'
nextflowVersion = '!>=20.12.1-edge'
version = 'v0.1.0'
description = 'PEAR is an ultrafast, memory-efficient and highly accurate pair-end read merger. It is fully parallelized and can run with as low as just a few kilobytes of memory.\n\nPEAR evaluates all possible paired-end read overlaps and without requiring the target fragment size as input. In addition, it implements a statistical test for minimizing false-positive results. Together with a highly optimized implementation, it can merge millions of paired end reads within a couple of minutes on a standard desktop computer.\n'
}
process.container = 'nextflow/bash:latest'
// detect tempdir
tempDir = java.nio.file.Paths.get(
System.getenv('NXF_TEMP') ?:
System.getenv('VIASH_TEMP') ?:
System.getenv('TEMPDIR') ?:
System.getenv('TMPDIR') ?:
'/tmp'
).toAbsolutePath()
profiles {
no_publish {
process {
withName: '.*' {
publishDir = [
enabled: false
]
}
}
}
mount_temp {
docker.temp = tempDir
podman.temp = tempDir
charliecloud.temp = tempDir
}
docker {
docker.enabled = true
// docker.userEmulation = true
singularity.enabled = false
podman.enabled = false
shifter.enabled = false
charliecloud.enabled = false
}
singularity {
singularity.enabled = true
singularity.autoMounts = true
docker.enabled = false
podman.enabled = false
shifter.enabled = false
charliecloud.enabled = false
}
podman {
podman.enabled = true
docker.enabled = false
singularity.enabled = false
shifter.enabled = false
charliecloud.enabled = false
}
shifter {
shifter.enabled = true
docker.enabled = false
singularity.enabled = false
podman.enabled = false
charliecloud.enabled = false
}
charliecloud {
charliecloud.enabled = true
docker.enabled = false
singularity.enabled = false
podman.enabled = false
shifter.enabled = false
}
}
process{
withLabel: mem1gb { memory = 1000000000.B }
withLabel: mem2gb { memory = 2000000000.B }
withLabel: mem5gb { memory = 5000000000.B }
withLabel: mem10gb { memory = 10000000000.B }
withLabel: mem20gb { memory = 20000000000.B }
withLabel: mem50gb { memory = 50000000000.B }
withLabel: mem100gb { memory = 100000000000.B }
withLabel: mem200gb { memory = 200000000000.B }
withLabel: mem500gb { memory = 500000000000.B }
withLabel: mem1tb { memory = 1000000000000.B }
withLabel: mem2tb { memory = 2000000000000.B }
withLabel: mem5tb { memory = 5000000000000.B }
withLabel: mem10tb { memory = 10000000000000.B }
withLabel: mem20tb { memory = 20000000000000.B }
withLabel: mem50tb { memory = 50000000000000.B }
withLabel: mem100tb { memory = 100000000000000.B }
withLabel: mem200tb { memory = 200000000000000.B }
withLabel: mem500tb { memory = 500000000000000.B }
withLabel: mem1gib { memory = 1073741824.B }
withLabel: mem2gib { memory = 2147483648.B }
withLabel: mem4gib { memory = 4294967296.B }
withLabel: mem8gib { memory = 8589934592.B }
withLabel: mem16gib { memory = 17179869184.B }
withLabel: mem32gib { memory = 34359738368.B }
withLabel: mem64gib { memory = 68719476736.B }
withLabel: mem128gib { memory = 137438953472.B }
withLabel: mem256gib { memory = 274877906944.B }
withLabel: mem512gib { memory = 549755813888.B }
withLabel: mem1tib { memory = 1099511627776.B }
withLabel: mem2tib { memory = 2199023255552.B }
withLabel: mem4tib { memory = 4398046511104.B }
withLabel: mem8tib { memory = 8796093022208.B }
withLabel: mem16tib { memory = 17592186044416.B }
withLabel: mem32tib { memory = 35184372088832.B }
withLabel: mem64tib { memory = 70368744177664.B }
withLabel: mem128tib { memory = 140737488355328.B }
withLabel: mem256tib { memory = 281474976710656.B }
withLabel: mem512tib { memory = 562949953421312.B }
withLabel: cpu1 { cpus = 1 }
withLabel: cpu2 { cpus = 2 }
withLabel: cpu5 { cpus = 5 }
withLabel: cpu10 { cpus = 10 }
withLabel: cpu20 { cpus = 20 }
withLabel: cpu50 { cpus = 50 }
withLabel: cpu100 { cpus = 100 }
withLabel: cpu200 { cpus = 200 }
withLabel: cpu500 { cpus = 500 }
withLabel: cpu1000 { cpus = 1000 }
}


@@ -1,284 +0,0 @@
{
"$schema": "http://json-schema.org/draft-07/schema",
"title": "pear",
"description": "PEAR is an ultrafast, memory-efficient and highly accurate pair-end read merger. It is fully parallelized and can run with as low as just a few kilobytes of memory.\n\nPEAR evaluates all possible paired-end read overlaps and without requiring the target fragment size as input. In addition, it implements a statistical test for minimizing false-positive results. Together with a highly optimized implementation, it can merge millions of paired end reads within a couple of minutes on a standard desktop computer.\n",
"type": "object",
"definitions": {
"inputs" : {
"title": "Inputs",
"type": "object",
"description": "No description",
"properties": {
"forward_fastq": {
"type":
"string",
"description": "Type: `file`, required, example: `forward.fastq`. Forward paired-end FASTQ file",
"help_text": "Type: `file`, required, example: `forward.fastq`. Forward paired-end FASTQ file"
}
,
"reverse_fastq": {
"type":
"string",
"description": "Type: `file`, required, example: `reverse.fastq`. Reverse paired-end FASTQ file",
"help_text": "Type: `file`, required, example: `reverse.fastq`. Reverse paired-end FASTQ file"
}
}
},
"outputs" : {
"title": "Outputs",
"type": "object",
"description": "No description",
"properties": {
"assembled": {
"type":
"string",
"description": "Type: `file`, required, default: `$id.$key.assembled.assembled`. The output file containing assembled reads",
"help_text": "Type: `file`, required, default: `$id.$key.assembled.assembled`. The output file containing assembled reads. Can be compressed with gzip."
,
"default": "$id.$key.assembled.assembled"
}
,
"unassembled_forward": {
"type":
"string",
"description": "Type: `file`, required, default: `$id.$key.unassembled_forward.unassembled_forward`. The output file containing forward reads that could not be assembled",
"help_text": "Type: `file`, required, default: `$id.$key.unassembled_forward.unassembled_forward`. The output file containing forward reads that could not be assembled. Can be compressed with gzip."
,
"default": "$id.$key.unassembled_forward.unassembled_forward"
}
,
"unassembled_reverse": {
"type":
"string",
"description": "Type: `file`, required, default: `$id.$key.unassembled_reverse.unassembled_reverse`. The output file containing reverse reads that could not be assembled",
"help_text": "Type: `file`, required, default: `$id.$key.unassembled_reverse.unassembled_reverse`. The output file containing reverse reads that could not be assembled. Can be compressed with gzip."
,
"default": "$id.$key.unassembled_reverse.unassembled_reverse"
}
,
"discarded": {
"type":
"string",
"description": "Type: `file`, required, default: `$id.$key.discarded.discarded`. The output file containing reads that were discarded due to too low quality or too many uncalled bases",
"help_text": "Type: `file`, required, default: `$id.$key.discarded.discarded`. The output file containing reads that were discarded due to too low quality or too many uncalled bases. Can be compressed with gzip."
,
"default": "$id.$key.discarded.discarded"
}
}
},
"arguments" : {
"title": "Arguments",
"type": "object",
"description": "No description",
"properties": {
"p_value": {
"type":
number",
"description": "Type: `double`, example: `0.01`. Specify a p-value for the statistical test",
"help_text": "Type: `double`, example: `0.01`. Specify a p-value for the statistical test. If the computed p-value of a possible assembly exceeds the specified p-value, then the paired-end read will not be assembled. Valid options are: 0.0001, 0.001, 0.01, 0.05 and 1.0. Setting 1.0 disables the test.\n"
}
,
"min_overlap": {
"type":
integer",
"description": "Type: `integer`, example: `10`. Specify the minimum overlap size",
"help_text": "Type: `integer`, example: `10`. Specify the minimum overlap size. The minimum overlap may be set to 1 when the statistical test is used. However, further restricting the minimum overlap size to a proper value may reduce false-positive assemblies.\n"
}
,
"max_assembly_length": {
"type":
integer",
"description": "Type: `integer`, example: `0`. Specify the maximum possible length of the assembled sequences",
"help_text": "Type: `integer`, example: `0`. Specify the maximum possible length of the assembled sequences. Setting this value to 0 disables the restriction and assembled sequences may be arbitrarily long.\n"
}
,
"min_assembly_length": {
"type":
integer",
"description": "Type: `integer`, example: `0`. Specify the minimum possible length of the assembled sequences",
"help_text": "Type: `integer`, example: `0`. Specify the minimum possible length of the assembled sequences. Setting this value to 0 disables the restriction and assembled sequences may be arbitrarily short.\n"
}
,
"min_trim_length": {
"type":
"integer",
"description": "Type: `integer`, example: `1`. Specify the minimum length of reads after trimming the low quality part (see option -q)\n",
"help_text": "Type: `integer`, example: `1`. Specify the minimum length of reads after trimming the low quality part (see option -q)\n"
}
,
"quality_threshold": {
"type":
"integer",
"description": "Type: `integer`, example: `0`. Specify the quality threshold for trimming the low quality part of a read",
"help_text": "Type: `integer`, example: `0`. Specify the quality threshold for trimming the low quality part of a read. If the quality scores of two consecutive bases are strictly less than the specified threshold, the rest of the read will be trimmed.\n"
}
,
"max_uncalled_base": {
"type":
number",
"description": "Type: `double`, example: `1.0`. Specify the maximal proportion of uncalled bases in a read",
"help_text": "Type: `double`, example: `1.0`. Specify the maximal proportion of uncalled bases in a read. Setting this value to 0 will cause PEAR to discard all reads containing uncalled bases. The other extreme setting is 1, which causes PEAR to process all reads independent of the number of uncalled bases.\n"
}
,
"test_method": {
"type":
integer",
"description": "Type: `integer`, example: `1`. Specify the type of statistical test",
"help_text": "Type: `integer`, example: `1`. Specify the type of statistical test. Two options are available. 1: Given the minimum allowed overlap, test using the highest OES. Note that due to its discrete nature, this test usually yields a lower p-value for the assembled read than the cut-off (specified by -p). For example, setting the cut-off to 0.05 using this test, the assembled reads might have an actual p-value of 0.02.\n2: Use the acceptance probability (m.a.p). This test method computes the same probability as test method 1. However, it assumes that the minimal overlap is the observed overlap with the highest OES, instead of the one specified by -v. Therefore, this is not a valid statistical test and the \u0027p-value\u0027 is in fact the maximal probability for accepting the assembly. Nevertheless, we observed in practice that when the actual overlap sizes are relatively small, test 2 can correctly assemble more reads with only a slightly higher false-positive rate.\n"
}
,
"emperical_freqs": {
"type":
"boolean",
"description": "Type: `boolean_true`, default: `false`. Disable empirical base frequencies",
"help_text": "Type: `boolean_true`, default: `false`. Disable empirical base frequencies.\n"
,
"default": "False"
}
,
"score_method": {
"type":
integer",
"description": "Type: `integer`, example: `2`. Specify the scoring method",
"help_text": "Type: `integer`, example: `2`. Specify the scoring method. 1: OES with +1 for match and -1 for mismatch. 2: Assembly score (AS). Use +1 for match and -1 for mismatch multiplied by base quality scores. 3: Ignore quality scores and use +1 for a match and -1 for a mismatch.\n"
}
,
"phred_base": {
"type":
"integer",
"description": "Type: `integer`, example: `33`. Base PHRED quality score",
"help_text": "Type: `integer`, example: `33`. Base PHRED quality score.\n"
}
,
"cap": {
"type":
"integer",
"description": "Type: `integer`, example: `40`. Specify the upper bound for the resulting quality score",
"help_text": "Type: `integer`, example: `40`. Specify the upper bound for the resulting quality score. If set to zero, capping is disabled.\n"
}
,
"nbase": {
"type":
"boolean",
"description": "Type: `boolean_true`, default: `false`. When merging a base-pair that consists of two non-equal bases out of which none is degenerate, set the merged base to N and use the highest quality score of the two bases\n",
"help_text": "Type: `boolean_true`, default: `false`. When merging a base-pair that consists of two non-equal bases out of which none is degenerate, set the merged base to N and use the highest quality score of the two bases\n"
,
"default": "False"
}
}
},
"nextflow input-output arguments" : {
"title": "Nextflow input-output arguments",
"type": "object",
"description": "Input/output parameters for Nextflow itself. Please note that both publishDir and publish_dir are supported but at least one has to be configured.",
"properties": {
"publish_dir": {
"type":
"string",
"description": "Type: `string`, required, example: `output/`. Path to an output directory",
"help_text": "Type: `string`, required, example: `output/`. Path to an output directory."
}
,
"param_list": {
"type":
string",
"description": "Type: `string`, example: `my_params.yaml`. Allows inputting multiple parameter sets to initialise a Nextflow channel",
"help_text": "Type: `string`, example: `my_params.yaml`. Allows inputting multiple parameter sets to initialise a Nextflow channel. A `param_list` can either be a list of maps, a csv file, a json file, a yaml file, or simply a yaml blob.\n\n* A list of maps (as-is) where the keys of each map correspond to the arguments of the pipeline. Example: in a `nextflow.config` file: `param_list: [ [\u0027id\u0027: \u0027foo\u0027, \u0027input\u0027: \u0027foo.txt\u0027], [\u0027id\u0027: \u0027bar\u0027, \u0027input\u0027: \u0027bar.txt\u0027] ]`.\n* A csv file should have column names which correspond to the different arguments of this pipeline. Example: `--param_list data.csv` with columns `id,input`.\n* A json or a yaml file should be a list of maps, each of which has keys corresponding to the arguments of the pipeline. Example: `--param_list data.json` with contents `[ {\u0027id\u0027: \u0027foo\u0027, \u0027input\u0027: \u0027foo.txt\u0027}, {\u0027id\u0027: \u0027bar\u0027, \u0027input\u0027: \u0027bar.txt\u0027} ]`.\n* A yaml blob can also be passed directly as a string. Example: `--param_list \"[ {\u0027id\u0027: \u0027foo\u0027, \u0027input\u0027: \u0027foo.txt\u0027}, {\u0027id\u0027: \u0027bar\u0027, \u0027input\u0027: \u0027bar.txt\u0027} ]\"`.\n\nWhen passing a csv, json or yaml file, relative path names are relativized to the location of the parameter file. No relativization is performed when `param_list` is a list of maps (as-is) or a yaml blob.",
"hidden": true
}
}
}
},
"allOf": [
{
"$ref": "#/definitions/inputs"
},
{
"$ref": "#/definitions/outputs"
},
{
"$ref": "#/definitions/arguments"
},
{
"$ref": "#/definitions/nextflow input-output arguments"
}
]
}

@@ -1,171 +0,0 @@
{
"$schema": "http://json-schema.org/draft-07/schema",
"title": "star_align_reads",
"description": "Aligns reads to a reference genome using STAR.\n",
"type": "object",
"definitions": {
"inputs" : {
"title": "Inputs",
"type": "object",
"description": "No description",
"properties": {
"input": {
"type":
"string",
"description": "Type: List of `file`, required, example: `mysample_S1_L001_R1_001.fastq.gz`, multiple_sep: `\":\"`. The single-end or paired-end R1 FastQ files to be processed",
"help_text": "Type: List of `file`, required, example: `mysample_S1_L001_R1_001.fastq.gz`, multiple_sep: `\":\"`. The single-end or paired-end R1 FastQ files to be processed."
}
,
"input_r2": {
"type":
"string",
"description": "Type: List of `file`, example: `mysample_S1_L001_R2_001.fastq.gz`, multiple_sep: `\":\"`. The paired-end R2 FastQ files to be processed",
"help_text": "Type: List of `file`, example: `mysample_S1_L001_R2_001.fastq.gz`, multiple_sep: `\":\"`. The paired-end R2 FastQ files to be processed. Only required if --input is a paired-end R1 file."
}
}
},
"outputs" : {
"title": "Outputs",
"type": "object",
"description": "No description",
"properties": {
"aligned_reads": {
"type":
"string",
"description": "Type: `file`, required, default: `$id.$key.aligned_reads.bam`, example: `aligned_reads.bam`. The output file containing the aligned reads",
"help_text": "Type: `file`, required, default: `$id.$key.aligned_reads.bam`, example: `aligned_reads.bam`. The output file containing the aligned reads."
,
"default": "$id.$key.aligned_reads.bam"
}
,
"reads_per_gene": {
"type":
"string",
"description": "Type: `file`, default: `$id.$key.reads_per_gene.tsv`, example: `reads_per_gene.tsv`. The output file containing the number of reads per gene",
"help_text": "Type: `file`, default: `$id.$key.reads_per_gene.tsv`, example: `reads_per_gene.tsv`. The output file containing the number of reads per gene."
,
"default": "$id.$key.reads_per_gene.tsv"
}
,
"unmapped": {
"type":
"string",
"description": "Type: `file`, default: `$id.$key.unmapped.fastq`, example: `unmapped.fastq`. The output file containing the unmapped reads",
"help_text": "Type: `file`, default: `$id.$key.unmapped.fastq`, example: `unmapped.fastq`. The output file containing the unmapped reads."
,
"default": "$id.$key.unmapped.fastq"
}
,
"unmapped_r2": {
"type":
"string",
"description": "Type: `file`, default: `$id.$key.unmapped_r2.fastq`, example: `unmapped_r2.fastq`. The output file containing the unmapped R2 reads",
"help_text": "Type: `file`, default: `$id.$key.unmapped_r2.fastq`, example: `unmapped_r2.fastq`. The output file containing the unmapped R2 reads."
,
"default": "$id.$key.unmapped_r2.fastq"
}
,
"chimeric_junctions": {
"type":
"string",
"description": "Type: `file`, default: `$id.$key.chimeric_junctions.tsv`, example: `chimeric_junctions.tsv`. The output file containing the chimeric junctions",
"help_text": "Type: `file`, default: `$id.$key.chimeric_junctions.tsv`, example: `chimeric_junctions.tsv`. The output file containing the chimeric junctions."
,
"default": "$id.$key.chimeric_junctions.tsv"
}
,
"log": {
"type":
"string",
"description": "Type: `file`, default: `$id.$key.log.txt`, example: `log.txt`. The output file containing the log of the alignment process",
"help_text": "Type: `file`, default: `$id.$key.log.txt`, example: `log.txt`. The output file containing the log of the alignment process."
,
"default": "$id.$key.log.txt"
}
,
"splice_junctions": {
"type":
"string",
"description": "Type: `file`, default: `$id.$key.splice_junctions.tsv`, example: `splice_junctions.tsv`. The output file containing the splice junctions",
"help_text": "Type: `file`, default: `$id.$key.splice_junctions.tsv`, example: `splice_junctions.tsv`. The output file containing the splice junctions."
,
"default": "$id.$key.splice_junctions.tsv"
}
}
},
"nextflow input-output arguments" : {
"title": "Nextflow input-output arguments",
"type": "object",
"description": "Input/output parameters for Nextflow itself. Please note that both publishDir and publish_dir are supported but at least one has to be configured.",
"properties": {
"publish_dir": {
"type":
"string",
"description": "Type: `string`, required, example: `output/`. Path to an output directory",
"help_text": "Type: `string`, required, example: `output/`. Path to an output directory."
}
,
"param_list": {
"type":
string",
"description": "Type: `string`, example: `my_params.yaml`. Allows inputting multiple parameter sets to initialise a Nextflow channel",
"help_text": "Type: `string`, example: `my_params.yaml`. Allows inputting multiple parameter sets to initialise a Nextflow channel. A `param_list` can either be a list of maps, a csv file, a json file, a yaml file, or simply a yaml blob.\n\n* A list of maps (as-is) where the keys of each map correspond to the arguments of the pipeline. Example: in a `nextflow.config` file: `param_list: [ [\u0027id\u0027: \u0027foo\u0027, \u0027input\u0027: \u0027foo.txt\u0027], [\u0027id\u0027: \u0027bar\u0027, \u0027input\u0027: \u0027bar.txt\u0027] ]`.\n* A csv file should have column names which correspond to the different arguments of this pipeline. Example: `--param_list data.csv` with columns `id,input`.\n* A json or a yaml file should be a list of maps, each of which has keys corresponding to the arguments of the pipeline. Example: `--param_list data.json` with contents `[ {\u0027id\u0027: \u0027foo\u0027, \u0027input\u0027: \u0027foo.txt\u0027}, {\u0027id\u0027: \u0027bar\u0027, \u0027input\u0027: \u0027bar.txt\u0027} ]`.\n* A yaml blob can also be passed directly as a string. Example: `--param_list \"[ {\u0027id\u0027: \u0027foo\u0027, \u0027input\u0027: \u0027foo.txt\u0027}, {\u0027id\u0027: \u0027bar\u0027, \u0027input\u0027: \u0027bar.txt\u0027} ]\"`.\n\nWhen passing a csv, json or yaml file, relative path names are relativized to the location of the parameter file. No relativization is performed when `param_list` is a list of maps (as-is) or a yaml blob.",
"hidden": true
}
}
}
},
"allOf": [
{
"$ref": "#/definitions/inputs"
},
{
"$ref": "#/definitions/outputs"
},
{
"$ref": "#/definitions/nextflow input-output arguments"
}
]
}

@@ -59,37 +59,32 @@ dependencies:
repository:
type: "vsh"
repo: "vsh/biobox"
tag: "v0.1"
- name: "pear"
repository:
type: "vsh"
repo: "vsh/biobox"
tag: "v0.1"
tag: "main"
- name: "falco"
repository:
type: "vsh"
repo: "vsh/biobox"
tag: "v0.1"
tag: "main"
- name: "multiqc"
repository:
type: "vsh"
repo: "vsh/biobox"
tag: "v0.1"
tag: "main"
- name: "star/star_align_reads"
repository:
type: "vsh"
repo: "vsh/biobox"
tag: "v0.1"
tag: "main"
- name: "samtools/samtools_stats"
repository:
type: "vsh"
repo: "vsh/biobox"
tag: "v0.1"
tag: "main"
repositories:
- type: "vsh"
name: "bb"
repo: "vsh/biobox"
tag: "v0.1"
tag: "main"
license: "MIT"
links:
repository: "https://github.com/viash-hub/playground"
@@ -167,15 +162,14 @@ build_info:
output: "target/nextflow/mapping_and_qc"
executable: "target/nextflow/mapping_and_qc/main.nf"
viash_version: "0.9.0-RC6"
git_commit: "ec3c23e349a796e57e6634e140325afde4515bbf"
git_commit: "861319f5ddc22d51899493bcd30e7066c42193cb"
git_remote: "https://github.com/viash-hub/playground"
dependencies:
- "target/dependencies/vsh/vsh/biobox/v0.1/nextflow/cutadapt"
- "target/dependencies/vsh/vsh/biobox/v0.1/nextflow/pear"
- "target/dependencies/vsh/vsh/biobox/v0.1/nextflow/falco"
- "target/dependencies/vsh/vsh/biobox/v0.1/nextflow/multiqc"
- "target/dependencies/vsh/vsh/biobox/v0.1/nextflow/star/star_align_reads"
- "target/dependencies/vsh/vsh/biobox/v0.1/nextflow/samtools/samtools_stats"
- "target/dependencies/vsh/vsh/biobox/main/nextflow/cutadapt"
- "target/dependencies/vsh/vsh/biobox/main/nextflow/falco"
- "target/dependencies/vsh/vsh/biobox/main/nextflow/multiqc"
- "target/dependencies/vsh/vsh/biobox/main/nextflow/star/star_align_reads"
- "target/dependencies/vsh/vsh/biobox/main/nextflow/samtools/samtools_stats"
package_config:
name: "playground"
version: "main"

@@ -2856,15 +2856,7 @@ meta = [
"repository" : {
"type" : "vsh",
"repo" : "vsh/biobox",
"tag" : "v0.1"
}
},
{
"name" : "pear",
"repository" : {
"type" : "vsh",
"repo" : "vsh/biobox",
"tag" : "v0.1"
"tag" : "main"
}
},
{
@@ -2872,7 +2864,7 @@ meta = [
"repository" : {
"type" : "vsh",
"repo" : "vsh/biobox",
"tag" : "v0.1"
"tag" : "main"
}
},
{
@@ -2880,7 +2872,7 @@ meta = [
"repository" : {
"type" : "vsh",
"repo" : "vsh/biobox",
"tag" : "v0.1"
"tag" : "main"
}
},
{
@@ -2888,7 +2880,7 @@ meta = [
"repository" : {
"type" : "vsh",
"repo" : "vsh/biobox",
"tag" : "v0.1"
"tag" : "main"
}
},
{
@@ -2896,7 +2888,7 @@ meta = [
"repository" : {
"type" : "vsh",
"repo" : "vsh/biobox",
"tag" : "v0.1"
"tag" : "main"
}
}
],
@@ -2905,7 +2897,7 @@ meta = [
"type" : "vsh",
"name" : "bb",
"repo" : "vsh/biobox",
"tag" : "v0.1"
"tag" : "main"
}
],
"license" : "MIT",
@@ -2997,7 +2989,7 @@ meta = [
"engine" : "native|native",
"output" : "target/nextflow/mapping_and_qc",
"viash_version" : "0.9.0-RC6",
"git_commit" : "ec3c23e349a796e57e6634e140325afde4515bbf",
"git_commit" : "861319f5ddc22d51899493bcd30e7066c42193cb",
"git_remote" : "https://github.com/viash-hub/playground"
},
"package_config" : {
@@ -3033,12 +3025,11 @@ meta = [
// resolve dependencies (if any)
meta["root_dir"] = getRootDir()
include { cutadapt } from "${meta.root_dir}/dependencies/vsh/vsh/biobox/v0.1/nextflow/cutadapt/main.nf"
include { pear } from "${meta.root_dir}/dependencies/vsh/vsh/biobox/v0.1/nextflow/pear/main.nf"
include { falco } from "${meta.root_dir}/dependencies/vsh/vsh/biobox/v0.1/nextflow/falco/main.nf"
include { multiqc } from "${meta.root_dir}/dependencies/vsh/vsh/biobox/v0.1/nextflow/multiqc/main.nf"
include { star_align_reads } from "${meta.root_dir}/dependencies/vsh/vsh/biobox/v0.1/nextflow/star/star_align_reads/main.nf"
include { samtools_stats } from "${meta.root_dir}/dependencies/vsh/vsh/biobox/v0.1/nextflow/samtools/samtools_stats/main.nf"
include { cutadapt } from "${meta.root_dir}/dependencies/vsh/vsh/biobox/main/nextflow/cutadapt/main.nf"
include { falco } from "${meta.root_dir}/dependencies/vsh/vsh/biobox/main/nextflow/falco/main.nf"
include { multiqc } from "${meta.root_dir}/dependencies/vsh/vsh/biobox/main/nextflow/multiqc/main.nf"
include { star_align_reads } from "${meta.root_dir}/dependencies/vsh/vsh/biobox/main/nextflow/star/star_align_reads/main.nf"
include { samtools_stats } from "${meta.root_dir}/dependencies/vsh/vsh/biobox/main/nextflow/samtools/samtools_stats/main.nf"
// inner workflow
// user-provided Nextflow code
@@ -3061,14 +3052,18 @@ workflow run_wf {
},
toState: [
"output_falco": "outdir",
]
],
directives: [label: ["lowmem", "lowcpu"]]
)
| niceView()
| cutadapt.run(
fromState: {id, state ->
[
"input": state.input_r1,
"input_r2": state.input_r2,
"quality_cutoff": "20", // Could make this a parameter
"quality_cutoff": "30", // Could make this a parameter
"quality_cutoff_r2": "30", // Could make this a parameter
"minimum_length": "60:60", // Could make this a parameter
"adapter": "CTGTCTCTTATACACATCT", // Could make this a parameter
"adapter_r2": "CTGTCTCTTATACACATCT", // Could make this a parameter
"output": "*.fastq",
@@ -3078,29 +3073,23 @@ workflow run_wf {
def newKeys = [
"trimmed_r1": output_state["output"][0],
"trimmed_r2": output_state["output"][1],
"output_cutadapt": output_state["output"]
]
def new_state = state + newKeys
return new_state
}
)
| pear.run(
fromState: [
"forward_fastq": "trimmed_r1",
"reverse_fastq": "trimmed_r2",
],
toState: [
"output_pear": "assembled",
]
},
directives: [label: ["midmem", "midcpu"]]
)
| star_align_reads.run(
fromState: [
"input": "output_pear",
"genomeDir": "reference",
"input": "trimmed_r1",
"input_r2": "trimmed_r2",
"genome_dir": "reference",
],
toState: [
"output_star": "aligned_reads",
]
],
directives: [label: ["highmem", "midcpu"]]
)
| samtools_stats.run(
fromState: [
@@ -3108,7 +3097,9 @@ workflow run_wf {
],
toState: [
"output_samtools_stats": "output",
]
],
directives: [label: ["midmem", "lowcpu"]]
)
| toSortedList()
| map { events ->
@@ -3125,7 +3116,9 @@ workflow run_wf {
],
toState: [
"multiqc_output": "output_report",
]
],
directives: [label: ["midmem", "lowcpu"]]
)
| setState(["multiqc_output", "_meta"])

@@ -60,7 +60,7 @@
"description": "Type: `file`, required, default: `$id.$key.multiqc_output.html`, example: `multiqc.html`. ",
"help_text": "Type: `file`, required, default: `$id.$key.multiqc_output.html`, example: `multiqc.html`. "
,
"default": "$id.$key.multiqc_output.html"
"default":"$id.$key.multiqc_output.html"
}

@@ -15,6 +15,11 @@ if [ ! -f "$TEST_DATA_DIR/SRR1570800_1.fastq" ] || [ ! -f "$TEST_DATA_DIR/SRR157
docker run -t --rm -v $PWD:/output:rw -w /output/test_data ncbi/sra-tools fasterq-dump -e 2 -p SRR1570800
fi
head -n 10000 "$TEST_DATA_DIR/SRR1569895_1.fastq" > "$TEST_DATA_DIR/SRR1569895_1_subsample.fastq"
head -n 10000 "$TEST_DATA_DIR/SRR1569895_2.fastq" > "$TEST_DATA_DIR/SRR1569895_2_subsample.fastq"
head -n 10000 "$TEST_DATA_DIR/SRR1570800_1.fastq" > "$TEST_DATA_DIR/SRR1570800_1_subsample.fastq"
head -n 10000 "$TEST_DATA_DIR/SRR1570800_2.fastq" > "$TEST_DATA_DIR/SRR1570800_2_subsample.fastq"
export NXF_SCM_FILE="$TEST_DATA_DIR/scm.config"
cat > $NXF_SCM_FILE << EOF
@@ -39,8 +44,8 @@ if [ ! -d "$TEST_DATA_DIR/S288C_reference_genome_Current_Release" ]; then
--publish_dir "$TEST_DATA_DIR"
fi
zcat "$TEST_DATA_DIR/S288C_reference_genome_Current_Release/S288C_reference_sequence_R64-5-1_20240529.fsa.gz" > "$TEST_DATA_DIR/S288C_reference_genome_Current_Release/S288C_reference_sequence_R64-5-1_20240529.fsa"
zcat "$TEST_DATA_DIR/S288C_reference_genome_Current_Release/saccharomyces_cerevisiae_R64-5-1_20240529.gff.gz" > "$TEST_DATA_DIR/S288C_reference_genome_Current_Release/saccharomyces_cerevisiae_R64-5-1_20240529.gff"
gunzip -c "$TEST_DATA_DIR/S288C_reference_genome_Current_Release/S288C_reference_sequence_R64-5-1_20240529.fsa.gz" > "$TEST_DATA_DIR/S288C_reference_genome_Current_Release/S288C_reference_sequence_R64-5-1_20240529.fsa"
gunzip -c "$TEST_DATA_DIR/S288C_reference_genome_Current_Release/saccharomyces_cerevisiae_R64-5-1_20240529.gff.gz" > "$TEST_DATA_DIR/S288C_reference_genome_Current_Release/saccharomyces_cerevisiae_R64-5-1_20240529.gff"
sed -i -e 's/^.*chromosome=\(.*\)\]$/>chr\1/' "$TEST_DATA_DIR/S288C_reference_genome_Current_Release/S288C_reference_sequence_R64-5-1_20240529.fsa"
if [ ! -d "$TEST_DATA_DIR/S288C_reference_genome_Current_Release_STAR" ]; then
@@ -59,11 +64,11 @@ PARAMS_FILE=params_file.yaml
cat > $PARAMS_FILE << EOF
param_list:
- id: SRR1569895
input_r1: $TEST_DATA_DIR/SRR1569895_1.fastq
input_r2: $TEST_DATA_DIR/SRR1569895_2.fastq
input_r1: $TEST_DATA_DIR/SRR1569895_1_subsample.fastq
input_r2: $TEST_DATA_DIR/SRR1569895_2_subsample.fastq
- id: SRR1570800
input_r1: $TEST_DATA_DIR/SRR1570800_1.fastq
input_r2: $TEST_DATA_DIR/SRR1570800_2.fastq
input_r1: $TEST_DATA_DIR/SRR1570800_1_subsample.fastq
input_r2: $TEST_DATA_DIR/SRR1570800_2_subsample.fastq
publish_dir: foo
reference: $TEST_DATA_DIR/S288C_reference_genome_Current_Release_STAR
EOF
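
The `head -n 10000` calls in the `test_data.sh` hunks above keep the first 2,500 reads of each file, because a FASTQ record normally spans four lines. A minimal sketch of the same step as a reusable function is given below. `subsample_fastq` is a hypothetical helper that is not part of this repository, and it assumes unwrapped four-line records.

``` bash
# Hypothetical helper (not part of test_data.sh): keep the first N reads
# of an uncompressed FASTQ file.
subsample_fastq() {
  local input="$1" output="$2" n_reads="$3"
  # Assumption: each record is exactly 4 lines (header, sequence, '+', quality).
  head -n "$((n_reads * 4))" "$input" > "$output"
}
```

Calling `subsample_fastq "$TEST_DATA_DIR/SRR1569895_1.fastq" "$TEST_DATA_DIR/SRR1569895_1_subsample.fastq" 2500` would be equivalent to the first `head -n 10000` line. Using the same read count for the R1 and R2 files keeps the mates paired, provided both files list the reads in the same order.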