Build branch openpipeline_spatial/niche-compass with version niche-compass to openpipeline_spatial on branch niche-compass (a4d8192)

Build pipeline: openpipelines-bio.openpipeline-spatial.niche-compass-mdfnj

Source commit: a4d81924a6

Source message: update base image, add cuda logging
CI
2026-01-27 12:09:44 +00:00
parent 5304c707b7
commit 0c1b5c1ac9
376 changed files with 21682 additions and 8917 deletions

View File

@@ -2,7 +2,7 @@
repos:
- repo: https://github.com/astral-sh/ruff-pre-commit
# Ruff version.
rev: v0.12.1
rev: v0.14.0
hooks:
- id: ruff-check
args: [ --fix ]
@@ -19,6 +19,6 @@ repos:
- styler
- knitr
- repo: https://github.com/lorenzwalthert/precommit
rev: v0.4.3.9012
rev: v0.4.3.9015
hooks:
- id: lintr

View File

@@ -1,9 +1,13 @@
# openpipeline_spatial x.x.x
# openpipeline_spatial 0.2.0
## NEW FUNCTIONALITY
* `neighbors/spatial_neighborhood_graph`: Calculate the spatial neighborhood graph (PR #29).
* `convert/from_spaceranger_to_h5mu`: Added a converter component to convert Spaceranger output to H5MU files (PR #33).
* `workflows/ingestion/spaceranger_mapping`: Added a workflow to ingest Visium data using Spaceranger and convert the count matrix to an H5MU file (PR #33).
* `nichecompass/nichecompass`: Component to train a NicheCompass model and project latent space embeddings (PR #28).
* `workflows/niche/nichecompass_leiden`: Workflow to perform niche analysis using NicheCompass, including spatial neighborhood calculation, NicheCompass analysis and Leiden clustering (PR #28).
@@ -12,12 +16,18 @@
* Add `scope` to component and workflow configurations (PR #22).
* Bump version of spatialdata-io to 0.3.0 and spatialdata to 0.5.0. Pin version of pyarrow to 18.0.0 for compatibility (PR #24).
* `convert/from_xenium_to_spatialexperiment`: Add arrow with zstd codec support to handle I/O of zstd-compressed Xenium parquet files (PR #30).
* `mapping/spaceranger_count`: Allow providing individual FASTQ files instead of directories (PR #32).
* Bump anndata to 0.12.7 and mudata to 0.3.2 (PR #34).
* Bump spatialdata to 0.6.1 and spatialdata-io to 0.5.1 (PR #24, #34).
* Bump squidpy to 1.7.0 (PR #36).
* Update openpipeline dependencies to v4.0.0 (PR #37).
## BUG FIXES
* `convert/from_cosmx_to_h5mu`: Fixed an issue where parent directories of the cosmx output bundle were duplicated when reading in data (PR #25).

View File

@@ -10,7 +10,7 @@ repositories:
- name: openpipeline
repo: openpipeline
type: vsh
tag: v3.0.0
tag: v4.0.0
info:
test_resources:
- type: s3

View File

@@ -21,14 +21,34 @@ tar xvf "$DIR/Visium_FFPE_Human_Ovarian_Cancer_fastqs.tar" -C "$DIR"
# Create subsampled dataset with ImageMagick
# https://imagemagick.org/index.php
mkdir -p "$DIR/subsampled"
convert "$DIR/Visium_FFPE_Human_Ovarian_Cancer_image.jpg" -resize 2000x2000 "$DIR/subsampled/Visium_FFPE_Human_Ovarian_Cancer_image.jpg"
mkdir -p "$DIR/Visium_FFPE_Human_Ovarian_Cancer_tiny"
convert "$DIR/Visium_FFPE_Human_Ovarian_Cancer_image.jpg" -resize 2000x2000 "$DIR/Visium_FFPE_Human_Ovarian_Cancer_image_tiny.jpg"
for f in "$DIR"/Visium_FFPE_Human_Ovarian_Cancer_fastqs/*L001*R*; do
gzip -cdf "$f" | head -n 40000 | gzip -c > "$DIR/subsampled/$(basename "$f")";
gzip -cdf "$f" | head -n 40000 | gzip -c > "$DIR/Visium_FFPE_Human_Ovarian_Cancer_tiny/$(basename "$f")";
done
echo "> Downloading and subsampling of datasets complete"
# Run spaceranger
viash run src/mapping/spaceranger_count/config.vsh.yaml -- \
--input "$DIR/Visium_FFPE_Human_Ovarian_Cancer_tiny" \
--gex_reference "$REPO_ROOT/resources_test/GRCh38/" \
--probe_set "$DIR/Visium_FFPE_Human_Ovarian_Cancer_probe_set.csv" \
--image "$DIR/Visium_FFPE_Human_Ovarian_Cancer_image_tiny.jpg" \
--slide "V10L13-020" \
--area "D1" \
--create_bam "false" \
--output "Visium_FFPE_Human_Ovarian_Cancer_tiny_spaceranger"
mv
echo "> Running spaceranger complete"
rm -rf "$DIR/Visium_FFPE_Human_Ovarian_Cancer_fastqs"
rm -f "$DIR/Visium_FFPE_Human_Ovarian_Cancer_image.jpg"
aws s3 sync \
--profile di \
--exclude "*.yaml" \
"$DIR" \
s3://openpipelines-bio/openpipeline_spatial/resources_test/visium \
--delete \
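
The subsampling loop in this script keeps the first 40000 lines of each lane-1 FASTQ file, which corresponds to 10000 reads at four lines per record. A rough Python equivalent is sketched below. The function name `subsample_fastq_gz` and its parameters are hypothetical and not part of the repository, and unlike `gzip -cdf` the sketch assumes the input is always gzip-compressed.

```python
import gzip
from itertools import islice


def subsample_fastq_gz(src, dest, n_reads=10_000):
    """Copy the first `n_reads` records of a gzipped FASTQ file to `dest`.

    Returns the number of lines written.
    """
    n_lines = 4 * n_reads  # a FASTQ record spans four lines
    written = 0
    with gzip.open(src, "rt") as fin, gzip.open(dest, "wt") as fout:
        for line in islice(fin, n_lines):
            fout.write(line)
            written += 1
    return written
```

Taking the head of the file rather than a random sample keeps the result deterministic, which is usually what a test resource needs.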

View File

@@ -1,2 +1,3 @@
packages:
- anndata~=0.11.1
- anndata~=0.12.7
- awkward

View File

@@ -1,5 +1,5 @@
__merge__: [/src/base/requirements/anndata.yaml, .]
packages:
- mudata~=0.3.1
- mudata~=0.3.2
script: |
exec("try:\n import awkward\nexcept ModuleNotFoundError:\n exit(0)\nelse: exit(1)")
exec("try:\n import zarr; from importlib.metadata import version\nexcept ModuleNotFoundError:\n exit(0)\nelse: assert int(version(\"zarr\").partition(\".\")[0]) > 2")
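
The one-line `exec(...)` check on zarr is dense, so here is a more readable sketch of the same idea: skip the check when the package is absent, otherwise require a major version above 2. The helper names `major_version` and `check_min_major` are hypothetical. The sketch also looks up package metadata instead of importing the module, which is a deliberate variation on the original.

```python
from importlib.metadata import PackageNotFoundError, version


def major_version(version_string):
    """Return the leading integer of a version string such as '3.0.8'."""
    return int(version_string.partition(".")[0])


def check_min_major(package, minimum):
    """Return True if `package` is absent or its major version is >= `minimum`."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        return True  # nothing to verify when the package is not installed
    return major_version(installed) >= minimum
```

With these helpers, `check_min_major("zarr", 3)` expresses the same condition as `int(version("zarr").partition(".")[0]) > 2`.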

View File

@@ -1,3 +1,3 @@
packages:
- spatialdata-io~=0.3.0
- spatialdata-io~=0.5.1
__merge__: [ ., /src/base/requirements/spatialdata.yaml ]

View File

@@ -1,3 +1,3 @@
packages:
- spatialdata~=0.5.0
- spatialdata~=0.6.1
- pyarrow~=18.0.0

View File

@@ -1,3 +1,4 @@
__merge__: [/src/base/requirements/spatialdata.yaml, .]
packages:
- squidpy~=1.6.5
- squidpy~=1.7.0
__merge__: [/src/base/requirements/scanpy.yaml, .]

View File

@@ -35,7 +35,7 @@ logger = setup_logger()
def assert_matching_order(var_names, count_columns, split_pattern=None):
for var, col in zip(var_names, count_columns):
count_var = col if not split_pattern else col.split("_Nuclear")[0]
count_var = col if not split_pattern else col.replace(split_pattern, "")
assert var == count_var, "Orders do not match"
@@ -224,8 +224,8 @@ def main():
df.index_name = None
# var and obs names
var_names = [var.split(".")[0] for var in count_columns]
obs_names = df["Cell"].astype(str).tolist()
var_columns = list(count_columns)
obs_columns = df["Cell"].astype(str).tolist()
# Count matrix
logger.info("Creating count matrix...")
@@ -236,16 +236,21 @@ def main():
logger.info(f"Creating obs field with columns {obs_columns_fixed}")
obs_df = df[obs_columns_fixed].copy()
# Var field
var_df = pd.DataFrame(index=pd.Index(var_columns, dtype=str))
targets, batches = zip(*(c.rsplit(".", 1) for c in var_columns))
var_df["target"] = targets
var_df["batch"] = batches
# Create AnnData object
logger.info("Creating AnnData object...")
adata = ad.AnnData(
X=count_matrix_sparse,
obs=obs_df,
var=pd.DataFrame(index=var_names),
var=var_df,
)
adata.obs_names = obs_names
adata.var_names = var_names
adata.obs_names = pd.Index(obs_columns, dtype=str)
adata.var_names = pd.Index(var_columns, dtype=str)
# Spatial coordinates
coordinate_sets = {
@@ -282,13 +287,13 @@ def main():
adata.uns[par["obsm_cell_profiler"]] = cell_profiler_columns
if par["obsm_unassigned_targets"]:
logger.info(f"Adding {par['obsm_unassigned_targets']} to obsm")
adata.obsm["unassigned_targets"] = df[unassigned_columns].copy()
adata.uns["unassigned_targets"] = unassigned_columns
adata.obsm[par["obsm_unassigned_targets"]] = df[unassigned_columns].copy()
adata.uns[par["obsm_unassigned_targets"]] = unassigned_columns
# Add (optional) nuclear count layer
if par["layer_nuclear_counts"]:
assert_matching_order(
var_names, nuclear_count_columns, split_pattern="_Nuclear"
var_columns, nuclear_count_columns, split_pattern="_Nuclear"
)
logger.info(f"Adding {par['layer_nuclear_counts']} to layers")
nuclear_count_df = df[nuclear_count_columns].copy()
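
The change from `split(".")[0]` to `rsplit(".", 1)` matters when a target name itself contains a dot. The sketch below illustrates the difference using made-up column names; `split_target_and_batch` and `strip_pattern` are hypothetical helpers that mirror the expressions in the diff, not functions from the repository.

```python
def split_target_and_batch(column):
    """Split 'TARGET.BATCH' on the last dot so dots inside TARGET survive."""
    target, batch = column.rsplit(".", 1)
    return target, batch


def strip_pattern(column, pattern=None):
    """Remove `pattern` from the column name when a pattern is given."""
    return column if not pattern else column.replace(pattern, "")


# Hypothetical column names, for illustration only.
columns = ["ACTB.B01", "HLA.DRA.B02"]
targets, batches = zip(*(split_target_and_batch(c) for c in columns))

# The previous approach truncated at the first dot instead.
old_style = [c.split(".")[0] for c in columns]
```

Note that `rsplit(".", 1)` yields a single element for a name without any dot, so the tuple unpacking would raise a `ValueError` in that case.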

View File

@@ -44,7 +44,7 @@ test_resources:
engines:
- type: docker
image: rocker/r2u:22.04
image: rocker/r2u:24.04
setup:
- type: apt
packages:
@@ -56,7 +56,8 @@ engines:
test_setup:
- type: docker
env:
- RETICULATE_PYTHON=/usr/bin/python
- RETICULATE_PYTHON=/usr/bin/python
- PIP_BREAK_SYSTEM_PACKAGES=1
- type: apt
packages:
- python3
@@ -66,6 +67,7 @@ engines:
- type: r
cran: [ reticulate, testthat ]
- type: python
user: true
__merge__: /src/base/requirements/anndata_mudata.yaml
runners:

View File

@@ -0,0 +1,90 @@
name: "from_spaceranger_to_h5mu"
namespace: "convert"
scope: "public"
description: |
Converts the output bundle from spaceranger into an h5mu file.
authors:
- __merge__: /src/authors/dorien_roosen.yaml
roles: [ maintainer ]
argument_groups:
- name: Inputs
arguments:
- name: "--input"
alternatives: ["-i"]
type: file
description: |
Convert spatial data resulting from Aviti Teton sequencers that have been processed by the Element Biosciences cells2stats workflow to H5MU format.
This component processes cells2stats count matrices to create a standardized H5MU file for downstream analysis.
The component reads:
- Parquet file containing the count matrix and metadata
- Panel.json with target and batch information
And outputs an H5MU file with:
- Count data as the main .X matrix
- Spatial coordinates in obsm
- Cell Paint intensities in obsm (optional)
- Nuclear count data as a layer (optional)
- CellProfiler morphology metrics in obsm (optional)
- Unassigned targets in obsm (optional)
example: spaceranger_output
direction: input
required: true
- name: Outputs
arguments:
- name: "--output"
type: file
description: Output h5mu file.
example: output.h5mu
direction: output
- name: "--modality"
type: string
description: Name of the modality under which to store the data.
default: "rna"
- name: "--uns_metrics"
type: string
description: Name of the .uns slot under which to store QC metrics (if any).
default: "metrics_spaceranger"
- name: "--uns_probe_set"
type: string
description: Name of the .uns slot under which to store probe set information (if any).
default: "probe_set"
- name: "--obsm_coordinates"
type: string
description: Name of the .obsm slot under which to store the cell centroid coordinates.
default: "spatial"
- name: "--output_type"
type: string
description: "Which Spaceranger output to use for converting to h5mu."
choices: [ raw, filtered ]
default: filtered
- name: "--output_compression"
type: string
description: Compression to use when writing the h5mu file.
choices: [ gzip, lzf ]
resources:
- type: python_script
path: script.py
- path: /src/utils/setup_logger.py
test_resources:
- type: python_script
path: test.py
- path: /resources_test/visium/Visium_FFPE_Human_Ovarian_Cancer_tiny_spaceranger
engines:
- type: docker
image: python:3.12-slim
setup:
- type: apt
packages:
- procps
- type: python
__merge__: [/src/base/requirements/anndata_mudata.yaml, /src/base/requirements/scanpy.yaml, .]
__merge__: [ /src/base/requirements/python_test_setup.yaml, .]
runners:
- type: executable
- type: nextflow
directives:
label: [lowmem, singlecpu]
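
To make the effect of `--output_type` concrete, the sketch below shows one way the choice between `raw` and `filtered` could map onto a file inside the output bundle. The function `find_count_matrix` is hypothetical. It raises exceptions where the component's script uses assertions, and it assumes the file names `raw_feature_bc_matrix.h5` and `filtered_feature_bc_matrix.h5`.

```python
from pathlib import Path


def find_count_matrix(bundle, output_type="filtered"):
    """Locate exactly one feature-barcode matrix of the requested type."""
    if output_type not in ("raw", "filtered"):
        raise ValueError(f"Unknown output_type: {output_type!r}")
    pattern = f"**/{output_type}_feature_bc_matrix.h5"
    # '**' also matches the bundle root, so nested and flat layouts both work.
    matches = list(Path(bundle).glob(pattern))
    if len(matches) != 1:
        raise FileNotFoundError(
            f"Expected exactly one file for pattern '{pattern}', found {len(matches)}."
        )
    return matches[0]
```

Requiring exactly one match guards against pointing the component at a directory that holds several bundles.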

View File

@@ -0,0 +1,134 @@
from pathlib import Path
import mudata
import scanpy as sc
import sys
import pandas as pd
## VIASH START
par = {
"input": "resources_test/visium/Visium_FFPE_Human_Ovarian_Cancer_tiny_spaceranger",
"modality": "rna",
"uns_metrics": "metrics_spaceranger",
"uns_probe_set": "probe_set",
"obsm_coordinates": "spatial",
"output": "foo.h5mu",
"min_genes": None,
"min_counts": None,
"output_compression": "gzip",
"output_type": "filtered",
}
meta = {"resources_dir": "src/utils"}
## VIASH END
sys.path.append(meta["resources_dir"])
from setup_logger import setup_logger
logger = setup_logger()
def retrieve_input_data(spaceranger_output_bundle, input_type="filtered"):
# Expected folder structure (showing only relevant files):
# ├── spatial/
# │ └── tissue_positions.csv
# ├── filtered_feature_bc_matrix.h5 OR raw_feature_bc_matrix.h5
# ├── metrics_summary.csv
# └── probe_set.csv
matrix_pattern = (
"**/filtered_feature_bc_matrix.h5"
if input_type == "filtered"
else "**/raw_feature_bc_matrix.h5"
)
spaceranger_file_patterns = {
"count_matrix": matrix_pattern,
"metrics_summary": "**/metrics_summary.csv",
"probe_set": "**/probe_set.csv",
"spatial_coords": "**/spatial/tissue_positions.csv",
}
spaceranger_output_bundle = Path(spaceranger_output_bundle)
spaceranger_files = {}
for key, pattern in spaceranger_file_patterns.items():
file = list(spaceranger_output_bundle.glob(pattern))
assert len(file) == 1, (
f"Expected exactly one file for pattern '{pattern}', found {len(file)}."
)
spaceranger_files[key] = file[0]
return spaceranger_files
def main():
spaceranger_files = retrieve_input_data(par["input"], input_type=par["output_type"])
logger.info("Reading count matrix...")
adata = sc.read_10x_h5(spaceranger_files["count_matrix"], gex_only=False)
# set the gene ids as var_names
logger.info("Renaming var columns")
adata.var = adata.var.rename_axis("gene_symbol").reset_index().set_index("gene_ids")
if par["uns_metrics"]:
logger.info("Reading metrics summary file...")
metrics_summary = pd.read_csv(
spaceranger_files["metrics_summary"],
decimal=".",
quotechar='"',
thousands=",",
)
logger.info("Storing metrics summary in .uns slot...")
adata.uns[par["uns_metrics"]] = metrics_summary
if par["uns_probe_set"]:
logger.info("Reading probe set file...")
def read_hash_metadata(path):
meta = {}
with open(path, "r", encoding="utf-8") as f:
for i, line in enumerate(f):
if not line.startswith("#"):
break
line = line[1:].strip()
if "=" in line:
k, v = line.split("=", 1)
meta[k.strip()] = v.strip()
return meta
meta = read_hash_metadata(spaceranger_files["probe_set"])
probe_set = pd.read_csv(spaceranger_files["probe_set"], comment="#")
logger.info("Storing probe set in .uns slot...")
adata.uns[par["uns_probe_set"]] = probe_set
adata.uns[par["uns_probe_set"] + "_meta"] = meta
logger.info("Reading spatial coordinates...")
spatial_coords = pd.read_csv(
spaceranger_files["spatial_coords"], decimal=".", thousands=","
)
spatial_coords_aligned = spatial_coords.set_index("barcode").reindex(
adata.obs_names
)
logger.info("Storing spatial coordinates in .obsm slot...")
adata.obsm[par["obsm_coordinates"]] = spatial_coords_aligned[
["pxl_col_in_fullres", "pxl_row_in_fullres"]
].to_numpy()
# generate output
logger.info("Convert to mudata")
mdata = mudata.MuData({par["modality"]: adata})
# override root .obs and .uns
mdata.obs = adata.obs
mdata.uns = adata.uns
# write output
logger.info("Writing %s", par["output"])
mdata.write_h5mu(par["output"], compression=par["output_compression"])
if __name__ == "__main__":
main()
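
The `read_hash_metadata` helper in this script collects `key=value` pairs from the leading `#` lines of the probe set file. The standalone sketch below applies the same logic to an in-memory example. The file content shown is hypothetical and real probe set headers may contain different keys. For ease of testing, this version takes any iterable of lines rather than a path.

```python
import io


def read_hash_metadata(lines):
    """Collect 'key=value' pairs from leading '#' comment lines."""
    meta = {}
    for line in lines:
        if not line.startswith("#"):
            break  # the header ends at the first non-comment line
        body = line[1:].strip()
        if "=" in body:
            key, value = body.split("=", 1)
            meta[key.strip()] = value.strip()
    return meta


# Hypothetical file content, for illustration only.
example = io.StringIO(
    "#probe_set_file_format=3.0\n"
    "#panel_name=Example Panel\n"
    "gene_id,probe_seq,probe_id\n"
    "GENE1,ACGT,GENE1|probe1\n"
)
header = read_hash_metadata(example)
```

Splitting on the first `=` only means that values containing further `=` characters are kept intact.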

View File

@@ -0,0 +1,44 @@
import pytest
import sys
import mudata as mu
## VIASH START
meta = {
"executable": "./target/executable/convert/from_spaceranger_to_h5mu/from_spaceranger_to_h5mu",
"resources_dir": "resources_test/",
"config": "src/convert/from_spaceranger_to_h5mu/config.vsh.yaml",
}
## VIASH END
input = f"{meta['resources_dir']}/Visium_FFPE_Human_Ovarian_Cancer_tiny_spaceranger"
def test_simple_execution(run_component, tmp_path):
output = tmp_path / "spaceranger.h5mu"
# run component
run_component(
["--input", input, "--output", str(output), "--output_compression", "gzip"]
)
assert output.is_file(), "output file was not created"
mdata = mu.read_h5mu(output)
assert list(mdata.mod.keys()) == ["rna"], "Expected modality rna"
adata = mdata.mod["rna"]
assert list(adata.uns.keys()) == [
"metrics_spaceranger",
"probe_set",
"probe_set_meta",
]
assert list(adata.obsm.keys()) == ["spatial"]
assert list(adata.var.keys()) == ["gene_symbol", "feature_types", "genome"]
assert adata.X.dtype.kind == "f"
assert all(adata.var["feature_types"] == "Gene Expression")
assert adata.obsm["spatial"].dtype == "float"
if __name__ == "__main__":
sys.exit(pytest.main([__file__]))

View File

@@ -29,7 +29,9 @@ def test_simple_execution(run_component, tmp_path):
assert os.path.exists(output_sd_path / "points"), "points folder was not created"
assert os.path.exists(output_sd_path / "shapes"), "shapes folder was not created"
assert os.path.exists(output_sd_path / "tables"), "tables folder was not created"
assert (output_sd_path / "zmetadata").is_file(), "zmetadata file was not created"
assert (output_sd_path / "zarr.json").is_file(), (
"zarr metadata file was not created"
)
def test_compressed_input(run_component, tmp_path):
@@ -62,7 +64,9 @@ def test_compressed_input(run_component, tmp_path):
assert os.path.exists(output_sd_path / "points"), "points folder was not created"
assert os.path.exists(output_sd_path / "shapes"), "shapes folder was not created"
assert os.path.exists(output_sd_path / "tables"), "tables folder was not created"
assert (output_sd_path / "zmetadata").is_file(), "zmetadata file was not created"
assert (output_sd_path / "zarr.json").is_file(), (
"zarr metadata file was not created"
)
if __name__ == "__main__":
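
The switch from `zmetadata` to `zarr.json` in these assertions is consistent with the store now being written in the Zarr v3 layout, where metadata lives in `zarr.json`. A test that has to tolerate both layouts could use a check like the sketch below. `detect_zarr_metadata` is a hypothetical helper, and the list of older file names is an assumption rather than something taken from the repository.

```python
from pathlib import Path


def detect_zarr_metadata(store_path):
    """Return 'v3', 'v2', or None depending on which metadata file is found."""
    store = Path(store_path)
    if (store / "zarr.json").is_file():
        return "v3"
    # Assumed names for older stores: consolidated metadata or a group marker.
    legacy_names = ("zmetadata", ".zmetadata", ".zgroup")
    if any((store / name).is_file() for name in legacy_names):
        return "v2"
    return None
```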

View File

@@ -1,105 +0,0 @@
name: concatenate_h5mu
namespace: "dataflow"
scope: "public"
description: |
Concatenate observations from samples in several (uni- and/or multi-modal) MuData files into a single file.
authors:
- __merge__: /src/authors/dries_schaumont.yaml
roles: [ maintainer ]
arguments:
- name: "--input"
alternatives: ["-i"]
type: file
multiple: true
description: Paths to the different samples to be concatenated.
required: true
example: sample_paths
- name: "--modality"
type: string
multiple: true
description: "Only output concatenated objects for the provided modalities. Outputs all modalities by default."
required: false
- name: "--input_id"
type: string
multiple: true
description: |
Names of the different samples that have to be concatenated. Must be specified when using '--other_axis_mode move'.
In this case, the ids will be used for the column names of the dataframes registering the conflicts.
If specified, must be of same length as `--input`.
required: false
- name: "--output"
description: |
Output location for the concatenated MuData object file.
alternatives: ["-o"]
type: file
direction: output
example: "output.h5mu"
- name: "--obs_sample_name"
type: string
description: Name of the .obs key under which to add the sample names.
default: "sample_id"
- name: "--other_axis_mode"
type: string
choices: [same, unique, first, only, concat, move]
default: move
description: |
How to handle the merging of other axis (var, obs, ...).
- None: keep no data
- same: only keep elements of the matrices which are the same in each of the samples
- unique: only keep elements for which there is only 1 possible value (1 value that can occur in multiple samples)
- first: keep the annotation from the first sample
- only: keep elements that show up in only one of the objects (1 unique element in only 1 sample)
- move: identical to 'same', but moving the conflicting values to .varm or .obsm
- name: "--uns_merge_mode"
description: |
How to handle the merging of .uns across modalities
- None: keep no data
- same: only keep elements of the matrices which are the same in each of the samples
- unique: only keep elements for which there is only 1 possible value (1 value that can occur in multiple samples)
- first: keep the annotation from the first sample
- only: keep elements that show up in only one of the objects (1 unique element in only 1 sample)
- make_unique: identical to 'unique', but keys which are not unique are made unique by prefixing them with the sample id.
type: string
choices: ["same", "unique", "first", "only", "make_unique"]
default: make_unique
- name: "--obsp_keys"
type: string
multiple: true
description: |
List of `.obsp` keys for which block-diagonal concatenation should be performed.
If not provided, no `.obsp` keys will be concatenated.
Provided keys must be present in all samples for block concatenation to be performed.
required: false
__merge__: [., /src/base/h5_compression_argument.yaml]
resources:
- type: python_script
path: script.py
- path: /src/utils/setup_logger.py
- path: /src/utils/compress_h5mu.py
test_resources:
- type: python_script
path: test.py
- path: /resources_test/concat_test_data/e18_mouse_brain_fresh_5k_filtered_feature_bc_matrix_subset_unique_obs.h5mu
- path: /resources_test/concat_test_data/human_brain_3k_filtered_feature_bc_matrix_subset_unique_obs.h5mu
engines:
- type: docker
image: python:3.13-slim
setup:
- type: apt
packages:
- procps
- type: python
__merge__: [/src/base/requirements/anndata_mudata.yaml, .]
__merge__: [ /src/base/requirements/python_test_setup.yaml, .]
test_setup:
- type: python
packages:
- pytest-benchmark
__merge__: [ /src/base/requirements/viashpy.yaml, .]
runners:
- type: executable
- type: nextflow
directives:
label: [midcpu, highmem]

View File

@@ -1,407 +0,0 @@
from __future__ import annotations
import sys
import anndata
import mudata as mu
import pandas as pd
import numpy as np
from collections.abc import Iterable
from multiprocessing import Pool
from pathlib import Path
from h5py import File as H5File
from typing import Literal
import shutil
### VIASH START
par = {
"input": [
"resources_test/concat_test_data/e18_mouse_brain_fresh_5k_filtered_feature_bc_matrix_subset_unique_obs.h5mu",
"resources_test/concat_test_data/human_brain_3k_filtered_feature_bc_matrix_subset_unique_obs.h5mu",
],
"output": "foo.h5mu",
"input_id": ["mouse", "human"],
"obsp_keys": [],
"other_axis_mode": "move",
"output_compression": "gzip",
"uns_merge_mode": "make_unique",
}
meta = {"cpus": 10, "resources_dir": "resources_test/"}
### VIASH END
sys.path.append(meta["resources_dir"])
from compress_h5mu import compress_h5mu
from setup_logger import setup_logger
logger = setup_logger()
def nunique(row):
unique = pd.unique(row)
unique_without_na = pd.core.dtypes.missing.remove_na_arraylike(unique)
return len(unique_without_na) > 1
def any_row_contains_duplicate_values(n_processes: int, frame: pd.DataFrame) -> bool:
"""
Check if any row contains duplicate values, that are not NA.
"""
numpy_array = frame.to_numpy()
with Pool(n_processes) as pool:
is_duplicated = pool.map(nunique, iter(numpy_array))
return any(is_duplicated)
def concatenate_matrices(
n_processes: int, matrices: dict[str, pd.DataFrame], align_to: pd.Index
) -> tuple[
dict[str, pd.DataFrame], pd.DataFrame | None, dict[str, pd.core.dtypes.dtypes.Dtype]
]:
"""
Merge matrices by combining columns that have the same name.
Columns that contain conflicting values (i.e. the columns have different values),
are not merged, but instead moved to a new dataframe.
"""
column_names = set(column_name for var in matrices.values() for column_name in var)
logger.debug("Trying to concatenate columns: %s.", ",".join(column_names))
if not column_names:
return {}, pd.DataFrame(index=align_to)
conflicts, concatenated_matrix = split_conflicts_and_concatenated_columns(
n_processes, matrices, column_names, align_to
)
concatenated_matrix = cast_to_writeable_dtype(concatenated_matrix)
conflicts = {
conflict_name: cast_to_writeable_dtype(conflict_df)
for conflict_name, conflict_df in conflicts.items()
}
return conflicts, concatenated_matrix
def get_first_non_na_value_vector(df):
numpy_arr = df.to_numpy()
n_rows, n_cols = numpy_arr.shape
col_index = pd.isna(numpy_arr).argmin(axis=1)
flat_index = n_cols * np.arange(n_rows) + col_index
return pd.Series(numpy_arr.ravel()[flat_index], index=df.index, name=df.columns[0])
def make_uns_keys_unique(mod_data, concatenated_data):
"""
Check if the uns keys across samples are unique before adding them
to the final concatenated object. If a conflict occurs between the samples,
add the sample ID to make the key unique again.
"""
all_uns_keys = {}
for sample_id, mod in mod_data.items():
for uns_key, _ in mod.uns.items():
all_uns_keys.setdefault(uns_key, []).append(sample_id)
for uns_key, samples_ids in all_uns_keys.items():
assert samples_ids
if len(samples_ids) == 1:
sample_id = samples_ids[0]
concatenated_data.uns[uns_key] = mod_data[sample_id].uns[uns_key]
else:
for sample_id in samples_ids:
concatenated_data.uns[f"{sample_id}_{uns_key}"] = mod_data[
sample_id
].uns[uns_key]
return concatenated_data
def split_conflicts_and_concatenated_columns(
n_processes: int,
matrices: dict[str, pd.DataFrame],
column_names: Iterable[str],
align_to: pd.Index,
) -> tuple[dict[str, pd.DataFrame], pd.DataFrame]:
"""
Retrieve columns with the same name from a list of dataframes which are
identical across all the frames (ignoring NA values).
Columns which are not the same are regarded as 'conflicts',
which are stored in separate dataframes, one per column
with the same name that store conflicting values.
"""
conflicts = {}
concatenated_matrix = []
for column_name in column_names:
columns = {
input_id: var[column_name]
for input_id, var in matrices.items()
if column_name in var
}
assert columns, "Some columns should have been found."
concatenated_columns = pd.concat(
columns.values(), axis=1, join="outer", sort=False
)
if any_row_contains_duplicate_values(n_processes, concatenated_columns):
concatenated_columns.columns = (
columns.keys()
) # Use the sample id as column name
concatenated_columns = concatenated_columns.reindex(align_to, copy=False)
conflicts[f"conflict_{column_name}"] = concatenated_columns
else:
unique_values = get_first_non_na_value_vector(concatenated_columns)
concatenated_matrix.append(unique_values)
if not concatenated_matrix:
return conflicts, pd.DataFrame(index=align_to)
concatenated_matrix = pd.concat(
concatenated_matrix, join="outer", axis=1, sort=False
)
concatenated_matrix = concatenated_matrix.reindex(align_to, copy=False)
return conflicts, concatenated_matrix
def cast_to_writeable_dtype(result: pd.DataFrame) -> pd.DataFrame:
"""
Cast the dataframe to dtypes that can be written by mudata.
"""
# dtype inference works better with np.nan
result = result.replace({pd.NA: np.nan})
# MuData supports nullable booleans and ints
# ie. `IntegerArray` and `BooleanArray`
result = result.convert_dtypes(
infer_objects=True,
convert_integer=True,
convert_string=False,
convert_boolean=True,
convert_floating=False,
)
# Convert leftover 'object' columns to string
# However, na values are supported, so convert all values except NA's to string
object_cols = result.select_dtypes(include="object").columns.values
for obj_col in object_cols:
result[obj_col] = (
result[obj_col]
.where(result[obj_col].isna(), result[obj_col].astype(str))
.astype("category")
)
return result
def split_conflicts_modalities(
n_processes: int, samples: dict[str, anndata.AnnData], output: anndata.AnnData
) -> anndata.AnnData:
"""
Merge .var and .obs matrices of the anndata objects. Columns are merged
when the values (excl NA) are the same in each of the matrices.
Conflicting columns are moved to a separate dataframe (one dataframe for each column,
containing all the corresponding column from each sample).
"""
matrices_to_parse = ("var", "obs")
for matrix_name in matrices_to_parse:
matrices = {
sample_id: getattr(sample, matrix_name)
for sample_id, sample in samples.items()
}
output_index = getattr(output, matrix_name).index
conflicts, concatenated_matrix = concatenate_matrices(
n_processes, matrices, output_index
)
if concatenated_matrix.empty:
concatenated_matrix.index = output_index
# Even though we did not touch the varm and obsm matrices that were already present,
# the joining of observations might have caused a dtype change in these matrices as well
# so these also need to be cast to a writable dtype...
for multidim_name, multidim_data in getattr(output, f"{matrix_name}m").items():
new_data = (
cast_to_writeable_dtype(multidim_data)
if isinstance(multidim_data, pd.DataFrame)
else multidim_data
)
getattr(output, f"{matrix_name}m")[multidim_name] = new_data
# Write the conflicts to the output
for conflict_name, conflict_data in conflicts.items():
getattr(output, f"{matrix_name}m")[conflict_name] = conflict_data
# Set other annotation matrices in the output
setattr(output, matrix_name, concatenated_matrix)
return output
def concatenate_modality(
n_processes: int,
mod: str | None,
input_files: Iterable[str | Path],
other_axis_mode: str,
uns_merge_mode: str,
input_ids: tuple[str],
) -> anndata.AnnData:
concat_modes = {
"move": "unique",
}
other_axis_mode_to_apply = concat_modes.get(other_axis_mode, other_axis_mode)
uns_merge_modes = {"make_unique": None}
uns_merge_mode_to_apply = uns_merge_modes.get(uns_merge_mode, uns_merge_mode)
mod_data = {}
mod_indices_combined = pd.Index([])
for input_id, input_file in zip(input_ids, input_files):
if mod is not None:
try:
data = mu.read_h5ad(input_file, mod=mod)
# Remove obsp keys that are not in par["obsp_keys"]
if par["obsp_keys"]:
# Keep only the obsp keys that are specified in par["obsp_keys"]
keys_to_remove = set(data.obsp.keys()) - set(par["obsp_keys"])
for key in keys_to_remove:
del data.obsp[key]
mod_data[input_id] = data
mod_indices_combined = mod_indices_combined.append(data.obs.index)
except KeyError as e: # Modality does not exist for this sample, skip it
if (
f"Unable to synchronously open object (object '{mod}' doesn't exist)"
not in str(e)
):
raise e
pass
else: # When mod=None, process the 'global' h5mu state
with H5File(input_file, "r") as input_h5:
if "uns" in input_h5.keys():
uns_data = anndata.experimental.read_elem(input_h5["uns"])
if uns_data:
mod_data[input_id] = anndata.AnnData(uns=uns_data)
if not mod_indices_combined.is_unique:
raise ValueError("Observations are not unique across samples.")
if not mod_data:
return anndata.AnnData()
concatenated_data = anndata.concat(
mod_data.values(),
join="outer",
pairwise=True if par["obsp_keys"] else False,
merge=other_axis_mode_to_apply,
uns_merge=uns_merge_mode_to_apply,
)
if other_axis_mode == "move":
concatenated_data = split_conflicts_modalities(
n_processes, mod_data, concatenated_data
)
if uns_merge_mode == "make_unique":
concatenated_data = make_uns_keys_unique(mod_data, concatenated_data)
return concatenated_data
def concatenate_modalities(
n_processes: int,
modalities: list[str],
input_files: Path | str,
other_axis_mode: str,
uns_merge_mode: str,
output_file: Path | str,
compression: Literal["gzip"] | Literal["lzf"],
input_ids: tuple[str] | None = None,
) -> None:
"""
Join the modalities together into a single multimodal sample.
"""
logger.info("Concatenating samples.")
output_file, input_files = (
Path(output_file),
[Path(input_file) for input_file in input_files],
)
output_file_uncompressed = output_file.with_name(
output_file.stem + "_uncompressed.h5mu"
)
output_file_uncompressed.touch()
# Create empty mudata file
mdata = mu.MuData({modality: anndata.AnnData() for modality in modalities})
mdata.write(output_file_uncompressed, compression=compression)
# Use "None" for the global slots (not assigned to any modality)
for mod_name in modalities + [
None,
]:
new_mod = concatenate_modality(
n_processes,
mod_name,
input_files,
other_axis_mode,
uns_merge_mode,
input_ids,
)
if mod_name is None:
if new_mod.uns:
with H5File(output_file_uncompressed, "r+") as open_h5mu_file:
anndata.experimental.write_elem(
open_h5mu_file, "uns", dict(new_mod.uns)
)
continue
logger.info(
"Writing out modality '%s' to '%s' with compression '%s'.",
mod_name,
output_file_uncompressed,
compression,
)
mu.write_h5ad(output_file_uncompressed, data=new_mod, mod=mod_name)
if compression:
compress_h5mu(output_file_uncompressed, output_file, compression=compression)
output_file_uncompressed.unlink()
else:
shutil.move(output_file_uncompressed, output_file)
logger.info("Concatenation successful.")
def main() -> None:
# Get a list of all possible modalities
mods = set()
for path in par["input"]:
try:
with H5File(path, "r") as f_root:
mods = mods | set(f_root["mod"].keys())
except OSError:
raise OSError(f"Failed to load {path}. Is it a valid h5 file?")
input_ids = None
if par["input_id"]:
input_ids: tuple[str] = tuple(i.strip() for i in par["input_id"])
if len(input_ids) != len(par["input"]):
raise ValueError(
"The number of sample names must match the number of sample files."
)
if len(set(input_ids)) != len(input_ids):
raise ValueError("The sample names should be unique.")
logger.info("\nConcatenating data from paths:\n\t%s", "\n\t".join(par["input"]))
if par["other_axis_mode"] == "move" and not input_ids:
raise ValueError("--other_axis_mode 'move' requires --input_id.")
n_processes = meta["cpus"] if meta["cpus"] else 1
if par["modality"]:
par["modality"] = set(par["modality"])
if not par["modality"].issubset(mods):
mods_joined, input_mods_joined = ", ".join(mods), ", ".join(par["modality"])
raise ValueError(
f"One of the modalities provided ({input_mods_joined}) is not available in the input data {mods_joined}"
)
mods = par["modality"]
concatenate_modalities(
n_processes,
list(mods),
par["input"],
par["other_axis_mode"],
par["uns_merge_mode"],
par["output"],
par["output_compression"],
input_ids=input_ids,
)
if __name__ == "__main__":
main()
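
For reference, the `make_unique` behaviour of the `.uns` merge in this removed script can be illustrated with plain dictionaries: a key that occurs in one sample is copied as-is, while a key that occurs in several samples is prefixed with each sample id. The sketch below is a simplified stand-in that works on dicts instead of AnnData objects, and `make_keys_unique` is a hypothetical name.

```python
def make_keys_unique(per_sample):
    """Merge per-sample dicts; keys shared by several samples get a sample-id prefix."""
    owners = {}
    for sample_id, entries in per_sample.items():
        for key in entries:
            owners.setdefault(key, []).append(sample_id)

    merged = {}
    for key, sample_ids in owners.items():
        if len(sample_ids) == 1:
            merged[key] = per_sample[sample_ids[0]][key]
        else:
            for sample_id in sample_ids:
                merged[f"{sample_id}_{key}"] = per_sample[sample_id][key]
    return merged


result = make_keys_unique(
    {"mouse": {"metrics": 1, "genome": "mm10"}, "human": {"metrics": 2}}
)
```

Prefixing applies whenever a key appears in more than one sample, even if the stored values happen to be identical.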

File diff suppressed because it is too large

View File

@@ -23,7 +23,7 @@ for par in ${unset_if_false[@]}; do
[[ "$test_val" == "false" ]] && unset $par
done
# just to make sure paths are absolute
# Make sure paths are absolute
par_gex_reference=`realpath $par_gex_reference`
par_output=`realpath $par_output`
par_probe_set=`realpath $par_probe_set`

View File

@@ -439,14 +439,8 @@ test_resources:
engines:
- type: docker
image: nvidia/cuda:12.4.1-cudnn-devel-ubuntu22.04
image: pytorch/pytorch:2.6.0-cuda12.4-cudnn9-runtime
setup:
- type: apt
packages:
- libhdf5-dev
- python3-pip
- python3-dev
- python-is-python3
- type: docker
run: |
pip install torch --index-url https://download.pytorch.org/whl/cu124 \

View File

@@ -1,6 +1,7 @@
import sys
import json
import mudata as mu
import torch
from nichecompass.models import NicheCompass
from nichecompass.utils import add_gps_from_gp_dict_to_adata
@@ -91,6 +92,12 @@ from setup_logger import setup_logger
logger = setup_logger()
# Verify torch and CUDA availability
logger.info(f"Torch version: {torch.__version__}")
logger.info(f"CUDA available: {torch.cuda.is_available()}")
logger.info(f"Torch CUDA version: {torch.version.cuda}")
logger.info(f"GPU count: {torch.cuda.device_count()}")
## Read in data
adata = mu.read_h5ad(par["input"], mod=par["modality"])
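
The four log lines added above report the torch version, CUDA availability, the CUDA version torch was built against, and the GPU count. As a rough sketch, the same values can be gathered into a dictionary and used to pick a device. The function names `cuda_report` and `pick_device` are illustrative assumptions and are not part of the component; the fallback to `"cpu"` is likewise an assumption about what a caller might want.

```python
import importlib.util


def cuda_report() -> dict:
    """Collect the values that the logging statements print (hypothetical helper)."""
    if importlib.util.find_spec("torch") is None:
        # torch is not installed in this environment
        return {"torch_installed": False}
    import torch

    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "cuda_available": torch.cuda.is_available(),
        "torch_cuda_version": torch.version.cuda,
        "gpu_count": torch.cuda.device_count(),
    }


def pick_device(report: dict) -> str:
    """Return 'cuda' only when CUDA is usable and at least one GPU is visible."""
    if report.get("cuda_available") and report.get("gpu_count", 0) > 0:
        return "cuda"
    return "cpu"
```

A mismatch such as `cuda_available` being false while `torch_cuda_version` is set would point at the container runtime or drivers rather than at the torch build.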

View File

@@ -0,0 +1,211 @@
name: "spaceranger_mapping"
namespace: "workflows/ingestion"
scope: "public"
description: "A pipeline for running SpaceRanger mapping."
info:
name: SpaceRanger mapping
test_dependencies:
- name: spaceranger_mapping_test
namespace: test_workflows/ingestion
authors:
- __merge__: /src/authors/dorien_roosen.yaml
roles: [ maintainer ]
- __merge__: /src/authors/weiwei_schultz.yaml
roles: [ contributor ]
argument_groups:
- name: Inputs
arguments:
- name: "--id"
required: true
type: string
description: ID of the sample.
example: foo
- name: --input
type: file
required: true
multiple: true
description: |
The fastq.gz files to align. Can also be a single directory containing fastq.gz files.
Individual FASTQ files should follow the naming convention of 10x Genomics:
[Sample Name]_S[Sample Number]_L[Lane Number]_[Read Type]_001.fastq.gz
Where:
[Sample Name] is the name assigned during sample preparation/sequencing
S[Sample Number] is the sample index (usually S1, S2, etc.)
L[Lane Number] identifies the sequencing lane (L001, L002, etc.)
[Read Type] will be one of:
R1 - Read 1 (contains the spatial barcode and UMI)
R2 - Read 2 (contains the actual cDNA sequence)
example: [ "sample_S1_L001_R1_001.fastq.gz", "sample_S1_L001_R2_001.fastq.gz" ]
- name: --gex_reference
type: file
required: true
description: Path of folder containing 10x-compatible reference
example: "/path/to/refdata-gex-GRCh38-2020-A"
- name: --probe_set
type: file
required: true
description: CSV file specifying the probe set used
example: "Visium_Human_Transcriptome_Probe_Set_v2.0_GRCh38-2020-A.csv"
- name: --cytaimage
type: file
required: false
description: |
Brightfield image generated by the CytAssist instrument.
When using CytAssist workflow, either this or --image must be provided.
example: "cyta_image.tif"
- name: --image
type: file
required: false
description: |
H&E or fluorescence microscope image in TIFF or JPG format.
Required for standard Visium workflow, optional when using --cytaimage for CytAssist workflow.
example: "brightfield.tif"
- name: Outputs
arguments:
- name: "--output_raw"
type: file
direction: output
description: "Location where the output folder from Space Ranger will be stored."
required: true
example: output_dir/
- name: "--output_h5mu"
type: file
direction: output
description: "The output from Space Ranger, converted to h5mu."
required: true
example: output.h5mu
- name: "--output_type"
type: string
description: "Which Space Ranger output to use for converting to h5mu."
choices: [ raw, filtered ]
default: raw
- name: "--uns_metrics"
type: string
description: Name of the .uns slot under which to store QC metrics (if any).
default: "metrics_summary"
- name: "--uns_probe_set"
type: string
description: Name of the .uns slot under which to store probe set information (if any).
default: "probe_set"
- name: "--obsm_coordinates"
type: string
description: Name of the .obsm slot under which to store the cell centroid coordinates.
default: "spatial"
- name: "--output_compression"
type: string
description: Compression to use when writing the h5mu file.
choices: [ gzip, lzf ]
- name: Image Options
arguments:
- name: --darkimage
type: file
description: Multi-channel, dark-background fluorescence image
required: false
example: "fluorescence.tif"
- name: --colorizedimage
type: file
description: Color image representing pre-colored dark-background fluorescence images
required: false
example: "colored_fluorescence.tif"
- name: --dapi_index
type: integer
description: Index of DAPI channel (1-indexed) of fluorescence image
required: false
example: 1
min: 1
- name: --image_scale
type: double
description: Microns per microscope image pixel
required: false
example: 0.65
min: 0.01
max: 10
- name: --reorient_images
type: boolean
default: true
description: Whether to rotate and mirror image to align fiducial pattern
- name: Slide Information
arguments:
- name: --slide
type: string
description: Visium slide serial number (e.g., 'V10J25-015')
required: false
example: "V10J25-015"
- name: --area
type: string
description: Visium capture area identifier (e.g., 'A1')
required: false
example: "A1"
- name: --unknown_slide
type: string
description: |
Use this option if the slide serial number and area were entered incorrectly on the CytAssist
instrument and the correct values are unknown. Not compatible with the --slide, --area, or
--slidefile options.
required: false
choices: [visium-1, visium-2, visium-2-large, visium-hd]
- name: --slidefile
type: file
description: Slide design file for offline use
required: false
example: "slide_design.gpr"
- name: --override_id
type: boolean_true
description: Overrides the slide serial number and capture area provided in the CytAssist image metadata
- name: SpaceRanger arguments
arguments:
- name: --create_bam
type: boolean
required: true
description: Enable or disable BAM file generation
default: true
- name: --nosecondary
type: boolean_true
description: Disable secondary analysis (e.g., clustering)
- name: --r1_length
type: integer
required: false
description: Hard trim the input Read 1 to this length before analysis
min: 1
- name: --r2_length
type: integer
required: false
description: Hard trim the input Read 2 to this length before analysis
min: 1
- name: --filter_probes
type: boolean
default: true
description: Whether to filter the probe set using the "included" column
- name: --custom_bin_size
type: integer
description: Bin Visium HD data to the specified size in microns (4-100, even values only) in addition to the standard bin sizes (2 µm, 8 µm, 16 µm)
min: 4
max: 100
dependencies:
- name: mapping/spaceranger_count
- name: convert/from_spaceranger_to_h5mu
resources:
- type: nextflow_script
path: main.nf
entrypoint: run_wf
- type: file
path: /src/workflows/utils/
test_resources:
- type: nextflow_script
path: test.nf
entrypoint: test_wf
- path: /resources_test/visium
- path: /resources_test/GRCh38
runners:
- type: nextflow
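
The `--input` description in this configuration spells out the 10x Genomics FASTQ naming convention. A minimal sketch of a filename check based only on that description follows. The pattern and the helper name `parse_fastq_name` are assumptions made for illustration; in particular, only the `R1` and `R2` read types listed in the description are accepted.

```python
import os
import re
from typing import Optional

# [Sample Name]_S[Sample Number]_L[Lane Number]_[Read Type]_001.fastq.gz
FASTQ_PATTERN = re.compile(
    r"^(?P<sample>.+)_S(?P<sample_number>\d+)_L(?P<lane>\d+)"
    r"_(?P<read>R1|R2)_001\.fastq\.gz$"
)


def parse_fastq_name(path: str) -> Optional[dict]:
    """Split a FASTQ filename into its named parts, or return None if it does not match."""
    match = FASTQ_PATTERN.match(os.path.basename(path))
    return match.groupdict() if match else None
```

Calling `parse_fastq_name("sample_S1_L001_R1_001.fastq.gz")` yields the sample name, sample number, lane and read type as strings, while a name that does not follow the convention yields `None`.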

View File

@@ -0,0 +1,15 @@
#!/bin/bash
# get the root of the directory
REPO_ROOT=$(git rev-parse --show-toplevel)
# ensure that the command below is run from the root of the repository
cd "$REPO_ROOT"
nextflow \
run . \
-main-script src/workflows/ingestion/spaceranger_mapping/test.nf \
-entry test_wf \
-profile docker \
-c src/workflows/utils/labels_ci.config \
-c src/workflows/utils/integration_tests.config

View File

@@ -0,0 +1,57 @@
workflow run_wf {
take:
input_ch
main:
output_ch = input_ch
| spaceranger_count.run(
fromState: { id, state -> [
"input": state.input,
"gex_reference": state.gex_reference,
"probe_set": state.probe_set,
"cytaimage": state.cytaimage,
"image": state.image,
"slide": state.slide,
"area": state.area,
"unknown_slide": state.unknown_slide,
"slidefile": state.slidefile,
"override_id": state.override_id,
"darkimage": state.darkimage,
"colorizedimage": state.colorizedimage,
"dapi_index": state.dapi_index,
"image_scale": state.image_scale,
"reorient_images": state.reorient_images,
"create_bam": state.create_bam,
"nosecondary": state.nosecondary,
"r1_length": state.r1_length,
"r2_length": state.r2_length,
"filter_probes": state.filter_probes,
"custom_bin_size": state.custom_bin_size,
"output": state.output_raw,
]},
toState: [
"input": "output",
"output_raw": "output"
]
)
// convert to h5mu
| from_spaceranger_to_h5mu.run(
fromState: {id, state ->
[
"input": state.input,
"output_compression": state.output_compression,
"output": state.output_h5mu,
"uns_metrics": state.uns_metrics,
"uns_probe_set": state.uns_probe_set,
"obsm_coordinates": state.obsm_coordinates,
"output_type": state.output_type,
]
},
toState: ["output_h5mu": "output"]
)
| setState(["output_raw", "output_h5mu"])
emit:
output_ch
}
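
The workflow above threads a state map through two components using `fromState`, `toState` and `setState`. The sketch below models that flow with plain dictionaries. It reflects one reading of this workflow (a `toState` map goes from state key to component output key and is merged into the existing state); it is not Viash's implementation, and the function names and example paths are hypothetical.

```python
def to_state(state: dict, output: dict, mapping: dict) -> dict:
    """Merge component outputs into the state; mapping is {state_key: output_key}."""
    updates = {state_key: output[output_key] for state_key, output_key in mapping.items()}
    return {**state, **updates}


def set_state(state: dict, keys: list) -> dict:
    """Keep only the listed keys, as the final setState call does."""
    return {key: state[key] for key in keys}


# Initial state: raw FASTQ input plus the requested output locations
state = {"input": "fastqs/", "output_raw": "raw_out/", "output_h5mu": "out.h5mu"}

# After spaceranger_count: its output becomes both 'input' (for the converter) and 'output_raw'
state = to_state(state, {"output": "raw_out/"}, {"input": "output", "output_raw": "output"})

# After from_spaceranger_to_h5mu: its output is stored under 'output_h5mu'
state = to_state(state, {"output": "out.h5mu"}, {"output_h5mu": "output"})

final_state = set_state(state, ["output_raw", "output_h5mu"])
```

Overwriting `input` with the first component's output is what lets the converter read `state.input` without knowing which step produced it.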

View File

@@ -0,0 +1,10 @@
manifest {
nextflowVersion = '!>=20.12.1-edge'
}
params {
rootDir = java.nio.file.Paths.get("$projectDir/../../../../").toAbsolutePath().normalize().toString()
}
// include common settings
includeConfig("${params.rootDir}/src/workflows/utils/labels.config")

View File

@@ -0,0 +1,42 @@
nextflow.enable.dsl=2
include { spaceranger_mapping } from params.rootDir + "/target/nextflow/workflows/ingestion/spaceranger_mapping/main.nf"
include { spaceranger_mapping_test } from params.rootDir + "/target/_test/nextflow/test_workflows/ingestion/spaceranger_mapping_test/main.nf"
params.resources_test = params.rootDir + "/resources_test"
workflow test_wf {
resources_test = file(params.resources_test)
output_ch = Channel.fromList([
[
id: "foo",
input: resources_test.resolve("visium/Visium_FFPE_Human_Ovarian_Cancer_tiny"),
gex_reference: resources_test.resolve("GRCh38"),
image: resources_test.resolve("visium/Visium_FFPE_Human_Ovarian_Cancer_image_tiny.jpg"),
probe_set: resources_test.resolve("visium/Visium_FFPE_Human_Ovarian_Cancer_probe_set.csv"),
create_bam: "false",
slide: "V10L13-020",
area: "D1",
output_type: "filtered",
]
])
| map{ state -> [state.id, state] }
| spaceranger_mapping
| view { output ->
assert output.size() == 2 : "outputs should contain two elements; [id, out]"
assert output[1] instanceof Map : "Output should be a Map."
"Output: $output"
}
| spaceranger_mapping_test.run(
fromState: ["input": "output_h5mu"]
)
| toSortedList()
| map { output_list ->
assert output_list.size() == 1 : "output channel should contain one event"
assert output_list[0][0] == "foo" : "Output ID should be same as input ID"
}
}

View File

@@ -353,6 +353,7 @@ argument_groups:
dependencies:
- name: dataflow/concatenate_h5mu
repository: openpipeline
- name: neighbors/spatial_neighborhood_graph
- name: nichecompass/nichecompass
- name: dataflow/split_h5mu

View File

@@ -0,0 +1,25 @@
name: "spaceranger_mapping_test"
namespace: "test_workflows/ingestion"
scope: "test"
description: "This component tests the output of the integration test of the spaceranger mapping workflow."
authors:
- __merge__: /src/authors/dorien_roosen.yaml
argument_groups:
- name: Inputs
arguments:
- name: "--input"
type: file
required: true
description: Path to h5mu output.
example: foo.final.h5mu
resources:
- type: python_script
path: script.py
- path: /src/utils/setup_logger.py
engines:
- type: docker
image: python:3.12-slim
__merge__: /src/base/requirements/testworkflows_setup.yaml
runners:
- type: executable
- type: nextflow

View File

@@ -0,0 +1,32 @@
from mudata import read_h5mu
import sys
import pytest
##VIASH START
par = {"input": "input.h5mu"}
meta = {"resources_dir": "resources_test"}
##VIASH END
def test_run():
    input_mudata = read_h5mu(par["input"])
    expected_var_columns = ["gene_symbol", "feature_types", "genome"]
    assert list(input_mudata.mod.keys()) == ["rna"], (
        "Input should contain rna modality."
    )
    assert list(input_mudata.var.columns) == expected_var_columns, (
        f"Input var columns should be: {expected_var_columns}."
    )
    assert list(input_mudata.mod["rna"].var.columns) == expected_var_columns, (
        f"Input mod['rna'] var columns should be: {expected_var_columns}."
    )
    assert list(input_mudata.mod["rna"].obsm.keys()) == ["spatial"], (
        "Input mod['rna'] obsm should contain spatial column."
    )


if __name__ == "__main__":
    sys.exit(pytest.main([__file__, "--import-mode=importlib"]))

View File

@@ -112,7 +112,7 @@ repositories:
- type: "vsh"
name: "openpipeline"
repo: "openpipeline"
tag: "v3.0.0"
tag: "v4.0.0"
links:
repository: "https://github.com/openpipelines-bio/openpipeline_spatial"
docker_registry: "ghcr.io"
@@ -201,9 +201,8 @@ engines:
- type: "python"
user: false
packages:
- "spatialdata~=0.5.0"
- "pyarrow~=18.0.0"
- "squidpy~=1.6.5"
- "scanpy~=1.10.4"
- "squidpy~=1.7.0"
upgrade: true
test_setup:
- type: "apt"
@@ -228,7 +227,7 @@ build_info:
output: "target/_private/executable/filter/subset_cosmx"
executable: "target/_private/executable/filter/subset_cosmx/subset_cosmx"
viash_version: "0.9.4"
git_commit: "f91eceb7cf408169b2847c359c6e2acd77856ff7"
git_commit: "a4d81924a673566026c204c55add247504ef1c56"
git_remote: "https://github.com/openpipelines-bio/openpipeline_spatial"
package_config:
name: "openpipeline_spatial"
@@ -242,7 +241,7 @@ package_config:
- type: "vsh"
name: "openpipeline"
repo: "openpipeline"
tag: "v3.0.0"
tag: "v4.0.0"
viash_version: "0.9.4"
source: "src"
target: "target"

View File

@@ -454,13 +454,13 @@ RUN apt-get update && \
rm -rf /var/lib/apt/lists/*
RUN pip install --upgrade pip && \
pip install --upgrade --no-cache-dir "spatialdata~=0.5.0" "pyarrow~=18.0.0" "squidpy~=1.6.5"
pip install --upgrade --no-cache-dir "scanpy~=1.10.4" "squidpy~=1.7.0"
LABEL org.opencontainers.image.authors="Dorien Roosen, Weiwei Schultz"
LABEL org.opencontainers.image.description="Companion container for running component filter subset_cosmx"
LABEL org.opencontainers.image.created="2026-01-26T08:53:43Z"
LABEL org.opencontainers.image.created="2026-01-27T10:47:56Z"
LABEL org.opencontainers.image.source="https://github.com/openpipelines-bio/openpipeline_spatial"
LABEL org.opencontainers.image.revision="f91eceb7cf408169b2847c359c6e2acd77856ff7"
LABEL org.opencontainers.image.revision="a4d81924a673566026c204c55add247504ef1c56"
LABEL org.opencontainers.image.version="niche-compass"
VIASHDOCKER

View File

@@ -112,7 +112,7 @@ repositories:
- type: "vsh"
name: "openpipeline"
repo: "openpipeline"
tag: "v3.0.0"
tag: "v4.0.0"
links:
repository: "https://github.com/openpipelines-bio/openpipeline_spatial"
docker_registry: "ghcr.io"
@@ -201,9 +201,8 @@ engines:
- type: "python"
user: false
packages:
- "spatialdata~=0.5.0"
- "pyarrow~=18.0.0"
- "squidpy~=1.6.5"
- "scanpy~=1.10.4"
- "squidpy~=1.7.0"
upgrade: true
test_setup:
- type: "apt"
@@ -228,7 +227,7 @@ build_info:
output: "target/_private/nextflow/filter/subset_cosmx"
executable: "target/_private/nextflow/filter/subset_cosmx/main.nf"
viash_version: "0.9.4"
git_commit: "f91eceb7cf408169b2847c359c6e2acd77856ff7"
git_commit: "a4d81924a673566026c204c55add247504ef1c56"
git_remote: "https://github.com/openpipelines-bio/openpipeline_spatial"
package_config:
name: "openpipeline_spatial"
@@ -242,7 +241,7 @@ package_config:
- type: "vsh"
name: "openpipeline"
repo: "openpipeline"
tag: "v3.0.0"
tag: "v4.0.0"
viash_version: "0.9.4"
source: "src"
target: "target"

View File

@@ -3187,7 +3187,7 @@ meta = [
"type" : "vsh",
"name" : "openpipeline",
"repo" : "openpipeline",
"tag" : "v3.0.0"
"tag" : "v4.0.0"
}
],
"links" : {
@@ -3295,9 +3295,8 @@ meta = [
"type" : "python",
"user" : false,
"packages" : [
"spatialdata~=0.5.0",
"pyarrow~=18.0.0",
"squidpy~=1.6.5"
"scanpy~=1.10.4",
"squidpy~=1.7.0"
],
"upgrade" : true
}
@@ -3334,7 +3333,7 @@ meta = [
"engine" : "docker|native",
"output" : "/workdir/root/repo/target/_private/nextflow/filter/subset_cosmx",
"viash_version" : "0.9.4",
"git_commit" : "f91eceb7cf408169b2847c359c6e2acd77856ff7",
"git_commit" : "a4d81924a673566026c204c55add247504ef1c56",
"git_remote" : "https://github.com/openpipelines-bio/openpipeline_spatial"
},
"package_config" : {
@@ -3354,7 +3353,7 @@ meta = [
"type" : "vsh",
"name" : "openpipeline",
"repo" : "openpipeline",
"tag" : "v3.0.0"
"tag" : "v4.0.0"
}
],
"viash_version" : "0.9.4",

View File

@@ -0,0 +1,188 @@
name: "spaceranger_mapping_test"
namespace: "test_workflows/ingestion"
version: "niche-compass"
authors:
- name: "Dorien Roosen"
info:
role: "Core Team Member"
links:
email: "dorien@data-intuitive.com"
github: "dorien-er"
linkedin: "dorien-roosen"
organizations:
- name: "Data Intuitive"
href: "https://www.data-intuitive.com"
role: "Data Scientist"
argument_groups:
- name: "Inputs"
arguments:
- type: "file"
name: "--input"
description: "Path to h5mu output."
info: null
example:
- "foo.final.h5mu"
must_exist: true
create_parent: true
required: true
direction: "input"
multiple: false
multiple_sep: ";"
resources:
- type: "python_script"
path: "script.py"
is_executable: true
- type: "file"
path: "setup_logger.py"
- type: "file"
path: "nextflow_labels.config"
dest: "nextflow_labels.config"
description: "This component tests the output of the integration test of the spaceranger\
\ mapping workflow."
info: null
status: "enabled"
scope:
image: "test"
target: "test"
repositories:
- type: "vsh"
name: "openpipeline"
repo: "openpipeline"
tag: "v4.0.0"
links:
repository: "https://github.com/openpipelines-bio/openpipeline_spatial"
docker_registry: "ghcr.io"
runners:
- type: "executable"
id: "executable"
docker_setup_strategy: "ifneedbepullelsecachedbuild"
- type: "nextflow"
id: "nextflow"
directives:
tag: "$id"
auto:
simplifyInput: true
simplifyOutput: false
transcript: false
publish: false
config:
labels:
mem1gb: "memory = 1000000000.B"
mem2gb: "memory = 2000000000.B"
mem5gb: "memory = 5000000000.B"
mem10gb: "memory = 10000000000.B"
mem20gb: "memory = 20000000000.B"
mem50gb: "memory = 50000000000.B"
mem100gb: "memory = 100000000000.B"
mem200gb: "memory = 200000000000.B"
mem500gb: "memory = 500000000000.B"
mem1tb: "memory = 1000000000000.B"
mem2tb: "memory = 2000000000000.B"
mem5tb: "memory = 5000000000000.B"
mem10tb: "memory = 10000000000000.B"
mem20tb: "memory = 20000000000000.B"
mem50tb: "memory = 50000000000000.B"
mem100tb: "memory = 100000000000000.B"
mem200tb: "memory = 200000000000000.B"
mem500tb: "memory = 500000000000000.B"
mem1gib: "memory = 1073741824.B"
mem2gib: "memory = 2147483648.B"
mem4gib: "memory = 4294967296.B"
mem8gib: "memory = 8589934592.B"
mem16gib: "memory = 17179869184.B"
mem32gib: "memory = 34359738368.B"
mem64gib: "memory = 68719476736.B"
mem128gib: "memory = 137438953472.B"
mem256gib: "memory = 274877906944.B"
mem512gib: "memory = 549755813888.B"
mem1tib: "memory = 1099511627776.B"
mem2tib: "memory = 2199023255552.B"
mem4tib: "memory = 4398046511104.B"
mem8tib: "memory = 8796093022208.B"
mem16tib: "memory = 17592186044416.B"
mem32tib: "memory = 35184372088832.B"
mem64tib: "memory = 70368744177664.B"
mem128tib: "memory = 140737488355328.B"
mem256tib: "memory = 281474976710656.B"
mem512tib: "memory = 562949953421312.B"
cpu1: "cpus = 1"
cpu2: "cpus = 2"
cpu5: "cpus = 5"
cpu10: "cpus = 10"
cpu20: "cpus = 20"
cpu50: "cpus = 50"
cpu100: "cpus = 100"
cpu200: "cpus = 200"
cpu500: "cpus = 500"
cpu1000: "cpus = 1000"
script:
- "includeConfig(\"nextflow_labels.config\")"
debug: false
container: "docker"
engines:
- type: "docker"
id: "docker"
image: "python:3.12-slim"
target_registry: "images.viash-hub.com"
target_tag: "niche-compass"
namespace_separator: "/"
setup:
- type: "apt"
packages:
- "procps"
- "git"
interactive: false
- type: "python"
user: false
packages:
- "anndata~=0.12.7"
- "awkward"
- "mudata~=0.3.2"
- "viashpy==0.9.0"
github:
- "openpipelines-bio/core#subdirectory=packages/python/openpipeline_testutils"
script:
- "exec(\"try:\\n import zarr; from importlib.metadata import version\\nexcept\
\ ModuleNotFoundError:\\n exit(0)\\nelse: assert int(version(\\\"zarr\\\"\
).partition(\\\".\\\")[0]) > 2\")"
upgrade: true
entrypoint: []
cmd: null
- type: "native"
id: "native"
build_info:
config: "src/workflows/test_workflows/ingestion/spaceranger_mapping_test/config.vsh.yaml"
runner: "executable"
engine: "docker|native"
output: "target/_test/executable/test_workflows/ingestion/spaceranger_mapping_test"
executable: "target/_test/executable/test_workflows/ingestion/spaceranger_mapping_test/spaceranger_mapping_test"
viash_version: "0.9.4"
git_commit: "a4d81924a673566026c204c55add247504ef1c56"
git_remote: "https://github.com/openpipelines-bio/openpipeline_spatial"
package_config:
name: "openpipeline_spatial"
version: "niche-compass"
info:
test_resources:
- type: "s3"
path: "s3://openpipelines-bio/openpipeline_spatial/resources_test"
dest: "resources_test"
repositories:
- type: "vsh"
name: "openpipeline"
repo: "openpipeline"
tag: "v4.0.0"
viash_version: "0.9.4"
source: "src"
target: "target"
config_mods:
- ".resources += {path: '/src/workflows/utils/labels.config', dest: 'nextflow_labels.config'}\n\
.runners[.type == 'nextflow'].config.script := 'includeConfig(\"nextflow_labels.config\"\
)'"
- ".engines += { type: \"native\" }"
- ".engines[.type == 'docker'].target_registry := 'images.viash-hub.com'"
- ".engines[.type == 'docker'].target_tag := 'niche-compass'"
organization: "vsh"
links:
repository: "https://github.com/openpipelines-bio/openpipeline_spatial"
docker_registry: "ghcr.io"
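
The memory labels in the Nextflow runner configuration above come in two families: decimal units (`gb`, `tb`, powers of 10) and binary units (`gib`, `tib`, powers of 2). The sketch below regenerates the same byte values to make the arithmetic explicit. The function name `memory_labels` is an assumption for illustration and is not how Viash produces these labels.

```python
def memory_labels() -> dict:
    """Rebuild the mem* labels: decimal units use 10**n bytes, binary units use 2**n bytes."""
    labels = {}
    decimal_units = {"gb": 10**9, "tb": 10**12}
    binary_units = {"gib": 2**30, "tib": 2**40}
    # Decimal labels follow a 1-2-5 series
    for unit, factor in decimal_units.items():
        for amount in (1, 2, 5, 10, 20, 50, 100, 200, 500):
            labels[f"mem{amount}{unit}"] = f"memory = {amount * factor}.B"
    # Binary labels follow powers of two
    for unit, factor in binary_units.items():
        for amount in (1, 2, 4, 8, 16, 32, 64, 128, 256, 512):
            labels[f"mem{amount}{unit}"] = f"memory = {amount * factor}.B"
    return labels
```

For example, `mem1gb` is 1000000000 bytes whereas `mem1gib` is 1073741824 bytes, a difference of roughly 7 percent that grows with the unit.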

View File

@@ -47,7 +47,7 @@ repositories:
- type: "vsh"
name: "openpipeline"
repo: "openpipeline"
tag: "v3.0.0"
tag: "v4.0.0"
links:
repository: "https://github.com/openpipelines-bio/openpipeline_spatial"
docker_registry: "ghcr.io"
@@ -134,14 +134,16 @@ engines:
- type: "python"
user: false
packages:
- "anndata~=0.11.1"
- "mudata~=0.3.1"
- "anndata~=0.12.7"
- "awkward"
- "mudata~=0.3.2"
- "viashpy==0.9.0"
github:
- "openpipelines-bio/core#subdirectory=packages/python/openpipeline_testutils"
script:
- "exec(\"try:\\n import awkward\\nexcept ModuleNotFoundError:\\n exit(0)\\\
nelse: exit(1)\")"
- "exec(\"try:\\n import zarr; from importlib.metadata import version\\nexcept\
\ ModuleNotFoundError:\\n exit(0)\\nelse: assert int(version(\\\"zarr\\\"\
).partition(\\\".\\\")[0]) > 2\")"
upgrade: true
entrypoint: []
cmd: null
@@ -154,7 +156,7 @@ build_info:
output: "target/_test/executable/test_workflows/niche/nichecompass_leiden_test"
executable: "target/_test/executable/test_workflows/niche/nichecompass_leiden_test/nichecompass_leiden_test"
viash_version: "0.9.4"
git_commit: "f91eceb7cf408169b2847c359c6e2acd77856ff7"
git_commit: "a4d81924a673566026c204c55add247504ef1c56"
git_remote: "https://github.com/openpipelines-bio/openpipeline_spatial"
package_config:
name: "openpipeline_spatial"
@@ -168,7 +170,7 @@ package_config:
- type: "vsh"
name: "openpipeline"
repo: "openpipeline"
tag: "v3.0.0"
tag: "v4.0.0"
viash_version: "0.9.4"
source: "src"
target: "target"

View File

@@ -453,15 +453,15 @@ RUN apt-get update && \
rm -rf /var/lib/apt/lists/*
RUN pip install --upgrade pip && \
pip install --upgrade --no-cache-dir "anndata~=0.11.1" "mudata~=0.3.1" "viashpy==0.9.0" && \
pip install --upgrade --no-cache-dir "anndata~=0.12.7" "awkward" "mudata~=0.3.2" "viashpy==0.9.0" && \
pip install --upgrade --no-cache-dir "git+https://github.com/openpipelines-bio/core#subdirectory=packages/python/openpipeline_testutils" && \
python -c 'exec("try:\n import awkward\nexcept ModuleNotFoundError:\n exit(0)\nelse: exit(1)")'
python -c 'exec("try:\n import zarr; from importlib.metadata import version\nexcept ModuleNotFoundError:\n exit(0)\nelse: assert int(version(\"zarr\").partition(\".\")[0]) > 2")'
LABEL org.opencontainers.image.authors="Dorien Roosen"
LABEL org.opencontainers.image.description="Companion container for running component test_workflows/niche nichecompass_leiden_test"
LABEL org.opencontainers.image.created="2026-01-26T08:53:42Z"
LABEL org.opencontainers.image.created="2026-01-27T10:47:55Z"
LABEL org.opencontainers.image.source="https://github.com/openpipelines-bio/openpipeline_spatial"
LABEL org.opencontainers.image.revision="f91eceb7cf408169b2847c359c6e2acd77856ff7"
LABEL org.opencontainers.image.revision="a4d81924a673566026c204c55add247504ef1c56"
LABEL org.opencontainers.image.version="niche-compass"
VIASHDOCKER
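
The replaced build check in this Dockerfile is an `exec` one-liner: it exits successfully when `zarr` is not installed and otherwise asserts that the installed major version is greater than 2. An unrolled sketch of that logic is shown below. The helper names are hypothetical, and the sketch looks up the version through package metadata instead of attempting `import zarr`, which is a deliberate simplification.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def major_version(version_string: str) -> int:
    """Return the leading integer of a dotted version string, e.g. '3.0.8' -> 3."""
    return int(version_string.partition(".")[0])


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None when it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def zarr_check_passes(zarr_version: Optional[str]) -> bool:
    """Pass when zarr is absent, or when its major version is greater than 2."""
    if zarr_version is None:
        return True
    return major_version(zarr_version) > 2
```

In a build step this would be used as `zarr_check_passes(installed_version("zarr"))`, failing the build when an old zarr release is present.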

View File

@@ -0,0 +1,188 @@
name: "spaceranger_mapping_test"
namespace: "test_workflows/ingestion"
version: "niche-compass"
authors:
- name: "Dorien Roosen"
info:
role: "Core Team Member"
links:
email: "dorien@data-intuitive.com"
github: "dorien-er"
linkedin: "dorien-roosen"
organizations:
- name: "Data Intuitive"
href: "https://www.data-intuitive.com"
role: "Data Scientist"
argument_groups:
- name: "Inputs"
arguments:
- type: "file"
name: "--input"
description: "Path to h5mu output."
info: null
example:
- "foo.final.h5mu"
must_exist: true
create_parent: true
required: true
direction: "input"
multiple: false
multiple_sep: ";"
resources:
- type: "python_script"
path: "script.py"
is_executable: true
- type: "file"
path: "setup_logger.py"
- type: "file"
path: "nextflow_labels.config"
dest: "nextflow_labels.config"
description: "This component tests the output of the integration test of the spaceranger\
\ mapping workflow."
info: null
status: "enabled"
scope:
image: "test"
target: "test"
repositories:
- type: "vsh"
name: "openpipeline"
repo: "openpipeline"
tag: "v4.0.0"
links:
repository: "https://github.com/openpipelines-bio/openpipeline_spatial"
docker_registry: "ghcr.io"
runners:
- type: "executable"
id: "executable"
docker_setup_strategy: "ifneedbepullelsecachedbuild"
- type: "nextflow"
id: "nextflow"
directives:
tag: "$id"
auto:
simplifyInput: true
simplifyOutput: false
transcript: false
publish: false
config:
labels:
mem1gb: "memory = 1000000000.B"
mem2gb: "memory = 2000000000.B"
mem5gb: "memory = 5000000000.B"
mem10gb: "memory = 10000000000.B"
mem20gb: "memory = 20000000000.B"
mem50gb: "memory = 50000000000.B"
mem100gb: "memory = 100000000000.B"
mem200gb: "memory = 200000000000.B"
mem500gb: "memory = 500000000000.B"
mem1tb: "memory = 1000000000000.B"
mem2tb: "memory = 2000000000000.B"
mem5tb: "memory = 5000000000000.B"
mem10tb: "memory = 10000000000000.B"
mem20tb: "memory = 20000000000000.B"
mem50tb: "memory = 50000000000000.B"
mem100tb: "memory = 100000000000000.B"
mem200tb: "memory = 200000000000000.B"
mem500tb: "memory = 500000000000000.B"
mem1gib: "memory = 1073741824.B"
mem2gib: "memory = 2147483648.B"
mem4gib: "memory = 4294967296.B"
mem8gib: "memory = 8589934592.B"
mem16gib: "memory = 17179869184.B"
mem32gib: "memory = 34359738368.B"
mem64gib: "memory = 68719476736.B"
mem128gib: "memory = 137438953472.B"
mem256gib: "memory = 274877906944.B"
mem512gib: "memory = 549755813888.B"
mem1tib: "memory = 1099511627776.B"
mem2tib: "memory = 2199023255552.B"
mem4tib: "memory = 4398046511104.B"
mem8tib: "memory = 8796093022208.B"
mem16tib: "memory = 17592186044416.B"
mem32tib: "memory = 35184372088832.B"
mem64tib: "memory = 70368744177664.B"
mem128tib: "memory = 140737488355328.B"
mem256tib: "memory = 281474976710656.B"
mem512tib: "memory = 562949953421312.B"
cpu1: "cpus = 1"
cpu2: "cpus = 2"
cpu5: "cpus = 5"
cpu10: "cpus = 10"
cpu20: "cpus = 20"
cpu50: "cpus = 50"
cpu100: "cpus = 100"
cpu200: "cpus = 200"
cpu500: "cpus = 500"
cpu1000: "cpus = 1000"
script:
- "includeConfig(\"nextflow_labels.config\")"
debug: false
container: "docker"
engines:
- type: "docker"
id: "docker"
image: "python:3.12-slim"
target_registry: "images.viash-hub.com"
target_tag: "niche-compass"
namespace_separator: "/"
setup:
- type: "apt"
packages:
- "procps"
- "git"
interactive: false
- type: "python"
user: false
packages:
- "anndata~=0.12.7"
- "awkward"
- "mudata~=0.3.2"
- "viashpy==0.9.0"
github:
- "openpipelines-bio/core#subdirectory=packages/python/openpipeline_testutils"
script:
- "exec(\"try:\\n import zarr; from importlib.metadata import version\\nexcept\
\ ModuleNotFoundError:\\n exit(0)\\nelse: assert int(version(\\\"zarr\\\"\
).partition(\\\".\\\")[0]) > 2\")"
upgrade: true
entrypoint: []
cmd: null
- type: "native"
id: "native"
build_info:
config: "src/workflows/test_workflows/ingestion/spaceranger_mapping_test/config.vsh.yaml"
runner: "nextflow"
engine: "docker|native"
output: "target/_test/nextflow/test_workflows/ingestion/spaceranger_mapping_test"
executable: "target/_test/nextflow/test_workflows/ingestion/spaceranger_mapping_test/main.nf"
viash_version: "0.9.4"
git_commit: "a4d81924a673566026c204c55add247504ef1c56"
git_remote: "https://github.com/openpipelines-bio/openpipeline_spatial"
package_config:
name: "openpipeline_spatial"
version: "niche-compass"
info:
test_resources:
- type: "s3"
path: "s3://openpipelines-bio/openpipeline_spatial/resources_test"
dest: "resources_test"
repositories:
- type: "vsh"
name: "openpipeline"
repo: "openpipeline"
tag: "v4.0.0"
viash_version: "0.9.4"
source: "src"
target: "target"
config_mods:
- ".resources += {path: '/src/workflows/utils/labels.config', dest: 'nextflow_labels.config'}\n\
.runners[.type == 'nextflow'].config.script := 'includeConfig(\"nextflow_labels.config\"\
)'"
- ".engines += { type: \"native\" }"
- ".engines[.type == 'docker'].target_registry := 'images.viash-hub.com'"
- ".engines[.type == 'docker'].target_tag := 'niche-compass'"
organization: "vsh"
links:
repository: "https://github.com/openpipelines-bio/openpipeline_spatial"
docker_registry: "ghcr.io"

File diff suppressed because it is too large

View File

@@ -1,10 +1,10 @@
manifest {
name = 'dataflow/concatenate_h5mu'
name = 'test_workflows/ingestion/spaceranger_mapping_test'
mainScript = 'main.nf'
nextflowVersion = '!>=20.12.1-edge'
version = 'niche-compass'
description = 'Concatenate observations from samples in several (uni- and/or multi-modal) MuData files into a single file.\n'
author = 'Dries Schaumont'
description = 'This component tests the output of the integration test of the spaceranger mapping workflow.'
author = 'Dorien Roosen'
}
process.container = 'nextflow/bash:latest'

View File

@@ -47,7 +47,7 @@ repositories:
- type: "vsh"
name: "openpipeline"
repo: "openpipeline"
tag: "v3.0.0"
tag: "v4.0.0"
links:
repository: "https://github.com/openpipelines-bio/openpipeline_spatial"
docker_registry: "ghcr.io"
@@ -134,14 +134,16 @@ engines:
- type: "python"
user: false
packages:
- "anndata~=0.11.1"
- "mudata~=0.3.1"
- "anndata~=0.12.7"
- "awkward"
- "mudata~=0.3.2"
- "viashpy==0.9.0"
github:
- "openpipelines-bio/core#subdirectory=packages/python/openpipeline_testutils"
script:
- "exec(\"try:\\n import awkward\\nexcept ModuleNotFoundError:\\n exit(0)\\\
nelse: exit(1)\")"
- "exec(\"try:\\n import zarr; from importlib.metadata import version\\nexcept\
\ ModuleNotFoundError:\\n exit(0)\\nelse: assert int(version(\\\"zarr\\\"\
).partition(\\\".\\\")[0]) > 2\")"
upgrade: true
entrypoint: []
cmd: null
@@ -154,7 +156,7 @@ build_info:
output: "target/_test/nextflow/test_workflows/niche/nichecompass_leiden_test"
executable: "target/_test/nextflow/test_workflows/niche/nichecompass_leiden_test/main.nf"
viash_version: "0.9.4"
git_commit: "f91eceb7cf408169b2847c359c6e2acd77856ff7"
git_commit: "a4d81924a673566026c204c55add247504ef1c56"
git_remote: "https://github.com/openpipelines-bio/openpipeline_spatial"
package_config:
name: "openpipeline_spatial"
@@ -168,7 +170,7 @@ package_config:
- type: "vsh"
name: "openpipeline"
repo: "openpipeline"
tag: "v3.0.0"
tag: "v4.0.0"
viash_version: "0.9.4"
source: "src"
target: "target"

View File

@@ -3104,7 +3104,7 @@ meta = [
"type" : "vsh",
"name" : "openpipeline",
"repo" : "openpipeline",
"tag" : "v3.0.0"
"tag" : "v4.0.0"
}
],
"links" : {
@@ -3209,15 +3209,16 @@ meta = [
"type" : "python",
"user" : false,
"packages" : [
"anndata~=0.11.1",
"mudata~=0.3.1",
"anndata~=0.12.7",
"awkward",
"mudata~=0.3.2",
"viashpy==0.9.0"
],
"github" : [
"openpipelines-bio/core#subdirectory=packages/python/openpipeline_testutils"
],
"script" : [
"exec(\\"try:\\\\n import awkward\\\\nexcept ModuleNotFoundError:\\\\n exit(0)\\\\nelse: exit(1)\\")"
"exec(\\"try:\\\\n import zarr; from importlib.metadata import version\\\\nexcept ModuleNotFoundError:\\\\n exit(0)\\\\nelse: assert int(version(\\\\\\"zarr\\\\\\").partition(\\\\\\".\\\\\\")[0]) > 2\\")"
],
"upgrade" : true
}
@@ -3234,7 +3235,7 @@ meta = [
"engine" : "docker|native",
"output" : "/workdir/root/repo/target/_test/nextflow/test_workflows/niche/nichecompass_leiden_test",
"viash_version" : "0.9.4",
"git_commit" : "f91eceb7cf408169b2847c359c6e2acd77856ff7",
"git_commit" : "a4d81924a673566026c204c55add247504ef1c56",
"git_remote" : "https://github.com/openpipelines-bio/openpipeline_spatial"
},
"package_config" : {
@@ -3254,7 +3255,7 @@ meta = [
"type" : "vsh",
"name" : "openpipeline",
"repo" : "openpipeline",
"tag" : "v3.0.0"
"tag" : "v4.0.0"
}
],
"viash_version" : "0.9.4",

@@ -1,110 +0,0 @@
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"title": "concatenate_h5mu",
"description": "Concatenate observations from samples in several (uni- and/or multi-modal) MuData files into a single file.\n",
"type": "object",
"$defs": {
"arguments": {
"title": "Arguments",
"type": "object",
"description": "No description",
"properties": {
"input": {
"type": "array",
"items": {
"type": "string"
},
"format": "path",
"exists": true,
"description": "Paths to the different samples to be concatenated.",
"help_text": "Type: `file`, multiple: `True`, required, direction: `input`, example: `[\"sample_paths\"]`. "
},
"modality": {
"type": "array",
"items": {
"type": "string"
},
"description": "Only output concatenated objects for the provided modalities",
"help_text": "Type: `string`, multiple: `True`. "
},
"input_id": {
"type": "array",
"items": {
"type": "string"
},
"description": "Names of the different samples that have to be concatenated",
"help_text": "Type: `string`, multiple: `True`. "
},
"output": {
"type": "string",
"format": "path",
"description": "Output location for the concatenated MuData object file.\n",
"help_text": "Type: `file`, multiple: `False`, default: `\"$id.$key.output.h5mu\"`, direction: `output`, example: `\"output.h5mu\"`. ",
"default": "$id.$key.output.h5mu"
},
"obs_sample_name": {
"type": "string",
"description": "Name of the .obs key under which to add the sample names.",
"help_text": "Type: `string`, multiple: `False`, default: `\"sample_id\"`. ",
"default": "sample_id"
},
"other_axis_mode": {
"type": "string",
"description": "How to handle the merging of other axis (var, obs, ...).\n\n - None: keep no data\n - same: only keep elements of the matrices which are the same in each of the samples\n - unique: only keep elements for which there is only 1 possible value (1 value that can occur in multiple samples)\n - first: keep the annotation from the first sample\n - only: keep elements that show up in only one of the objects (1 unique element in only 1 sample)\n - move: identical to 'same', but moving the conflicting values to .varm or .obsm\n",
"help_text": "Type: `string`, multiple: `False`, default: `\"move\"`, choices: ``same`, `unique`, `first`, `only`, `concat`, `move``. ",
"enum": [
"same",
"unique",
"first",
"only",
"concat",
"move"
],
"default": "move"
},
"uns_merge_mode": {
"type": "string",
"description": "How to handle the merging of .uns across modalities\n - None: keep no data\n - same: only keep elements of the matrices which are the same in each of the samples\n - unique: only keep elements for which there is only 1 possible value (1 value that can occur in multiple samples)\n - first: keep the annotation from the first sample\n - only: keep elements that show up in only one of the objects (1 unique element in only 1 sample)\n - make_unique: identical to 'unique', but keys which are not unique are made unique by prefixing them with the sample id.\n",
"help_text": "Type: `string`, multiple: `False`, default: `\"make_unique\"`, choices: ``same`, `unique`, `first`, `only`, `make_unique``. ",
"enum": [
"same",
"unique",
"first",
"only",
"make_unique"
],
"default": "make_unique"
},
"output_compression": {
"type": "string",
"description": "Compression format to use for the output AnnData and/or Mudata objects.\nBy default no compression is applied.\n",
"help_text": "Type: `string`, multiple: `False`, example: `\"gzip\"`, choices: ``gzip`, `lzf``. ",
"enum": [
"gzip",
"lzf"
]
}
}
},
"nextflow input-output arguments": {
"title": "Nextflow input-output arguments",
"type": "object",
"description": "Input/output parameters for Nextflow itself. Please note that both publishDir and publish_dir are supported but at least one has to be configured.",
"properties": {
"publish_dir": {
"type": "string",
"description": "Path to an output directory.",
"help_text": "Type: `string`, multiple: `False`, required, example: `\"output/\"`. "
}
}
}
},
"allOf": [
{
"$ref": "#/$defs/arguments"
},
{
"$ref": "#/$defs/nextflow input-output arguments"
}
]
}

@@ -1,6 +1,6 @@
name: "split_modalities"
namespace: "workflows/multiomics"
version: "v3.0.0"
version: "v4.0.0"
authors:
- name: "Dries Schaumont"
roles:
@@ -183,14 +183,13 @@ build_info:
output: "target/_private/nextflow/workflows/multiomics/split_modalities"
executable: "target/_private/nextflow/workflows/multiomics/split_modalities/main.nf"
viash_version: "0.9.4"
git_commit: "e92e56b49125af8ef2ebb11586191a6cbf9a8457"
git_commit: "de02293c9e13198622b988dac952b2c8c70a1e35"
git_remote: "https://github.com/openpipelines-bio/openpipeline"
git_tag: "0.2.0-2059-ge92e56b4"
dependencies:
- "target/nextflow/dataflow/split_modalities"
package_config:
name: "openpipeline"
version: "v3.0.0"
version: "v4.0.0"
summary: "Best-practice workflows for single-cell multi-omics analyses.\n"
description: "OpenPipelines are extensible single cell analysis pipelines for reproducible\
\ and large-scale single cell processing using [Viash](https://viash.io) and [Nextflow](https://www.nextflow.io/).\n\
@@ -220,7 +219,7 @@ package_config:
)'"
- ".engines += { type: \"native\" }"
- ".engines[.type == 'docker'].target_registry := 'images.viash-hub.com'"
- ".engines[.type == 'docker'].target_tag := 'v3.0.0'"
- ".engines[.type == 'docker'].target_tag := 'v4.0.0'"
keywords:
- "single-cell"
- "multimodal"

@@ -1,4 +1,4 @@
// split_modalities v3.0.0
// split_modalities v4.0.0
//
// This wrapper script is auto-generated by viash 0.9.4 and is thus a derivative
// work thereof. This software comes with ABSOLUTELY NO WARRANTY from Data
@@ -3035,7 +3035,7 @@ meta = [
"config": processConfig(readJsonBlob('''{
"name" : "split_modalities",
"namespace" : "workflows/multiomics",
"version" : "v3.0.0",
"version" : "v4.0.0",
"authors" : [
{
"name" : "Dries Schaumont",
@@ -3274,13 +3274,12 @@ meta = [
"engine" : "native",
"output" : "/workdir/root/repo/target/_private/nextflow/workflows/multiomics/split_modalities",
"viash_version" : "0.9.4",
"git_commit" : "e92e56b49125af8ef2ebb11586191a6cbf9a8457",
"git_remote" : "https://github.com/openpipelines-bio/openpipeline",
"git_tag" : "0.2.0-2059-ge92e56b4"
"git_commit" : "de02293c9e13198622b988dac952b2c8c70a1e35",
"git_remote" : "https://github.com/openpipelines-bio/openpipeline"
},
"package_config" : {
"name" : "openpipeline",
"version" : "v3.0.0",
"version" : "v4.0.0",
"summary" : "Best-practice workflows for single-cell multi-omics analyses.\n",
"description" : "OpenPipelines are extensible single cell analysis pipelines for reproducible and large-scale single cell processing using [Viash](https://viash.io) and [Nextflow](https://www.nextflow.io/).\n\nIn terms of workflows, the following has been made available, but keep in mind that\nindividual tools and functionality can be executed as standalone components as well.\n\n * Demultiplexing: conversion of raw sequencing data to FASTQ objects.\n * Ingestion: Read mapping and generating a count matrix.\n * Single sample processing: cell filtering and doublet detection.\n * Multisample processing: Count transformation, normalization, QC metric calulations.\n * Integration: Clustering, integration and batch correction using single and multimodal methods.\n * Downstream analysis workflows\n",
"info" : {
@@ -3305,7 +3304,7 @@ meta = [
".resources += {path: '/src/workflows/utils/labels.config', dest: 'nextflow_labels.config'}\n.runners[.type == 'nextflow'].config.script := 'includeConfig(\\"nextflow_labels.config\\")'",
".engines += { type: \\"native\\" }",
".engines[.type == 'docker'].target_registry := 'images.viash-hub.com'",
".engines[.type == 'docker'].target_tag := 'v3.0.0'"
".engines[.type == 'docker'].target_tag := 'v4.0.0'"
],
"keywords" : [
"single-cell",

@@ -2,7 +2,7 @@ manifest {
name = 'workflows/multiomics/split_modalities'
mainScript = 'main.nf'
nextflowVersion = '!>=20.12.1-edge'
version = 'v3.0.0'
version = 'v4.0.0'
description = 'A pipeline to split a multimodal MuData file into several unimodal MuData files.'
author = 'Dries Schaumont'
}

@@ -0,0 +1,250 @@
name: "log_normalize"
namespace: "workflows/rna"
version: "v4.0.0"
authors:
- name: "Dries Schaumont"
roles:
- "author"
info:
role: "Core Team Member"
links:
email: "dries@data-intuitive.com"
github: "DriesSchaumont"
orcid: "0000-0002-4389-0440"
linkedin: "dries-schaumont"
organizations:
- name: "Data Intuitive"
href: "https://www.data-intuitive.com"
role: "Data Scientist"
argument_groups:
- name: "Inputs"
arguments:
- type: "file"
name: "--input"
description: "MuData file to transform."
info: null
example:
- "dataset.h5mu"
must_exist: true
create_parent: true
required: true
direction: "input"
multiple: false
multiple_sep: ";"
- type: "string"
name: "--modality"
description: "Modality to process."
info: null
default:
- "rna"
required: false
direction: "input"
multiple: false
multiple_sep: ";"
- type: "string"
name: "--layer"
description: "Input layer containing raw counts. If not specified, .X is used."
info: null
required: false
direction: "input"
multiple: false
multiple_sep: ";"
- name: "Transformation options"
arguments:
- type: "integer"
name: "--target_sum"
description: "Normalize total counts to the specified amount. If not set, after\
\ normalization each observation (cell) \nwill have a total count equal to the\
\ median of total counts for observations (cells) before normalization.\n"
info: null
required: false
min: 1
direction: "input"
multiple: false
multiple_sep: ";"
- name: "Output slots"
arguments:
- type: "string"
name: "--output_layer"
description: "Layer to write the log-transformed counts to.\n"
info: null
required: true
direction: "input"
multiple: false
multiple_sep: ";"
- name: "Output"
arguments:
- type: "file"
name: "--output"
description: "Destination path to the output."
info: null
example:
- "output.h5mu"
must_exist: true
create_parent: true
required: true
direction: "output"
multiple: false
multiple_sep: ";"
resources:
- type: "nextflow_script"
path: "main.nf"
is_executable: true
entrypoint: "run_wf"
- type: "file"
path: "utils"
- type: "file"
path: "nextflow_labels.config"
dest: "nextflow_labels.config"
description: "Performs normalization and subsequent log-transformation of raw count\
\ data."
test_resources:
- type: "nextflow_script"
path: "test.nf"
is_executable: true
entrypoint: "test_wf"
- type: "file"
path: "pbmc_1k_protein_v3"
info: null
status: "enabled"
scope:
image: "private"
target: "private"
dependencies:
- name: "transform/normalize_total"
repository:
type: "local"
- name: "transform/log1p"
repository:
type: "local"
- name: "transform/delete_layer"
repository:
type: "local"
license: "MIT"
links:
repository: "https://github.com/openpipelines-bio/openpipeline"
docker_registry: "ghcr.io"
runners:
- type: "nextflow"
id: "nextflow"
directives:
tag: "$id"
auto:
simplifyInput: true
simplifyOutput: false
transcript: false
publish: false
config:
labels:
mem1gb: "memory = 1000000000.B"
mem2gb: "memory = 2000000000.B"
mem5gb: "memory = 5000000000.B"
mem10gb: "memory = 10000000000.B"
mem20gb: "memory = 20000000000.B"
mem50gb: "memory = 50000000000.B"
mem100gb: "memory = 100000000000.B"
mem200gb: "memory = 200000000000.B"
mem500gb: "memory = 500000000000.B"
mem1tb: "memory = 1000000000000.B"
mem2tb: "memory = 2000000000000.B"
mem5tb: "memory = 5000000000000.B"
mem10tb: "memory = 10000000000000.B"
mem20tb: "memory = 20000000000000.B"
mem50tb: "memory = 50000000000000.B"
mem100tb: "memory = 100000000000000.B"
mem200tb: "memory = 200000000000000.B"
mem500tb: "memory = 500000000000000.B"
mem1gib: "memory = 1073741824.B"
mem2gib: "memory = 2147483648.B"
mem4gib: "memory = 4294967296.B"
mem8gib: "memory = 8589934592.B"
mem16gib: "memory = 17179869184.B"
mem32gib: "memory = 34359738368.B"
mem64gib: "memory = 68719476736.B"
mem128gib: "memory = 137438953472.B"
mem256gib: "memory = 274877906944.B"
mem512gib: "memory = 549755813888.B"
mem1tib: "memory = 1099511627776.B"
mem2tib: "memory = 2199023255552.B"
mem4tib: "memory = 4398046511104.B"
mem8tib: "memory = 8796093022208.B"
mem16tib: "memory = 17592186044416.B"
mem32tib: "memory = 35184372088832.B"
mem64tib: "memory = 70368744177664.B"
mem128tib: "memory = 140737488355328.B"
mem256tib: "memory = 281474976710656.B"
mem512tib: "memory = 562949953421312.B"
cpu1: "cpus = 1"
cpu2: "cpus = 2"
cpu5: "cpus = 5"
cpu10: "cpus = 10"
cpu20: "cpus = 20"
cpu50: "cpus = 50"
cpu100: "cpus = 100"
cpu200: "cpus = 200"
cpu500: "cpus = 500"
cpu1000: "cpus = 1000"
script:
- "includeConfig(\"nextflow_labels.config\")"
debug: false
container: "docker"
engines:
- type: "native"
id: "native"
build_info:
config: "src/workflows/rna/log_normalize/config.vsh.yaml"
runner: "nextflow"
engine: "native"
output: "target/_private/nextflow/workflows/rna/log_normalize"
executable: "target/_private/nextflow/workflows/rna/log_normalize/main.nf"
viash_version: "0.9.4"
git_commit: "de02293c9e13198622b988dac952b2c8c70a1e35"
git_remote: "https://github.com/openpipelines-bio/openpipeline"
dependencies:
- "target/nextflow/transform/normalize_total"
- "target/nextflow/transform/log1p"
- "target/nextflow/transform/delete_layer"
package_config:
name: "openpipeline"
version: "v4.0.0"
summary: "Best-practice workflows for single-cell multi-omics analyses.\n"
description: "OpenPipelines are extensible single cell analysis pipelines for reproducible\
\ and large-scale single cell processing using [Viash](https://viash.io) and [Nextflow](https://www.nextflow.io/).\n\
\nIn terms of workflows, the following has been made available, but keep in mind\
\ that\nindividual tools and functionality can be executed as standalone components\
\ as well.\n\n * Demultiplexing: conversion of raw sequencing data to FASTQ objects.\n\
\ * Ingestion: Read mapping and generating a count matrix.\n * Single sample\
\ processing: cell filtering and doublet detection.\n * Multisample processing:\
\ Count transformation, normalization, QC metric calculations.\n * Integration:\
\ Clustering, integration and batch correction using single and multimodal methods.\n\
\ * Downstream analysis workflows\n"
info:
test_resources:
- type: "s3"
path: "s3://openpipelines-data"
dest: "resources_test"
nextflow_labels_ci:
- path: "src/workflows/utils/labels_ci.config"
description: "Adds the correct memory and CPU labels when running on the Viash\
\ Hub CI."
viash_version: "0.9.4"
source: "src"
target: "target"
config_mods:
- ".resources += {path: '/src/workflows/utils/labels.config', dest: 'nextflow_labels.config'}\n\
.runners[.type == 'nextflow'].config.script := 'includeConfig(\"nextflow_labels.config\"\
)'"
- ".engines += { type: \"native\" }"
- ".engines[.type == 'docker'].target_registry := 'images.viash-hub.com'"
- ".engines[.type == 'docker'].target_tag := 'v4.0.0'"
keywords:
- "single-cell"
- "multimodal"
license: "MIT"
organization: "vsh"
links:
repository: "https://github.com/openpipelines-bio/openpipeline"
docker_registry: "ghcr.io"
homepage: "https://openpipelines.bio"
documentation: "https://openpipelines.bio/fundamentals"
issue_tracker: "https://github.com/openpipelines-bio/openpipeline/issues"
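
The `log_normalize` configuration above lists `transform/normalize_total` and `transform/log1p` as dependencies, and the `--target_sum` description says the median of the per-cell totals is used when no value is given. A minimal pure-Python sketch of that arithmetic follows, assuming the two steps are simply chained. The function name `log_normalize` and the list-of-lists input are illustrative only, and the real components may treat empty cells differently.

```python
import math
from statistics import median


def log_normalize(counts, target_sum=None):
    """Scale each row (cell) to a common total, then apply log(1 + x)."""
    totals = [sum(row) for row in counts]
    if target_sum is None:
        # Fall back to the median of the per-cell totals, as the
        # --target_sum description states.
        target_sum = median(totals)
    normalized = []
    for row, total in zip(counts, totals):
        # Cells without any counts are left at zero (an assumption).
        scale = target_sum / total if total > 0 else 0.0
        normalized.append([math.log1p(value * scale) for value in row])
    return normalized
```

Undoing the log transform with `math.expm1` and summing a row recovers `target_sum`, which is a quick way to check the scaling step.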

@@ -0,0 +1,126 @@
manifest {
name = 'workflows/rna/log_normalize'
mainScript = 'main.nf'
nextflowVersion = '!>=20.12.1-edge'
version = 'v4.0.0'
description = 'Performs normalization and subsequent log-transformation of raw count data.'
author = 'Dries Schaumont'
}
process.container = 'nextflow/bash:latest'
// detect tempdir
tempDir = java.nio.file.Paths.get(
System.getenv('NXF_TEMP') ?:
System.getenv('VIASH_TEMP') ?:
System.getenv('TEMPDIR') ?:
System.getenv('TMPDIR') ?:
'/tmp'
).toAbsolutePath()
profiles {
no_publish {
process {
withName: '.*' {
publishDir = [
enabled: false
]
}
}
}
mount_temp {
docker.temp = tempDir
podman.temp = tempDir
charliecloud.temp = tempDir
}
docker {
docker.enabled = true
// docker.userEmulation = true
singularity.enabled = false
podman.enabled = false
shifter.enabled = false
charliecloud.enabled = false
}
singularity {
singularity.enabled = true
singularity.autoMounts = true
docker.enabled = false
podman.enabled = false
shifter.enabled = false
charliecloud.enabled = false
}
podman {
podman.enabled = true
docker.enabled = false
singularity.enabled = false
shifter.enabled = false
charliecloud.enabled = false
}
shifter {
shifter.enabled = true
docker.enabled = false
singularity.enabled = false
podman.enabled = false
charliecloud.enabled = false
}
charliecloud {
charliecloud.enabled = true
docker.enabled = false
singularity.enabled = false
podman.enabled = false
shifter.enabled = false
}
}
process{
withLabel: mem1gb { memory = 1000000000.B }
withLabel: mem2gb { memory = 2000000000.B }
withLabel: mem5gb { memory = 5000000000.B }
withLabel: mem10gb { memory = 10000000000.B }
withLabel: mem20gb { memory = 20000000000.B }
withLabel: mem50gb { memory = 50000000000.B }
withLabel: mem100gb { memory = 100000000000.B }
withLabel: mem200gb { memory = 200000000000.B }
withLabel: mem500gb { memory = 500000000000.B }
withLabel: mem1tb { memory = 1000000000000.B }
withLabel: mem2tb { memory = 2000000000000.B }
withLabel: mem5tb { memory = 5000000000000.B }
withLabel: mem10tb { memory = 10000000000000.B }
withLabel: mem20tb { memory = 20000000000000.B }
withLabel: mem50tb { memory = 50000000000000.B }
withLabel: mem100tb { memory = 100000000000000.B }
withLabel: mem200tb { memory = 200000000000000.B }
withLabel: mem500tb { memory = 500000000000000.B }
withLabel: mem1gib { memory = 1073741824.B }
withLabel: mem2gib { memory = 2147483648.B }
withLabel: mem4gib { memory = 4294967296.B }
withLabel: mem8gib { memory = 8589934592.B }
withLabel: mem16gib { memory = 17179869184.B }
withLabel: mem32gib { memory = 34359738368.B }
withLabel: mem64gib { memory = 68719476736.B }
withLabel: mem128gib { memory = 137438953472.B }
withLabel: mem256gib { memory = 274877906944.B }
withLabel: mem512gib { memory = 549755813888.B }
withLabel: mem1tib { memory = 1099511627776.B }
withLabel: mem2tib { memory = 2199023255552.B }
withLabel: mem4tib { memory = 4398046511104.B }
withLabel: mem8tib { memory = 8796093022208.B }
withLabel: mem16tib { memory = 17592186044416.B }
withLabel: mem32tib { memory = 35184372088832.B }
withLabel: mem64tib { memory = 70368744177664.B }
withLabel: mem128tib { memory = 140737488355328.B }
withLabel: mem256tib { memory = 281474976710656.B }
withLabel: mem512tib { memory = 562949953421312.B }
withLabel: cpu1 { cpus = 1 }
withLabel: cpu2 { cpus = 2 }
withLabel: cpu5 { cpus = 5 }
withLabel: cpu10 { cpus = 10 }
withLabel: cpu20 { cpus = 20 }
withLabel: cpu50 { cpus = 50 }
withLabel: cpu100 { cpus = 100 }
withLabel: cpu200 { cpus = 200 }
withLabel: cpu500 { cpus = 500 }
withLabel: cpu1000 { cpus = 1000 }
}
includeConfig("nextflow_labels.config")
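
The memory labels above come in two families: `gb`/`tb` labels use powers of ten, while `gib`/`tib` labels use powers of two. The sketch below reproduces the byte counts from the label names. The function `label_to_bytes` is a hypothetical helper for illustration and is not part of the generated configuration.

```python
import re

# Bytes per unit: decimal (SI) and binary (IEC) suffixes.
UNIT_BYTES = {
    "gb": 10**9,
    "tb": 10**12,
    "gib": 2**30,
    "tib": 2**40,
}


def label_to_bytes(label: str) -> int:
    """Convert a label such as 'mem16gib' into a byte count."""
    match = re.fullmatch(r"mem(\d+)(gb|tb|gib|tib)", label)
    if match is None:
        raise ValueError(f"not a memory label: {label!r}")
    amount, unit = match.groups()
    return int(amount) * UNIT_BYTES[unit]
```

For example, `mem1gb` maps to 1000000000 bytes while `mem1gib` maps to 1073741824 bytes, matching the values in the `withLabel` lines.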

@@ -1,6 +1,6 @@
name: "leiden"
namespace: "cluster"
version: "v3.0.0"
version: "v4.0.0"
authors:
- name: "Dries De Maeyer"
roles:
@@ -215,7 +215,7 @@ engines:
id: "docker"
image: "python:3.13-slim"
target_registry: "images.viash-hub.com"
target_tag: "v3.0.0"
target_tag: "v4.0.0"
namespace_separator: "/"
setup:
- type: "apt"
@@ -225,13 +225,15 @@ engines:
- type: "python"
user: false
packages:
- "anndata~=0.11.1"
- "mudata~=0.3.1"
- "scanpy~=1.10.4"
- "anndata~=0.12.7"
- "awkward"
- "mudata~=0.3.2"
- "scanpy~=1.11.4"
- "leidenalg~=0.10.0"
script:
- "exec(\"try:\\n import awkward\\nexcept ModuleNotFoundError:\\n exit(0)\\\
nelse: exit(1)\")"
- "exec(\"try:\\n import zarr; from importlib.metadata import version\\nexcept\
\ ModuleNotFoundError:\\n exit(0)\\nelse: assert int(version(\\\"zarr\\\"\
).partition(\\\".\\\")[0]) > 2\")"
upgrade: true
test_setup:
- type: "apt"
@@ -256,12 +258,11 @@ build_info:
output: "target/nextflow/cluster/leiden"
executable: "target/nextflow/cluster/leiden/main.nf"
viash_version: "0.9.4"
git_commit: "e92e56b49125af8ef2ebb11586191a6cbf9a8457"
git_commit: "de02293c9e13198622b988dac952b2c8c70a1e35"
git_remote: "https://github.com/openpipelines-bio/openpipeline"
git_tag: "0.2.0-2059-ge92e56b4"
package_config:
name: "openpipeline"
version: "v3.0.0"
version: "v4.0.0"
summary: "Best-practice workflows for single-cell multi-omics analyses.\n"
description: "OpenPipelines are extensible single cell analysis pipelines for reproducible\
\ and large-scale single cell processing using [Viash](https://viash.io) and [Nextflow](https://www.nextflow.io/).\n\
@@ -291,7 +292,7 @@ package_config:
)'"
- ".engines += { type: \"native\" }"
- ".engines[.type == 'docker'].target_registry := 'images.viash-hub.com'"
- ".engines[.type == 'docker'].target_tag := 'v3.0.0'"
- ".engines[.type == 'docker'].target_tag := 'v4.0.0'"
keywords:
- "single-cell"
- "multimodal"

@@ -1,4 +1,4 @@
// leiden v3.0.0
// leiden v4.0.0
//
// This wrapper script is auto-generated by viash 0.9.4 and is thus a derivative
// work thereof. This software comes with ABSOLUTELY NO WARRANTY from Data
@@ -3035,7 +3035,7 @@ meta = [
"config": processConfig(readJsonBlob('''{
"name" : "leiden",
"namespace" : "cluster",
"version" : "v3.0.0",
"version" : "v4.0.0",
"authors" : [
{
"name" : "Dries De Maeyer",
@@ -3294,7 +3294,7 @@ meta = [
"id" : "docker",
"image" : "python:3.13-slim",
"target_registry" : "images.viash-hub.com",
"target_tag" : "v3.0.0",
"target_tag" : "v4.0.0",
"namespace_separator" : "/",
"setup" : [
{
@@ -3308,13 +3308,14 @@ meta = [
"type" : "python",
"user" : false,
"packages" : [
"anndata~=0.11.1",
"mudata~=0.3.1",
"scanpy~=1.10.4",
"anndata~=0.12.7",
"awkward",
"mudata~=0.3.2",
"scanpy~=1.11.4",
"leidenalg~=0.10.0"
],
"script" : [
"exec(\\"try:\\\\n import awkward\\\\nexcept ModuleNotFoundError:\\\\n exit(0)\\\\nelse: exit(1)\\")"
"exec(\\"try:\\\\n import zarr; from importlib.metadata import version\\\\nexcept ModuleNotFoundError:\\\\n exit(0)\\\\nelse: assert int(version(\\\\\\"zarr\\\\\\").partition(\\\\\\".\\\\\\")[0]) > 2\\")"
],
"upgrade" : true
}
@@ -3351,13 +3352,12 @@ meta = [
"engine" : "docker|native",
"output" : "/workdir/root/repo/target/nextflow/cluster/leiden",
"viash_version" : "0.9.4",
"git_commit" : "e92e56b49125af8ef2ebb11586191a6cbf9a8457",
"git_remote" : "https://github.com/openpipelines-bio/openpipeline",
"git_tag" : "0.2.0-2059-ge92e56b4"
"git_commit" : "de02293c9e13198622b988dac952b2c8c70a1e35",
"git_remote" : "https://github.com/openpipelines-bio/openpipeline"
},
"package_config" : {
"name" : "openpipeline",
"version" : "v3.0.0",
"version" : "v4.0.0",
"summary" : "Best-practice workflows for single-cell multi-omics analyses.\n",
"description" : "OpenPipelines are extensible single cell analysis pipelines for reproducible and large-scale single cell processing using [Viash](https://viash.io) and [Nextflow](https://www.nextflow.io/).\n\nIn terms of workflows, the following has been made available, but keep in mind that\nindividual tools and functionality can be executed as standalone components as well.\n\n * Demultiplexing: conversion of raw sequencing data to FASTQ objects.\n * Ingestion: Read mapping and generating a count matrix.\n * Single sample processing: cell filtering and doublet detection.\n * Multisample processing: Count transformation, normalization, QC metric calulations.\n * Integration: Clustering, integration and batch correction using single and multimodal methods.\n * Downstream analysis workflows\n",
"info" : {
@@ -3382,7 +3382,7 @@ meta = [
".resources += {path: '/src/workflows/utils/labels.config', dest: 'nextflow_labels.config'}\n.runners[.type == 'nextflow'].config.script := 'includeConfig(\\"nextflow_labels.config\\")'",
".engines += { type: \\"native\\" }",
".engines[.type == 'docker'].target_registry := 'images.viash-hub.com'",
".engines[.type == 'docker'].target_tag := 'v3.0.0'"
".engines[.type == 'docker'].target_tag := 'v4.0.0'"
],
"keywords" : [
"single-cell",
@@ -3417,6 +3417,7 @@ import time
import logging
import logging.handlers
import warnings
import h5py
import mudata as mu
import pandas as pd
import scanpy as sc
@@ -3734,7 +3735,8 @@ def main():
logger.info("Waiting for shutdown of processes")
executor.shutdown()
logger.info("Executor shut down.")
adata.obsm[par["obsm_name"]] = pd.DataFrame(results)
del adata
results = pd.DataFrame(results)
output_file = Path(par["output"])
logger.info("Writing output to %s.", par["output"])
@@ -3744,9 +3746,11 @@ def main():
else output_file
)
shutil.copyfile(par["input"], output_file_uncompressed)
mu.write_h5ad(
filename=output_file_uncompressed, mod=par["modality"], data=adata
)
logger.info("Opening %s", output_file_uncompressed)
with h5py.File(output_file_uncompressed, "a") as storage:
group_path = f"/mod/{par['modality']}/obsm/{par['obsm_name']}"
logger.info("Adding output to %s", group_path)
ad.io.write_elem(storage, k=group_path, elem=results)
if par["output_compression"]:
compress_h5mu(
output_file_uncompressed,
@@ -4143,7 +4147,7 @@ meta["defaults"] = [
"container" : {
"registry" : "images.viash-hub.com",
"image" : "vsh/openpipeline/cluster/leiden",
"tag" : "v3.0.0"
"tag" : "v4.0.0"
},
"label" : [
"highcpu",

@@ -2,7 +2,7 @@ manifest {
name = 'cluster/leiden'
mainScript = 'main.nf'
nextflowVersion = '!>=20.12.1-edge'
version = 'v3.0.0'
version = 'v4.0.0'
description = 'Cluster cells using the [Leiden algorithm] [Traag18] implemented in the [Scanpy framework] [Wolf18]. \nLeiden is an improved version of the [Louvain algorithm] [Blondel08]. \nIt has been proposed for single-cell analysis by [Levine15] [Levine15]. \nThis requires having ran `neighbors/find_neighbors` or `neighbors/bbknn` first.\n\n[Blondel08]: Blondel et al. (2008), Fast unfolding of communities in large networks, J. Stat. Mech. \n[Levine15]: Levine et al. (2015), Data-Driven Phenotypic Dissection of AML Reveals Progenitor-like Cells that Correlate with Prognosis, Cell. \n[Traag18]: Traag et al. (2018), From Louvain to Leiden: guaranteeing well-connected communities arXiv. \n[Wolf18]: Wolf et al. (2018), Scanpy: large-scale single-cell gene expression data analysis, Genome Biology. \n'
author = 'Dries De Maeyer'
}

@@ -1,6 +1,6 @@
name: "concatenate_h5mu"
namespace: "dataflow"
version: "v3.0.0"
version: "v4.0.0"
authors:
- name: "Dries Schaumont"
roles:
@@ -124,6 +124,16 @@ argument_groups:
direction: "input"
multiple: false
multiple_sep: ";"
- type: "string"
name: "--obsp_keys"
description: "List of `.obsp` keys for which block-diagonal concatenation should\
\ be performed.\nIf not provided, no `.obsp` keys will be concatenated.\nProvided\
\ keys must be present in all samples for block concatenation to be performed.\n"
info: null
required: false
direction: "input"
multiple: true
multiple_sep: ";"
- type: "string"
name: "--output_compression"
description: "Compression format to use for the output AnnData and/or Mudata objects.\n\
@@ -241,9 +251,9 @@ runners:
engines:
- type: "docker"
id: "docker"
image: "python:3.11-slim"
image: "python:3.13-slim"
target_registry: "images.viash-hub.com"
target_tag: "v3.0.0"
target_tag: "v4.0.0"
namespace_separator: "/"
setup:
- type: "apt"
@@ -253,12 +263,13 @@ engines:
- type: "python"
user: false
packages:
- "anndata~=0.11.1"
- "mudata~=0.3.1"
- "pandas~=2.1.1"
- "anndata~=0.12.7"
- "awkward"
- "mudata~=0.3.2"
script:
- "exec(\"try:\\n import awkward\\nexcept ModuleNotFoundError:\\n exit(0)\\\
nelse: exit(1)\")"
- "exec(\"try:\\n import zarr; from importlib.metadata import version\\nexcept\
\ ModuleNotFoundError:\\n exit(0)\\nelse: assert int(version(\\\"zarr\\\"\
).partition(\\\".\\\")[0]) > 2\")"
upgrade: true
test_setup:
- type: "apt"
@@ -276,6 +287,7 @@ engines:
user: false
packages:
- "viashpy==0.8.0"
- "pytest-benchmark"
upgrade: true
entrypoint: []
cmd: null
@@ -288,12 +300,11 @@ build_info:
output: "target/nextflow/dataflow/concatenate_h5mu"
executable: "target/nextflow/dataflow/concatenate_h5mu/main.nf"
viash_version: "0.9.4"
git_commit: "e92e56b49125af8ef2ebb11586191a6cbf9a8457"
git_commit: "de02293c9e13198622b988dac952b2c8c70a1e35"
git_remote: "https://github.com/openpipelines-bio/openpipeline"
git_tag: "0.2.0-2059-ge92e56b4"
package_config:
name: "openpipeline"
version: "v3.0.0"
version: "v4.0.0"
summary: "Best-practice workflows for single-cell multi-omics analyses.\n"
description: "OpenPipelines are extensible single cell analysis pipelines for reproducible\
\ and large-scale single cell processing using [Viash](https://viash.io) and [Nextflow](https://www.nextflow.io/).\n\
@@ -323,7 +334,7 @@ package_config:
)'"
- ".engines += { type: \"native\" }"
- ".engines[.type == 'docker'].target_registry := 'images.viash-hub.com'"
- ".engines[.type == 'docker'].target_tag := 'v3.0.0'"
- ".engines[.type == 'docker'].target_tag := 'v4.0.0'"
keywords:
- "single-cell"
- "multimodal"
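
The new `--obsp_keys` argument above refers to block-diagonal concatenation. Since `.obsp` stores observation-by-observation matrices, stacking samples this way would keep each sample's matrix on the diagonal and leave the cross-sample entries at zero. A minimal pure-Python sketch of the idea follows. The function `block_diagonal` is illustrative only. The component's own implementation is not shown in this diff and most likely operates on sparse matrices.

```python
def block_diagonal(blocks):
    """Place square matrices along the diagonal of a larger zero matrix."""
    size = sum(len(block) for block in blocks)
    result = [[0] * size for _ in range(size)]
    offset = 0
    for block in blocks:
        for i, row in enumerate(block):
            if len(row) != len(block):
                raise ValueError("each block must be square")
            for j, value in enumerate(row):
                result[offset + i][offset + j] = value
        # The next block starts where this one ended.
        offset += len(block)
    return result
```

With a 2x2 matrix for one sample and a 1x1 matrix for another, the result is a 3x3 matrix in which the entries linking cells from different samples stay zero.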

@@ -1,4 +1,4 @@
// concatenate_h5mu v3.0.0
// concatenate_h5mu v4.0.0
//
// This wrapper script is auto-generated by viash 0.9.4 and is thus a derivative
// work thereof. This software comes with ABSOLUTELY NO WARRANTY from Data
@@ -3035,7 +3035,7 @@ meta = [
"config": processConfig(readJsonBlob('''{
"name" : "concatenate_h5mu",
"namespace" : "dataflow",
"version" : "v3.0.0",
"version" : "v4.0.0",
"authors" : [
{
"name" : "Dries Schaumont",
@@ -3167,6 +3167,15 @@ meta = [
"multiple" : false,
"multiple_sep" : ";"
},
{
"type" : "string",
"name" : "--obsp_keys",
"description" : "List of `.obsp` keys for which block-diagonal concatenation should be performed.\nIf not provided, no `.obsp` keys will be concatenated.\nProvided keys must be present in all samples for block concatenation to be performed.\n",
"required" : false,
"direction" : "input",
"multiple" : true,
"multiple_sep" : ";"
},
{
"type" : "string",
"name" : "--output_compression",
@@ -3317,9 +3326,9 @@ meta = [
{
"type" : "docker",
"id" : "docker",
"image" : "python:3.11-slim",
"image" : "python:3.13-slim",
"target_registry" : "images.viash-hub.com",
"target_tag" : "v3.0.0",
"target_tag" : "v4.0.0",
"namespace_separator" : "/",
"setup" : [
{
@@ -3333,12 +3342,12 @@ meta = [
"type" : "python",
"user" : false,
"packages" : [
"anndata~=0.11.1",
"mudata~=0.3.1",
"pandas~=2.1.1"
"anndata~=0.12.7",
"awkward",
"mudata~=0.3.2"
],
"script" : [
"exec(\\"try:\\\\n import awkward\\\\nexcept ModuleNotFoundError:\\\\n exit(0)\\\\nelse: exit(1)\\")"
"exec(\\"try:\\\\n import zarr; from importlib.metadata import version\\\\nexcept ModuleNotFoundError:\\\\n exit(0)\\\\nelse: assert int(version(\\\\\\"zarr\\\\\\").partition(\\\\\\".\\\\\\")[0]) > 2\\")"
],
"upgrade" : true
}
@@ -3366,7 +3375,8 @@ meta = [
"type" : "python",
"user" : false,
"packages" : [
"viashpy==0.8.0"
"viashpy==0.8.0",
"pytest-benchmark"
],
"upgrade" : true
}
@@ -3383,13 +3393,12 @@ meta = [
"engine" : "docker|native",
"output" : "/workdir/root/repo/target/nextflow/dataflow/concatenate_h5mu",
"viash_version" : "0.9.4",
"git_commit" : "e92e56b49125af8ef2ebb11586191a6cbf9a8457",
"git_remote" : "https://github.com/openpipelines-bio/openpipeline",
"git_tag" : "0.2.0-2059-ge92e56b4"
"git_commit" : "de02293c9e13198622b988dac952b2c8c70a1e35",
"git_remote" : "https://github.com/openpipelines-bio/openpipeline"
},
"package_config" : {
"name" : "openpipeline",
"version" : "v3.0.0",
"version" : "v4.0.0",
"summary" : "Best-practice workflows for single-cell multi-omics analyses.\n",
"description" : "OpenPipelines are extensible single cell analysis pipelines for reproducible and large-scale single cell processing using [Viash](https://viash.io) and [Nextflow](https://www.nextflow.io/).\n\nIn terms of workflows, the following has been made available, but keep in mind that\nindividual tools and functionality can be executed as standalone components as well.\n\n * Demultiplexing: conversion of raw sequencing data to FASTQ objects.\n * Ingestion: Read mapping and generating a count matrix.\n * Single sample processing: cell filtering and doublet detection.\n * Multisample processing: Count transformation, normalization, QC metric calulations.\n * Integration: Clustering, integration and batch correction using single and multimodal methods.\n * Downstream analysis workflows\n",
"info" : {
@@ -3414,7 +3423,7 @@ meta = [
".resources += {path: '/src/workflows/utils/labels.config', dest: 'nextflow_labels.config'}\n.runners[.type == 'nextflow'].config.script := 'includeConfig(\\"nextflow_labels.config\\")'",
".engines += { type: \\"native\\" }",
".engines[.type == 'docker'].target_registry := 'images.viash-hub.com'",
".engines[.type == 'docker'].target_tag := 'v3.0.0'"
".engines[.type == 'docker'].target_tag := 'v4.0.0'"
],
"keywords" : [
"single-cell",
@@ -3465,6 +3474,7 @@ par = {
'obs_sample_name': $( if [ ! -z ${VIASH_PAR_OBS_SAMPLE_NAME+x} ]; then echo "r'${VIASH_PAR_OBS_SAMPLE_NAME//\\'/\\'\\"\\'\\"r\\'}'"; else echo None; fi ),
'other_axis_mode': $( if [ ! -z ${VIASH_PAR_OTHER_AXIS_MODE+x} ]; then echo "r'${VIASH_PAR_OTHER_AXIS_MODE//\\'/\\'\\"\\'\\"r\\'}'"; else echo None; fi ),
'uns_merge_mode': $( if [ ! -z ${VIASH_PAR_UNS_MERGE_MODE+x} ]; then echo "r'${VIASH_PAR_UNS_MERGE_MODE//\\'/\\'\\"\\'\\"r\\'}'"; else echo None; fi ),
'obsp_keys': $( if [ ! -z ${VIASH_PAR_OBSP_KEYS+x} ]; then echo "r'${VIASH_PAR_OBSP_KEYS//\\'/\\'\\"\\'\\"r\\'}'.split(';')"; else echo None; fi ),
'output_compression': $( if [ ! -z ${VIASH_PAR_OUTPUT_COMPRESSION+x} ]; then echo "r'${VIASH_PAR_OUTPUT_COMPRESSION//\\'/\\'\\"\\'\\"r\\'}'"; else echo None; fi )
}
meta = {
@@ -3709,8 +3719,19 @@ def concatenate_modality(
if mod is not None:
try:
data = mu.read_h5ad(input_file, mod=mod)
# Remove obsp keys that are not in par["obsp_keys"]
obsp_keys_to_keep = par.get("obsp_keys") or []
obsp_keys_to_remove = set(data.obsp.keys()) - set(obsp_keys_to_keep)
for key in obsp_keys_to_remove:
try:
del data.obsp[key]
except KeyError:
pass
mod_data[input_id] = data
mod_indices_combined = mod_indices_combined.append(data.obs.index)
except KeyError as e: # Modality does not exist for this sample, skip it
if (
f"Unable to synchronously open object (object '{mod}' doesn't exist)"
@@ -3734,6 +3755,7 @@ def concatenate_modality(
concatenated_data = anndata.concat(
mod_data.values(),
join="outer",
pairwise=True if par["obsp_keys"] else False,
merge=other_axis_mode_to_apply,
uns_merge=uns_merge_mode_to_apply,
)
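Reviewer note: the two `concatenate_modality` hunks above work together. Each sample's `.obsp` is first pruned down to the keys listed in `par["obsp_keys"]`, and `anndata.concat` is then called with `pairwise` enabled only when at least one key was requested. A minimal sketch of that behaviour follows, using a plain dict in place of `.obsp`. The helper `prune_obsp` and the example key names are hypothetical and exist only for illustration.

```python
def prune_obsp(obsp, keys_to_keep):
    """Drop every entry of `obsp` not listed in `keys_to_keep`.

    Returns the flag that would be passed as `pairwise` to the concatenation:
    True when at least one key was requested, False otherwise.
    """
    keep = set(keys_to_keep or [])  # None or an empty list means "keep nothing"
    for key in set(obsp.keys()) - keep:
        del obsp[key]
    return bool(keep)


# Stand-in for one sample's pairwise annotations (values would be sparse matrices).
obsp = {"connectivities": "A", "distances": "B", "spatial_connectivities": "C"}
pairwise = prune_obsp(obsp, ["spatial_connectivities"])
```

Pruning before concatenation means that only the requested pairwise matrices are carried into the combined object, and that nothing pairwise is carried over when the argument is left unset.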
@@ -4244,7 +4266,7 @@ meta["defaults"] = [
"container" : {
"registry" : "images.viash-hub.com",
"image" : "vsh/openpipeline/dataflow/concatenate_h5mu",
"tag" : "v3.0.0"
"tag" : "v4.0.0"
},
"label" : [
"midcpu",

@@ -2,7 +2,7 @@ manifest {
name = 'dataflow/concatenate_h5mu'
mainScript = 'main.nf'
nextflowVersion = '!>=20.12.1-edge'
version = 'v3.0.0'
version = 'v4.0.0'
description = 'Concatenate observations from samples in several (uni- and/or multi-modal) MuData files into a single file.\n'
author = 'Dries Schaumont'
}

@@ -1,6 +1,6 @@
name: "merge"
namespace: "dataflow"
version: "v3.0.0"
version: "v4.0.0"
authors:
- name: "Dries Schaumont"
roles:
@@ -163,7 +163,7 @@ engines:
id: "docker"
image: "python:3.12-slim"
target_registry: "images.viash-hub.com"
target_tag: "v3.0.0"
target_tag: "v4.0.0"
namespace_separator: "/"
setup:
- type: "apt"
@@ -173,11 +173,13 @@ engines:
- type: "python"
user: false
packages:
- "anndata~=0.11.1"
- "mudata~=0.3.1"
- "anndata~=0.12.7"
- "awkward"
- "mudata~=0.3.2"
script:
- "exec(\"try:\\n import awkward\\nexcept ModuleNotFoundError:\\n exit(0)\\\
nelse: exit(1)\")"
- "exec(\"try:\\n import zarr; from importlib.metadata import version\\nexcept\
\ ModuleNotFoundError:\\n exit(0)\\nelse: assert int(version(\\\"zarr\\\"\
).partition(\\\".\\\")[0]) > 2\")"
upgrade: true
test_setup:
- type: "apt"
@@ -202,12 +204,11 @@ build_info:
output: "target/nextflow/dataflow/merge"
executable: "target/nextflow/dataflow/merge/main.nf"
viash_version: "0.9.4"
git_commit: "e92e56b49125af8ef2ebb11586191a6cbf9a8457"
git_commit: "de02293c9e13198622b988dac952b2c8c70a1e35"
git_remote: "https://github.com/openpipelines-bio/openpipeline"
git_tag: "0.2.0-2059-ge92e56b4"
package_config:
name: "openpipeline"
version: "v3.0.0"
version: "v4.0.0"
summary: "Best-practice workflows for single-cell multi-omics analyses.\n"
description: "OpenPipelines are extensible single cell analysis pipelines for reproducible\
\ and large-scale single cell processing using [Viash](https://viash.io) and [Nextflow](https://www.nextflow.io/).\n\
@@ -237,7 +238,7 @@ package_config:
)'"
- ".engines += { type: \"native\" }"
- ".engines[.type == 'docker'].target_registry := 'images.viash-hub.com'"
- ".engines[.type == 'docker'].target_tag := 'v3.0.0'"
- ".engines[.type == 'docker'].target_tag := 'v4.0.0'"
keywords:
- "single-cell"
- "multimodal"

@@ -1,4 +1,4 @@
// merge v3.0.0
// merge v4.0.0
//
// This wrapper script is auto-generated by viash 0.9.4 and is thus a derivative
// work thereof. This software comes with ABSOLUTELY NO WARRANTY from Data
@@ -3035,7 +3035,7 @@ meta = [
"config": processConfig(readJsonBlob('''{
"name" : "merge",
"namespace" : "dataflow",
"version" : "v3.0.0",
"version" : "v4.0.0",
"authors" : [
{
"name" : "Dries Schaumont",
@@ -3246,7 +3246,7 @@ meta = [
"id" : "docker",
"image" : "python:3.12-slim",
"target_registry" : "images.viash-hub.com",
"target_tag" : "v3.0.0",
"target_tag" : "v4.0.0",
"namespace_separator" : "/",
"setup" : [
{
@@ -3260,11 +3260,12 @@ meta = [
"type" : "python",
"user" : false,
"packages" : [
"anndata~=0.11.1",
"mudata~=0.3.1"
"anndata~=0.12.7",
"awkward",
"mudata~=0.3.2"
],
"script" : [
"exec(\\"try:\\\\n import awkward\\\\nexcept ModuleNotFoundError:\\\\n exit(0)\\\\nelse: exit(1)\\")"
"exec(\\"try:\\\\n import zarr; from importlib.metadata import version\\\\nexcept ModuleNotFoundError:\\\\n exit(0)\\\\nelse: assert int(version(\\\\\\"zarr\\\\\\").partition(\\\\\\".\\\\\\")[0]) > 2\\")"
],
"upgrade" : true
}
@@ -3301,13 +3302,12 @@ meta = [
"engine" : "docker|native",
"output" : "/workdir/root/repo/target/nextflow/dataflow/merge",
"viash_version" : "0.9.4",
"git_commit" : "e92e56b49125af8ef2ebb11586191a6cbf9a8457",
"git_remote" : "https://github.com/openpipelines-bio/openpipeline",
"git_tag" : "0.2.0-2059-ge92e56b4"
"git_commit" : "de02293c9e13198622b988dac952b2c8c70a1e35",
"git_remote" : "https://github.com/openpipelines-bio/openpipeline"
},
"package_config" : {
"name" : "openpipeline",
"version" : "v3.0.0",
"version" : "v4.0.0",
"summary" : "Best-practice workflows for single-cell multi-omics analyses.\n",
"description" : "OpenPipelines are extensible single cell analysis pipelines for reproducible and large-scale single cell processing using [Viash](https://viash.io) and [Nextflow](https://www.nextflow.io/).\n\nIn terms of workflows, the following has been made available, but keep in mind that\nindividual tools and functionality can be executed as standalone components as well.\n\n * Demultiplexing: conversion of raw sequencing data to FASTQ objects.\n * Ingestion: Read mapping and generating a count matrix.\n * Single sample processing: cell filtering and doublet detection.\n * Multisample processing: Count transformation, normalization, QC metric calulations.\n * Integration: Clustering, integration and batch correction using single and multimodal methods.\n * Downstream analysis workflows\n",
"info" : {
@@ -3332,7 +3332,7 @@ meta = [
".resources += {path: '/src/workflows/utils/labels.config', dest: 'nextflow_labels.config'}\n.runners[.type == 'nextflow'].config.script := 'includeConfig(\\"nextflow_labels.config\\")'",
".engines += { type: \\"native\\" }",
".engines[.type == 'docker'].target_registry := 'images.viash-hub.com'",
".engines[.type == 'docker'].target_tag := 'v3.0.0'"
".engines[.type == 'docker'].target_tag := 'v4.0.0'"
],
"keywords" : [
"single-cell",
@@ -3847,7 +3847,7 @@ meta["defaults"] = [
"container" : {
"registry" : "images.viash-hub.com",
"image" : "vsh/openpipeline/dataflow/merge",
"tag" : "v3.0.0"
"tag" : "v4.0.0"
},
"label" : [
"singlecpu",

@@ -2,7 +2,7 @@ manifest {
name = 'dataflow/merge'
mainScript = 'main.nf'
nextflowVersion = '!>=20.12.1-edge'
version = 'v3.0.0'
version = 'v4.0.0'
description = 'Combine one or more single-modality .h5mu files together into one .h5mu file.\n'
author = 'Dries Schaumont'
}

@@ -1,6 +1,6 @@
name: "split_h5mu"
namespace: "dataflow"
version: "v3.0.0"
version: "v4.0.0"
authors:
- name: "Dorien Roosen"
roles:
@@ -203,7 +203,7 @@ engines:
id: "docker"
image: "python:3.12-slim"
target_registry: "images.viash-hub.com"
target_tag: "v3.0.0"
target_tag: "v4.0.0"
namespace_separator: "/"
setup:
- type: "apt"
@@ -213,11 +213,13 @@ engines:
- type: "python"
user: false
packages:
- "anndata~=0.11.1"
- "mudata~=0.3.1"
- "anndata~=0.12.7"
- "awkward"
- "mudata~=0.3.2"
script:
- "exec(\"try:\\n import awkward\\nexcept ModuleNotFoundError:\\n exit(0)\\\
nelse: exit(1)\")"
- "exec(\"try:\\n import zarr; from importlib.metadata import version\\nexcept\
\ ModuleNotFoundError:\\n exit(0)\\nelse: assert int(version(\\\"zarr\\\"\
).partition(\\\".\\\")[0]) > 2\")"
upgrade: true
test_setup:
- type: "apt"
@@ -242,12 +244,11 @@ build_info:
output: "target/nextflow/dataflow/split_h5mu"
executable: "target/nextflow/dataflow/split_h5mu/main.nf"
viash_version: "0.9.4"
git_commit: "e92e56b49125af8ef2ebb11586191a6cbf9a8457"
git_commit: "de02293c9e13198622b988dac952b2c8c70a1e35"
git_remote: "https://github.com/openpipelines-bio/openpipeline"
git_tag: "0.2.0-2059-ge92e56b4"
package_config:
name: "openpipeline"
version: "v3.0.0"
version: "v4.0.0"
summary: "Best-practice workflows for single-cell multi-omics analyses.\n"
description: "OpenPipelines are extensible single cell analysis pipelines for reproducible\
\ and large-scale single cell processing using [Viash](https://viash.io) and [Nextflow](https://www.nextflow.io/).\n\
@@ -277,7 +278,7 @@ package_config:
)'"
- ".engines += { type: \"native\" }"
- ".engines[.type == 'docker'].target_registry := 'images.viash-hub.com'"
- ".engines[.type == 'docker'].target_tag := 'v3.0.0'"
- ".engines[.type == 'docker'].target_tag := 'v4.0.0'"
keywords:
- "single-cell"
- "multimodal"

@@ -1,4 +1,4 @@
// split_h5mu v3.0.0
// split_h5mu v4.0.0
//
// This wrapper script is auto-generated by viash 0.9.4 and is thus a derivative
// work thereof. This software comes with ABSOLUTELY NO WARRANTY from Data
@@ -3035,7 +3035,7 @@ meta = [
"config": processConfig(readJsonBlob('''{
"name" : "split_h5mu",
"namespace" : "dataflow",
"version" : "v3.0.0",
"version" : "v4.0.0",
"authors" : [
{
"name" : "Dorien Roosen",
@@ -3285,7 +3285,7 @@ meta = [
"id" : "docker",
"image" : "python:3.12-slim",
"target_registry" : "images.viash-hub.com",
"target_tag" : "v3.0.0",
"target_tag" : "v4.0.0",
"namespace_separator" : "/",
"setup" : [
{
@@ -3299,11 +3299,12 @@ meta = [
"type" : "python",
"user" : false,
"packages" : [
"anndata~=0.11.1",
"mudata~=0.3.1"
"anndata~=0.12.7",
"awkward",
"mudata~=0.3.2"
],
"script" : [
"exec(\\"try:\\\\n import awkward\\\\nexcept ModuleNotFoundError:\\\\n exit(0)\\\\nelse: exit(1)\\")"
"exec(\\"try:\\\\n import zarr; from importlib.metadata import version\\\\nexcept ModuleNotFoundError:\\\\n exit(0)\\\\nelse: assert int(version(\\\\\\"zarr\\\\\\").partition(\\\\\\".\\\\\\")[0]) > 2\\")"
],
"upgrade" : true
}
@@ -3340,13 +3341,12 @@ meta = [
"engine" : "docker|native",
"output" : "/workdir/root/repo/target/nextflow/dataflow/split_h5mu",
"viash_version" : "0.9.4",
"git_commit" : "e92e56b49125af8ef2ebb11586191a6cbf9a8457",
"git_remote" : "https://github.com/openpipelines-bio/openpipeline",
"git_tag" : "0.2.0-2059-ge92e56b4"
"git_commit" : "de02293c9e13198622b988dac952b2c8c70a1e35",
"git_remote" : "https://github.com/openpipelines-bio/openpipeline"
},
"package_config" : {
"name" : "openpipeline",
"version" : "v3.0.0",
"version" : "v4.0.0",
"summary" : "Best-practice workflows for single-cell multi-omics analyses.\n",
"description" : "OpenPipelines are extensible single cell analysis pipelines for reproducible and large-scale single cell processing using [Viash](https://viash.io) and [Nextflow](https://www.nextflow.io/).\n\nIn terms of workflows, the following has been made available, but keep in mind that\nindividual tools and functionality can be executed as standalone components as well.\n\n * Demultiplexing: conversion of raw sequencing data to FASTQ objects.\n * Ingestion: Read mapping and generating a count matrix.\n * Single sample processing: cell filtering and doublet detection.\n * Multisample processing: Count transformation, normalization, QC metric calulations.\n * Integration: Clustering, integration and batch correction using single and multimodal methods.\n * Downstream analysis workflows\n",
"info" : {
@@ -3371,7 +3371,7 @@ meta = [
".resources += {path: '/src/workflows/utils/labels.config', dest: 'nextflow_labels.config'}\n.runners[.type == 'nextflow'].config.script := 'includeConfig(\\"nextflow_labels.config\\")'",
".engines += { type: \\"native\\" }",
".engines[.type == 'docker'].target_registry := 'images.viash-hub.com'",
".engines[.type == 'docker'].target_tag := 'v3.0.0'"
".engines[.type == 'docker'].target_tag := 'v4.0.0'"
],
"keywords" : [
"single-cell",
@@ -3908,7 +3908,7 @@ meta["defaults"] = [
"container" : {
"registry" : "images.viash-hub.com",
"image" : "vsh/openpipeline/dataflow/split_h5mu",
"tag" : "v3.0.0"
"tag" : "v4.0.0"
},
"label" : [
"lowcpu",

@@ -2,7 +2,7 @@ manifest {
name = 'dataflow/split_h5mu'
mainScript = 'main.nf'
nextflowVersion = '!>=20.12.1-edge'
version = 'v3.0.0'
version = 'v4.0.0'
description = 'Split the samples of a single modality from a .h5mu (multimodal) sample into seperate .h5mu files based on the values of an .obs column of this modality. \n'
author = 'Dorien Roosen'
}

@@ -1,6 +1,6 @@
name: "split_modalities"
namespace: "dataflow"
version: "v3.0.0"
version: "v4.0.0"
authors:
- name: "Dries Schaumont"
roles:
@@ -190,7 +190,7 @@ engines:
id: "docker"
image: "python:3.12-slim"
target_registry: "images.viash-hub.com"
target_tag: "v3.0.0"
target_tag: "v4.0.0"
namespace_separator: "/"
setup:
- type: "apt"
@@ -200,11 +200,13 @@ engines:
- type: "python"
user: false
packages:
- "anndata~=0.11.1"
- "mudata~=0.3.1"
- "anndata~=0.12.7"
- "awkward"
- "mudata~=0.3.2"
script:
- "exec(\"try:\\n import awkward\\nexcept ModuleNotFoundError:\\n exit(0)\\\
nelse: exit(1)\")"
- "exec(\"try:\\n import zarr; from importlib.metadata import version\\nexcept\
\ ModuleNotFoundError:\\n exit(0)\\nelse: assert int(version(\\\"zarr\\\"\
).partition(\\\".\\\")[0]) > 2\")"
upgrade: true
test_setup:
- type: "apt"
@@ -229,12 +231,11 @@ build_info:
output: "target/nextflow/dataflow/split_modalities"
executable: "target/nextflow/dataflow/split_modalities/main.nf"
viash_version: "0.9.4"
git_commit: "e92e56b49125af8ef2ebb11586191a6cbf9a8457"
git_commit: "de02293c9e13198622b988dac952b2c8c70a1e35"
git_remote: "https://github.com/openpipelines-bio/openpipeline"
git_tag: "0.2.0-2059-ge92e56b4"
package_config:
name: "openpipeline"
version: "v3.0.0"
version: "v4.0.0"
summary: "Best-practice workflows for single-cell multi-omics analyses.\n"
description: "OpenPipelines are extensible single cell analysis pipelines for reproducible\
\ and large-scale single cell processing using [Viash](https://viash.io) and [Nextflow](https://www.nextflow.io/).\n\
@@ -264,7 +265,7 @@ package_config:
)'"
- ".engines += { type: \"native\" }"
- ".engines[.type == 'docker'].target_registry := 'images.viash-hub.com'"
- ".engines[.type == 'docker'].target_tag := 'v3.0.0'"
- ".engines[.type == 'docker'].target_tag := 'v4.0.0'"
keywords:
- "single-cell"
- "multimodal"

@@ -1,4 +1,4 @@
// split_modalities v3.0.0
// split_modalities v4.0.0
//
// This wrapper script is auto-generated by viash 0.9.4 and is thus a derivative
// work thereof. This software comes with ABSOLUTELY NO WARRANTY from Data
@@ -3036,7 +3036,7 @@ meta = [
"config": processConfig(readJsonBlob('''{
"name" : "split_modalities",
"namespace" : "dataflow",
"version" : "v3.0.0",
"version" : "v4.0.0",
"authors" : [
{
"name" : "Dries Schaumont",
@@ -3280,7 +3280,7 @@ meta = [
"id" : "docker",
"image" : "python:3.12-slim",
"target_registry" : "images.viash-hub.com",
"target_tag" : "v3.0.0",
"target_tag" : "v4.0.0",
"namespace_separator" : "/",
"setup" : [
{
@@ -3294,11 +3294,12 @@ meta = [
"type" : "python",
"user" : false,
"packages" : [
"anndata~=0.11.1",
"mudata~=0.3.1"
"anndata~=0.12.7",
"awkward",
"mudata~=0.3.2"
],
"script" : [
"exec(\\"try:\\\\n import awkward\\\\nexcept ModuleNotFoundError:\\\\n exit(0)\\\\nelse: exit(1)\\")"
"exec(\\"try:\\\\n import zarr; from importlib.metadata import version\\\\nexcept ModuleNotFoundError:\\\\n exit(0)\\\\nelse: assert int(version(\\\\\\"zarr\\\\\\").partition(\\\\\\".\\\\\\")[0]) > 2\\")"
],
"upgrade" : true
}
@@ -3335,13 +3336,12 @@ meta = [
"engine" : "docker|native",
"output" : "/workdir/root/repo/target/nextflow/dataflow/split_modalities",
"viash_version" : "0.9.4",
"git_commit" : "e92e56b49125af8ef2ebb11586191a6cbf9a8457",
"git_remote" : "https://github.com/openpipelines-bio/openpipeline",
"git_tag" : "0.2.0-2059-ge92e56b4"
"git_commit" : "de02293c9e13198622b988dac952b2c8c70a1e35",
"git_remote" : "https://github.com/openpipelines-bio/openpipeline"
},
"package_config" : {
"name" : "openpipeline",
"version" : "v3.0.0",
"version" : "v4.0.0",
"summary" : "Best-practice workflows for single-cell multi-omics analyses.\n",
"description" : "OpenPipelines are extensible single cell analysis pipelines for reproducible and large-scale single cell processing using [Viash](https://viash.io) and [Nextflow](https://www.nextflow.io/).\n\nIn terms of workflows, the following has been made available, but keep in mind that\nindividual tools and functionality can be executed as standalone components as well.\n\n * Demultiplexing: conversion of raw sequencing data to FASTQ objects.\n * Ingestion: Read mapping and generating a count matrix.\n * Single sample processing: cell filtering and doublet detection.\n * Multisample processing: Count transformation, normalization, QC metric calulations.\n * Integration: Clustering, integration and batch correction using single and multimodal methods.\n * Downstream analysis workflows\n",
"info" : {
@@ -3366,7 +3366,7 @@ meta = [
".resources += {path: '/src/workflows/utils/labels.config', dest: 'nextflow_labels.config'}\n.runners[.type == 'nextflow'].config.script := 'includeConfig(\\"nextflow_labels.config\\")'",
".engines += { type: \\"native\\" }",
".engines[.type == 'docker'].target_registry := 'images.viash-hub.com'",
".engines[.type == 'docker'].target_tag := 'v3.0.0'"
".engines[.type == 'docker'].target_tag := 'v4.0.0'"
],
"keywords" : [
"single-cell",
@@ -3865,7 +3865,7 @@ meta["defaults"] = [
"container" : {
"registry" : "images.viash-hub.com",
"image" : "vsh/openpipeline/dataflow/split_modalities",
"tag" : "v3.0.0"
"tag" : "v4.0.0"
},
"label" : [
"singlecpu",

@@ -2,7 +2,7 @@ manifest {
name = 'dataflow/split_modalities'
mainScript = 'main.nf'
nextflowVersion = '!>=20.12.1-edge'
version = 'v3.0.0'
version = 'v4.0.0'
description = 'Split the modalities from a single .h5mu multimodal sample into seperate .h5mu files. \n'
author = 'Dries Schaumont, Robrecht Cannoodt'
}

@@ -1,6 +1,6 @@
name: "pca"
namespace: "dimred"
version: "v3.0.0"
version: "v4.0.0"
authors:
- name: "Dries De Maeyer"
roles:
@@ -240,7 +240,7 @@ engines:
id: "docker"
image: "python:3.12-slim"
target_registry: "images.viash-hub.com"
target_tag: "v3.0.0"
target_tag: "v4.0.0"
namespace_separator: "/"
setup:
- type: "apt"
@@ -250,12 +250,14 @@ engines:
- type: "python"
user: false
packages:
- "anndata~=0.11.1"
- "mudata~=0.3.1"
- "scanpy~=1.10.4"
- "anndata~=0.12.7"
- "awkward"
- "mudata~=0.3.2"
- "scanpy~=1.11.4"
script:
- "exec(\"try:\\n import awkward\\nexcept ModuleNotFoundError:\\n exit(0)\\\
nelse: exit(1)\")"
- "exec(\"try:\\n import zarr; from importlib.metadata import version\\nexcept\
\ ModuleNotFoundError:\\n exit(0)\\nelse: assert int(version(\\\"zarr\\\"\
).partition(\\\".\\\")[0]) > 2\")"
upgrade: true
test_setup:
- type: "python"
@@ -274,12 +276,11 @@ build_info:
output: "target/nextflow/dimred/pca"
executable: "target/nextflow/dimred/pca/main.nf"
viash_version: "0.9.4"
git_commit: "e92e56b49125af8ef2ebb11586191a6cbf9a8457"
git_commit: "de02293c9e13198622b988dac952b2c8c70a1e35"
git_remote: "https://github.com/openpipelines-bio/openpipeline"
git_tag: "0.2.0-2059-ge92e56b4"
package_config:
name: "openpipeline"
version: "v3.0.0"
version: "v4.0.0"
summary: "Best-practice workflows for single-cell multi-omics analyses.\n"
description: "OpenPipelines are extensible single cell analysis pipelines for reproducible\
\ and large-scale single cell processing using [Viash](https://viash.io) and [Nextflow](https://www.nextflow.io/).\n\
@@ -309,7 +310,7 @@ package_config:
)'"
- ".engines += { type: \"native\" }"
- ".engines[.type == 'docker'].target_registry := 'images.viash-hub.com'"
- ".engines[.type == 'docker'].target_tag := 'v3.0.0'"
- ".engines[.type == 'docker'].target_tag := 'v4.0.0'"
keywords:
- "single-cell"
- "multimodal"

@@ -1,4 +1,4 @@
// pca v3.0.0
// pca v4.0.0
//
// This wrapper script is auto-generated by viash 0.9.4 and is thus a derivative
// work thereof. This software comes with ABSOLUTELY NO WARRANTY from Data
@@ -3035,7 +3035,7 @@ meta = [
"config": processConfig(readJsonBlob('''{
"name" : "pca",
"namespace" : "dimred",
"version" : "v3.0.0",
"version" : "v4.0.0",
"authors" : [
{
"name" : "Dries De Maeyer",
@@ -3333,7 +3333,7 @@ meta = [
"id" : "docker",
"image" : "python:3.12-slim",
"target_registry" : "images.viash-hub.com",
"target_tag" : "v3.0.0",
"target_tag" : "v4.0.0",
"namespace_separator" : "/",
"setup" : [
{
@@ -3347,12 +3347,13 @@ meta = [
"type" : "python",
"user" : false,
"packages" : [
"anndata~=0.11.1",
"mudata~=0.3.1",
"scanpy~=1.10.4"
"anndata~=0.12.7",
"awkward",
"mudata~=0.3.2",
"scanpy~=1.11.4"
],
"script" : [
"exec(\\"try:\\\\n import awkward\\\\nexcept ModuleNotFoundError:\\\\n exit(0)\\\\nelse: exit(1)\\")"
"exec(\\"try:\\\\n import zarr; from importlib.metadata import version\\\\nexcept ModuleNotFoundError:\\\\n exit(0)\\\\nelse: assert int(version(\\\\\\"zarr\\\\\\").partition(\\\\\\".\\\\\\")[0]) > 2\\")"
],
"upgrade" : true
}
@@ -3379,13 +3380,12 @@ meta = [
"engine" : "docker|native",
"output" : "/workdir/root/repo/target/nextflow/dimred/pca",
"viash_version" : "0.9.4",
"git_commit" : "e92e56b49125af8ef2ebb11586191a6cbf9a8457",
"git_remote" : "https://github.com/openpipelines-bio/openpipeline",
"git_tag" : "0.2.0-2059-ge92e56b4"
"git_commit" : "de02293c9e13198622b988dac952b2c8c70a1e35",
"git_remote" : "https://github.com/openpipelines-bio/openpipeline"
},
"package_config" : {
"name" : "openpipeline",
"version" : "v3.0.0",
"version" : "v4.0.0",
"summary" : "Best-practice workflows for single-cell multi-omics analyses.\n",
"description" : "OpenPipelines are extensible single cell analysis pipelines for reproducible and large-scale single cell processing using [Viash](https://viash.io) and [Nextflow](https://www.nextflow.io/).\n\nIn terms of workflows, the following has been made available, but keep in mind that\nindividual tools and functionality can be executed as standalone components as well.\n\n * Demultiplexing: conversion of raw sequencing data to FASTQ objects.\n * Ingestion: Read mapping and generating a count matrix.\n * Single sample processing: cell filtering and doublet detection.\n * Multisample processing: Count transformation, normalization, QC metric calulations.\n * Integration: Clustering, integration and batch correction using single and multimodal methods.\n * Downstream analysis workflows\n",
"info" : {
@@ -3410,7 +3410,7 @@ meta = [
".resources += {path: '/src/workflows/utils/labels.config', dest: 'nextflow_labels.config'}\n.runners[.type == 'nextflow'].config.script := 'includeConfig(\\"nextflow_labels.config\\")'",
".engines += { type: \\"native\\" }",
".engines[.type == 'docker'].target_registry := 'images.viash-hub.com'",
".engines[.type == 'docker'].target_tag := 'v3.0.0'"
".engines[.type == 'docker'].target_tag := 'v4.0.0'"
],
"keywords" : [
"single-cell",
@@ -3933,7 +3933,7 @@ meta["defaults"] = [
"container" : {
"registry" : "images.viash-hub.com",
"image" : "vsh/openpipeline/dimred/pca",
"tag" : "v3.0.0"
"tag" : "v4.0.0"
},
"label" : [
"highcpu",

@@ -2,7 +2,7 @@ manifest {
name = 'dimred/pca'
mainScript = 'main.nf'
nextflowVersion = '!>=20.12.1-edge'
version = 'v3.0.0'
version = 'v4.0.0'
description = 'Computes PCA coordinates, loadings and variance decomposition. Uses the implementation of scikit-learn [Pedregosa11].\n'
author = 'Dries De Maeyer'
}

Some files were not shown because too many files have changed in this diff.