vsh/biobox

Files

CI 04a5851ff8 Build branch biobox/main with version main to biobox on branch main (7158daa)

Build pipeline: viash-hub.biobox.main-tb4cv

Source commit: 7158daa5f6

Source message: Fix bases2fastq component, update to latest practices (#190)

* wip updates

* refactor component

* assume bases2fastq follows semver

* fix version command

* add entry to changelog

* move to minor changes

2025-09-01 11:04:56 +00:00

15 KiB

Raw Blame History

Testing Guide

This guide covers best practices for writing comprehensive test scripts for biobox components.

📌 Important: All new test scripts should use the centralized test helpers located at src/_utils/test_helpers.sh. This eliminates code duplication and ensures consistency across all components.

Core Principles
Test Script Structure
Centralized Test Helpers
Test Scenarios
Best Practices
Viash Testing Features
Static Test Data

Core Principles

1. Generate Test Data in Scripts

Preferred approach: Generate test data within the test script using the centralized helper functions.

# Generate test data using centralized helpers
create_test_fasta "$meta_temp_dir/input.fasta" 3 50
create_test_fastq "$meta_temp_dir/reads.fastq" 10 35

Avoid:

Storing static test files in the repository
Fetching test data from external sources
Large test datasets

2. Self-Contained Tests

Tests should be completely self-contained and not depend on external resources:

test_resources:
  - type: bash_script
    path: test.sh
  - type: file
    path: /src/_utils/test_helpers.sh

Only add static test files if absolutely necessary:

test_resources:
  - type: bash_script
    path: test.sh
  - type: file
    path: /src/_utils/test_helpers.sh
  - type: file
    path: test_data  # Only if data generation is impractical

Test Script Structure

Configuration Setup

Add the test helpers as a resource in your component configuration:

test_resources:
  - type: bash_script
    path: test.sh
  - type: file
    path: /src/_utils/test_helpers.sh

Basic Test Template

#!/bin/bash

## VIASH START
## VIASH END

# Source the centralized test helpers
source "$meta_resources_dir/test_helpers.sh"

# Initialize test environment with strict error handling
setup_test_env

#############################################
# Test execution with centralized functions
#############################################

log "Starting tests for $meta_name"

# --- Test Case 1: Basic functionality ---
log "Starting TEST 1: Basic functionality"

# Create and validate test data
test_data_dir="$meta_temp_dir/test_data"
mkdir -p "$test_data_dir"
create_test_fasta "$test_data_dir/input.fasta" 3 50
check_file_exists "$test_data_dir/input.fasta" "input FASTA file"

log "Executing $meta_name with basic parameters..."
"$meta_executable" \
  --input "$test_data_dir/input.fasta" \
  --output "$meta_temp_dir/test1"

log "Validating TEST 1 outputs..."
check_dir_exists "$meta_temp_dir/test1" "output directory"
check_file_exists "$meta_temp_dir/test1/result.txt" "result file"
check_file_not_empty "$meta_temp_dir/test1/result.txt" "result file"

log "✅ TEST 1 completed successfully"

# --- Test Case 2: Advanced parameters ---
log "Starting TEST 2: Advanced parameters"

# Create different test data
create_test_fastq "$test_data_dir/input.fastq" 10 35
check_file_exists "$test_data_dir/input.fastq" "input FASTQ file"

log "Executing $meta_name with advanced parameters..."
"$meta_executable" \
  --input "$test_data_dir/input.fastq" \
  --output "$meta_temp_dir/test2" \
  --threads 2 \
  --verbose

log "Validating TEST 2 outputs..."
check_file_exists "$meta_temp_dir/test2/advanced_result.txt" "advanced result file"
check_file_contains "$meta_temp_dir/test2/advanced_result.txt" "expected_pattern" "advanced result file"

log "✅ TEST 2 completed successfully"

print_test_summary "All tests completed successfully"

Centralized Test Helpers

The centralized test helpers located at src/_utils/test_helpers.sh provide comprehensive testing functionality to ensure consistency across all biobox components.

Available Functions

Logging Functions

log "message" - Log with timestamp
log_warn "message" - Warning message
log_error "message" - Error message

File/Directory Validation

check_file_exists path "description" - Verify file exists
check_dir_exists path "description" - Verify directory exists
check_file_not_exists path "description" - Verify file doesn't exist
check_dir_not_exists path "description" - Verify directory doesn't exist
check_file_empty path "description" - Verify file is empty
check_file_not_empty path "description" - Verify file is not empty

Content Validation

check_file_contains path "text" "description" - Verify file contains text
check_file_not_contains path "text" "description" - Verify file doesn't contain text
check_file_matches_regex path "pattern" "description" - Verify file matches regex
check_file_line_count path count "description" - Verify line count

Test Data Generation

create_test_fasta path [num_seqs] [seq_length] - Generate FASTA file
create_test_fastq path [num_reads] [read_length] - Generate FASTQ file
create_test_gtf path [num_genes] - Generate GTF file
create_test_gff path [num_features] - Generate GFF file
create_test_bed path [num_intervals] - Generate BED file
create_test_csv path [num_rows] - Generate CSV file
create_test_tsv path [num_rows] - Generate TSV file

Utility Functions

setup_test_env - Initialize test environment with strict error handling
print_test_summary "test_name" - Print completion message

Usage Example

#!/bin/bash

## VIASH START
## VIASH END

# Source centralized helpers
source "$meta_resources_dir/test_helpers.sh"
setup_test_env

log "Starting tests for $meta_name"

# Generate test data
create_test_fasta "$meta_temp_dir/input.fasta" 3 50
check_file_exists "$meta_temp_dir/input.fasta" "input FASTA file"

# Run component
"$meta_executable" \
  --input "$meta_temp_dir/input.fasta" \
  --output "$meta_temp_dir/output.txt"

# Validate output
check_file_exists "$meta_temp_dir/output.txt" "result file"
check_file_contains "$meta_temp_dir/output.txt" "expected_pattern" "result file"

print_test_summary "Basic functionality test"

Test Scenarios

1. Basic Functionality

Test the component with minimal, essential parameters:

log "Starting TEST 1: Basic functionality"

create_test_fasta "$meta_temp_dir/input.fasta" 3 50

"$meta_executable" \
  --input "$meta_temp_dir/input.fasta" \
  --output "$meta_temp_dir/output.txt"

check_file_exists "$meta_temp_dir/output.txt" "output file"
check_file_not_empty "$meta_temp_dir/output.txt" "output file"

log "✅ TEST 1 completed successfully"

2. Multiple Input Files

Test with multiple input files or complex input scenarios:

log "Starting TEST 2: Multiple input files"

create_test_fasta "$meta_temp_dir/input1.fasta" 2 30
create_test_fasta "$meta_temp_dir/input2.fasta" 2 30

"$meta_executable" \
  --input "$meta_temp_dir/input1.fasta;$meta_temp_dir/input2.fasta" \
  --output "$meta_temp_dir/output.txt"

check_file_exists "$meta_temp_dir/output.txt" "merged output file"

log "✅ TEST 2 completed successfully"

3. Optional Parameters

Test with optional parameters and advanced features:

log "Starting TEST 3: Optional parameters"

create_test_fastq "$meta_temp_dir/input.fastq" 10 35

"$meta_executable" \
  --input "$meta_temp_dir/input.fastq" \
  --output "$meta_temp_dir/output.txt" \
  --threads 2 \
  --verbose

check_file_exists "$meta_temp_dir/output.txt" "output file with options"
check_file_contains "$meta_temp_dir/output.txt" "verbose" "verbose output"

log "✅ TEST 3 completed successfully"

4. Edge Cases

Test with edge cases like empty files or unusual inputs:

log "Starting TEST 4: Edge case - empty input"

# Create empty input file
touch "$meta_temp_dir/empty.fasta"

# Test should handle empty input gracefully
if "$meta_executable" \
  --input "$meta_temp_dir/empty.fasta" \
  --output "$meta_temp_dir/output.txt" 2>/dev/null; then
  log_warn "Component succeeded with empty input - checking output"
  check_file_exists "$meta_temp_dir/output.txt" "output file for empty input"
else
  log "Expected behavior: Component properly rejected empty input"
fi

log "✅ TEST 4 completed successfully"

5. Error Handling

Test proper error handling for invalid inputs:

log "Starting TEST 5: Error handling"

# Test with non-existent input file
if "$meta_executable" \
  --input "/non/existent/file.txt" \
  --output "$meta_temp_dir/output.txt" 2>/dev/null; then
  log_error "Component should have failed with non-existent input"
  exit 1
else
  log "✅ Component properly handled non-existent input file"
fi

log "✅ TEST 5 completed successfully"

Best Practices

1. Use Centralized Test Helpers

Always use the centralized test helpers instead of defining functions individually:

# ✅ Recommended: Use centralized helpers
source "$meta_resources_dir/test_helpers.sh"
setup_test_env

# ❌ NOT recommended: Defining functions individually
set -euo pipefail
log() { echo "$(date '+%Y-%m-%d %H:%M:%S') [TEST] $*"; }

2. Strict Error Handling

The centralized helpers automatically provide strict error handling via setup_test_env:

# Automatically enabled by setup_test_env:
set -euo pipefail  # Exit on errors, undefined variables, pipe failures
export LC_ALL=C    # Consistent locale for reproducible results

3. Descriptive Validation

Use descriptive validation functions with meaningful descriptions:

# ✅ Good: Descriptive validation
check_file_exists "$output_file" "filtered feature matrix"
check_file_not_exists "$bam_file" "BAM file (should be disabled by default)"
check_file_contains "$result_file" "expected_pattern" "analysis results"

# ❌ Less helpful: Basic validation without context
check_file_exists "$output_file"

4. Organized Structure

Use $meta_temp_dir and create organized test structure:

# Create organized test structure
test_data_dir="$meta_temp_dir/test_data"
test_output_dir="$meta_temp_dir/test_output"
mkdir -p "$test_data_dir" "$test_output_dir"

create_test_fasta "$test_data_dir/input.fasta" 3 50

5. Clear Test Output

Use consistent logging with clear test boundaries:

log "Starting TEST 1: Basic functionality"
log "Executing $meta_name with basic parameters..."
log "Validating TEST 1 outputs..."
log "✅ TEST 1 completed successfully"

# Final summary
print_test_summary "All tests completed successfully"

6. Comprehensive Content Validation

Don't just check that files exist - validate their content:

# Check existence and content
check_file_exists "$meta_temp_dir/output.txt" "analysis results"
check_file_not_empty "$meta_temp_dir/output.txt" "analysis results"
check_file_contains "$meta_temp_dir/output.txt" "Number of sequences" "result summary"
check_file_line_count "$meta_temp_dir/output.txt" 10 "expected number of results"

7. Multiple Test Scenarios

Include comprehensive test coverage:

# Test 1: Basic functionality
log "Starting TEST 1: Basic functionality"
# ... test implementation ...
log "✅ TEST 1 completed successfully"

# Test 2: Advanced options
log "Starting TEST 2: Advanced options"
# ... test implementation ...
log "✅ TEST 2 completed successfully"

# Test 3: Edge cases
log "Starting TEST 3: Edge case handling"
# ... test implementation ...
log "✅ TEST 3 completed successfully"

print_test_summary "All tests completed successfully"

Viash Testing Features

Running Tests

# Test a single component
viash test config.vsh.yaml

# Test with specific resources
viash test config.vsh.yaml --cpus 4 --memory 8GB

# Test with specific setup strategy
viash test config.vsh.yaml --setup build --verbose

# Keep temporary files for debugging
viash test config.vsh.yaml --keep true

# Test all components in parallel
viash ns test --parallel

# Test specific namespace
viash ns test -q alignment --parallel

Test Execution Flow

When running viash test, Viash automatically:

Creates temporary directory (available as $meta_temp_dir)
Builds the main executable
Builds/pulls Docker image (if using Docker engine)
Iterates over all test scripts in test_resources
Builds each test into executable and runs it
Cleans up temporary files (unless --keep true)
Returns exit code 0 if all tests succeed

Meta Variables in Tests

Your test scripts automatically have access to important meta variables:

$meta_executable - Path to the built component executable
$meta_temp_dir - Temporary directory for test files (automatically cleaned up)
$meta_name - Component name for logging
$meta_resources_dir - Path to test resources

Multiple Test Scripts

You can add multiple test scripts to cover different scenarios:

test_resources:
  - type: bash_script
    path: test_basic.sh
  - type: bash_script
    path: test_edge_cases.sh
  - type: bash_script
    path: test_large_data.sh
  - type: file
    path: /src/_utils/test_helpers.sh

Advanced Testing Options

# Test with different container setup strategies
viash test config.vsh.yaml --setup cachedbuild  # Use cached layers (faster)
viash test config.vsh.yaml --setup build        # Clean build from scratch
viash test config.vsh.yaml --setup alwaysbuild  # Always rebuild container

# Test with configuration modifications
viash test config.vsh.yaml -c '.engines[0].image = "ubuntu:22.04"'

# Test with debug mode for troubleshooting
viash test config.vsh.yaml --keep true --verbose

For more details, see the Viash Unit Testing Documentation.

Static Test Data

When to Use Static Test Data

Only use static test files when:

The tool requires very specific, complex file formats that are difficult to generate
Generating equivalent test data is impractical or overly complex
You need real-world data to validate complex algorithms
Test data is very small (<1KB preferred, <10KB maximum)

Guidelines for Static Test Data

If you must use static test data:

Keep files small - Prefer <1KB, maximum <10KB
Document the source - How was it created?
Use minimal examples - Strip down to essential features
Consider alternatives - Can you generate equivalent data?

# test_data/README.md
# Test data for complex_tool component
# Source: https://github.com/example/dataset
# Generated with: tool --export-sample --format minimal
# Date: 2025-01-01
# Size: 847 bytes
# Purpose: Tests complex file format parsing

Referencing Static Test Data

test_resources:
  - type: bash_script
    path: test.sh
  - type: file
    path: /src/_utils/test_helpers.sh
  - type: file
    path: test_data

# In your test script
static_data="$meta_resources_dir/test_data/sample.complex"
check_file_exists "$static_data" "static test data"

"$meta_executable" --input "$static_data" --output "$meta_temp_dir/output.txt"

15 KiB Raw Blame History