Testing Guide
This guide covers best practices for writing comprehensive test scripts for biobox components.
📌 Important: All new test scripts should use the centralized test helpers located at
src/_utils/test_helpers.sh. This eliminates code duplication and ensures consistency across all components.
Table of Contents
- Core Principles
- Test Script Structure
- Centralized Test Helpers
- Test Scenarios
- Best Practices
- Viash Testing Features
- Static Test Data
Core Principles
1. Generate Test Data in Scripts
Preferred approach: Generate test data within the test script using the centralized helper functions.
# Generate test data using centralized helpers
create_test_fasta "$meta_temp_dir/input.fasta" 3 50
create_test_fastq "$meta_temp_dir/reads.fastq" 10 35
Avoid:
- Storing static test files in the repository
- Fetching test data from external sources
- Large test datasets
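To make the idea concrete, here is a minimal sketch of what a generator in the spirit of create_test_fasta could look like. The function name make_fasta_sketch, the header names (seq1, seq2, ...) and the repeated ACGT pattern are assumptions for illustration; this is not the actual implementation in src/_utils/test_helpers.sh.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical stand-in, NOT the real create_test_fasta helper.
# Usage: make_fasta_sketch <path> [num_seqs] [seq_length]
make_fasta_sketch() {
  local path="$1" num_seqs="${2:-3}" seq_length="${3:-50}"
  local pattern="ACGT" seq="" i
  # Build one deterministic sequence of at least the requested length
  while [ "${#seq}" -lt "$seq_length" ]; do
    seq+="$pattern"
  done
  # Trim to exactly seq_length characters
  seq="${seq:0:seq_length}"
  # Start from an empty file, then write header/sequence pairs
  : > "$path"
  for ((i = 1; i <= num_seqs; i++)); do
    printf '>seq%d\n%s\n' "$i" "$seq" >> "$path"
  done
}

# Fall back to a fresh temporary directory when run outside `viash test`
meta_temp_dir="${meta_temp_dir:-$(mktemp -d)}"
make_fasta_sketch "$meta_temp_dir/input.fasta" 3 50
```

Deterministic content like this keeps test failures reproducible, which is one reason to prefer generated data over files fetched at test time.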
2. Self-Contained Tests
Tests should be completely self-contained and not depend on external resources:
test_resources:
  - type: bash_script
    path: test.sh
  - type: file
    path: /src/_utils/test_helpers.sh
Only add static test files if absolutely necessary:
test_resources:
  - type: bash_script
    path: test.sh
  - type: file
    path: /src/_utils/test_helpers.sh
  - type: file
    path: test_data # Only if data generation is impractical
Test Script Structure
Configuration Setup
Add the test helpers as a resource in your component configuration:
test_resources:
  - type: bash_script
    path: test.sh
  - type: file
    path: /src/_utils/test_helpers.sh
Basic Test Template
#!/bin/bash
## VIASH START
## VIASH END
# Source the centralized test helpers
source "$meta_resources_dir/test_helpers.sh"
# Initialize test environment with strict error handling
setup_test_env
#############################################
# Test execution with centralized functions
#############################################
log "Starting tests for $meta_name"
# --- Test Case 1: Basic functionality ---
log "Starting TEST 1: Basic functionality"
# Create and validate test data
test_data_dir="$meta_temp_dir/test_data"
mkdir -p "$test_data_dir"
create_test_fasta "$test_data_dir/input.fasta" 3 50
check_file_exists "$test_data_dir/input.fasta" "input FASTA file"
log "Executing $meta_name with basic parameters..."
"$meta_executable" \
  --input "$test_data_dir/input.fasta" \
  --output "$meta_temp_dir/test1"
log "Validating TEST 1 outputs..."
check_dir_exists "$meta_temp_dir/test1" "output directory"
check_file_exists "$meta_temp_dir/test1/result.txt" "result file"
check_file_not_empty "$meta_temp_dir/test1/result.txt" "result file"
log "✅ TEST 1 completed successfully"
# --- Test Case 2: Advanced parameters ---
log "Starting TEST 2: Advanced parameters"
# Create different test data
create_test_fastq "$test_data_dir/input.fastq" 10 35
check_file_exists "$test_data_dir/input.fastq" "input FASTQ file"
log "Executing $meta_name with advanced parameters..."
"$meta_executable" \
  --input "$test_data_dir/input.fastq" \
  --output "$meta_temp_dir/test2" \
  --threads 2 \
  --verbose
log "Validating TEST 2 outputs..."
check_file_exists "$meta_temp_dir/test2/advanced_result.txt" "advanced result file"
check_file_contains "$meta_temp_dir/test2/advanced_result.txt" "expected_pattern" "advanced result file"
log "✅ TEST 2 completed successfully"
print_test_summary "All tests completed successfully"
Centralized Test Helpers
The centralized test helpers located at src/_utils/test_helpers.sh provide comprehensive testing functionality to ensure consistency across all biobox components.
Available Functions
Logging Functions
- log "message" - Log with timestamp
- log_warn "message" - Warning message
- log_error "message" - Error message
File/Directory Validation
- check_file_exists path "description" - Verify file exists
- check_dir_exists path "description" - Verify directory exists
- check_file_not_exists path "description" - Verify file doesn't exist
- check_dir_not_exists path "description" - Verify directory doesn't exist
- check_file_empty path "description" - Verify file is empty
- check_file_not_empty path "description" - Verify file is not empty
Content Validation
- check_file_contains path "text" "description" - Verify file contains text
- check_file_not_contains path "text" "description" - Verify file doesn't contain text
- check_file_matches_regex path "pattern" "description" - Verify file matches regex
- check_file_line_count path count "description" - Verify line count
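For readers curious how a check of this kind behaves, the following is a minimal sketch of a content check built on grep. The function name check_contains_sketch and its messages are assumptions for illustration; the real check_file_contains in src/_utils/test_helpers.sh may differ in output and exit behavior.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical stand-in, NOT the real check_file_contains helper.
# Usage: check_contains_sketch <path> <text> <description>
check_contains_sketch() {
  local path="$1" text="$2" description="$3"
  # -F: treat the text as a literal string; -q: no output; --: end of options
  if grep -qF -- "$text" "$path"; then
    echo "OK: $description contains '$text'"
  else
    echo "FAIL: $description does not contain '$text' ($path)" >&2
    return 1
  fi
}

# Made-up file content standing in for a component's output
tmp_file="$(mktemp)"
printf 'Number of sequences: 3\n' > "$tmp_file"
check_contains_sketch "$tmp_file" "Number of sequences" "result summary"
```

Passing the description through to the failure message is what makes a failed check readable in the test log without opening the script.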
Test Data Generation
- create_test_fasta path [num_seqs] [seq_length] - Generate FASTA file
- create_test_fastq path [num_reads] [read_length] - Generate FASTQ file
- create_test_gtf path [num_genes] - Generate GTF file
- create_test_gff path [num_features] - Generate GFF file
- create_test_bed path [num_intervals] - Generate BED file
- create_test_csv path [num_rows] - Generate CSV file
- create_test_tsv path [num_rows] - Generate TSV file
Utility Functions
- setup_test_env - Initialize test environment with strict error handling
- print_test_summary "test_name" - Print completion message
Usage Example
#!/bin/bash
## VIASH START
## VIASH END
# Source centralized helpers
source "$meta_resources_dir/test_helpers.sh"
setup_test_env
log "Starting tests for $meta_name"
# Generate test data
create_test_fasta "$meta_temp_dir/input.fasta" 3 50
check_file_exists "$meta_temp_dir/input.fasta" "input FASTA file"
# Run component
"$meta_executable" \
  --input "$meta_temp_dir/input.fasta" \
  --output "$meta_temp_dir/output.txt"
# Validate output
check_file_exists "$meta_temp_dir/output.txt" "result file"
check_file_contains "$meta_temp_dir/output.txt" "expected_pattern" "result file"
print_test_summary "Basic functionality test"
Test Scenarios
1. Basic Functionality
Test the component with minimal, essential parameters:
log "Starting TEST 1: Basic functionality"
create_test_fasta "$meta_temp_dir/input.fasta" 3 50
"$meta_executable" \
  --input "$meta_temp_dir/input.fasta" \
  --output "$meta_temp_dir/output.txt"
check_file_exists "$meta_temp_dir/output.txt" "output file"
check_file_not_empty "$meta_temp_dir/output.txt" "output file"
log "✅ TEST 1 completed successfully"
2. Multiple Input Files
Test with multiple input files or complex input scenarios:
log "Starting TEST 2: Multiple input files"
create_test_fasta "$meta_temp_dir/input1.fasta" 2 30
create_test_fasta "$meta_temp_dir/input2.fasta" 2 30
"$meta_executable" \
  --input "$meta_temp_dir/input1.fasta;$meta_temp_dir/input2.fasta" \
  --output "$meta_temp_dir/output.txt"
check_file_exists "$meta_temp_dir/output.txt" "merged output file"
log "✅ TEST 2 completed successfully"
3. Optional Parameters
Test with optional parameters and advanced features:
log "Starting TEST 3: Optional parameters"
create_test_fastq "$meta_temp_dir/input.fastq" 10 35
"$meta_executable" \
  --input "$meta_temp_dir/input.fastq" \
  --output "$meta_temp_dir/output.txt" \
  --threads 2 \
  --verbose
check_file_exists "$meta_temp_dir/output.txt" "output file with options"
check_file_contains "$meta_temp_dir/output.txt" "verbose" "verbose output"
log "✅ TEST 3 completed successfully"
4. Edge Cases
Test with edge cases like empty files or unusual inputs:
log "Starting TEST 4: Edge case - empty input"
# Create empty input file
touch "$meta_temp_dir/empty.fasta"
# Test should handle empty input gracefully
if "$meta_executable" \
  --input "$meta_temp_dir/empty.fasta" \
  --output "$meta_temp_dir/output.txt" 2>/dev/null; then
  log_warn "Component succeeded with empty input - checking output"
  check_file_exists "$meta_temp_dir/output.txt" "output file for empty input"
else
  log "Expected behavior: Component properly rejected empty input"
fi
log "✅ TEST 4 completed successfully"
5. Error Handling
Test proper error handling for invalid inputs:
log "Starting TEST 5: Error handling"
# Test with non-existent input file
if "$meta_executable" \
  --input "/non/existent/file.txt" \
  --output "$meta_temp_dir/output.txt" 2>/dev/null; then
  log_error "Component should have failed with non-existent input"
  exit 1
else
  log "✅ Component properly handled non-existent input file"
fi
log "✅ TEST 5 completed successfully"
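Discarding stderr with 2>/dev/null hides why the component failed. When the error message itself matters, one option is to capture stderr to a file and assert on it. The sketch below uses a hypothetical stand-in function, fake_component, in place of "$meta_executable" so that it runs on its own; the error text is likewise an assumption.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical stand-in for "$meta_executable": fails if the input is missing
fake_component() {
  local input="$2"   # expects: --input <path>
  if [ ! -f "$input" ]; then
    echo "Error: input file not found: $input" >&2
    return 1
  fi
}

stderr_file="$(mktemp)"
exit_code=0
# `|| exit_code=$?` records the failure without tripping `set -e`
fake_component --input "/non/existent/file.txt" 2> "$stderr_file" || exit_code=$?

if [ "$exit_code" -eq 0 ]; then
  echo "Component should have failed with non-existent input" >&2
  exit 1
fi

# Assert on the captured message rather than only on the exit code
grep -q "not found" "$stderr_file"
echo "Component failed as expected (exit code $exit_code)"
```

Checking the message guards against a test that passes because the component failed for an unrelated reason, such as a missing dependency.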
Best Practices
1. Use Centralized Test Helpers
Always use the centralized test helpers instead of redefining the same functions in each test script:
# ✅ Recommended: Use centralized helpers
source "$meta_resources_dir/test_helpers.sh"
setup_test_env
# ❌ NOT recommended: Defining functions individually
set -euo pipefail
log() { echo "$(date '+%Y-%m-%d %H:%M:%S') [TEST] $*"; }
2. Strict Error Handling
The centralized helpers automatically provide strict error handling via setup_test_env:
# Automatically enabled by setup_test_env:
set -euo pipefail # Exit on errors, undefined variables, pipe failures
export LC_ALL=C # Consistent locale for reproducible results
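The effect of these shell options can be observed in isolation. The sketch below does not use the helpers; it runs small snippets in child shells and records what happens, with variable names chosen only for illustration.

```shell
#!/bin/bash

# Without pipefail, a pipeline reports the exit status of its last command
bash -c 'false | true'
without_pipefail=$?

# With pipefail, any failing stage makes the whole pipeline fail
bash -c 'set -o pipefail; false | true'
with_pipefail=$?

# With -u, expanding an unset variable is an error
bash -c 'set -u; echo "$undefined_variable"' 2>/dev/null
with_nounset=$?

# With -e, the script stops at the first failing command,
# so the echo after `false` never runs
errexit_output="$(bash -c 'set -e; false; echo "still running"')"

echo "without pipefail: exit $without_pipefail"
echo "with pipefail:    exit $with_pipefail"
echo "with nounset:     exit $with_nounset"
echo "errexit output:   '$errexit_output'"
```

In a test script this means a silently failing intermediate command, for example a failed grep in the middle of a pipeline, stops the test instead of letting later checks run on incomplete data.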
3. Descriptive Validation
Use descriptive validation functions with meaningful descriptions:
# ✅ Good: Descriptive validation
check_file_exists "$output_file" "filtered feature matrix"
check_file_not_exists "$bam_file" "BAM file (should be disabled by default)"
check_file_contains "$result_file" "expected_pattern" "analysis results"
# ❌ Less helpful: Basic validation without context
check_file_exists "$output_file"
4. Organized Structure
Work inside $meta_temp_dir and keep test inputs and outputs in separate subdirectories:
# Create organized test structure
test_data_dir="$meta_temp_dir/test_data"
test_output_dir="$meta_temp_dir/test_output"
mkdir -p "$test_data_dir" "$test_output_dir"
create_test_fasta "$test_data_dir/input.fasta" 3 50
5. Clear Test Output
Use consistent logging with clear test boundaries:
log "Starting TEST 1: Basic functionality"
log "Executing $meta_name with basic parameters..."
log "Validating TEST 1 outputs..."
log "✅ TEST 1 completed successfully"
# Final summary
print_test_summary "All tests completed successfully"
6. Comprehensive Content Validation
Don't just check that files exist - validate their content:
# Check existence and content
check_file_exists "$meta_temp_dir/output.txt" "analysis results"
check_file_not_empty "$meta_temp_dir/output.txt" "analysis results"
check_file_contains "$meta_temp_dir/output.txt" "Number of sequences" "result summary"
check_file_line_count "$meta_temp_dir/output.txt" 10 "expected number of results"
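For format-specific checks, standard tools can complement the helpers. The following is a minimal sketch, assuming the component writes a FASTA file; the file content and the expected record count are made up for illustration.

```shell
#!/bin/bash
set -euo pipefail

output_file="$(mktemp)"
# Made-up FASTA content standing in for a component's output
printf '>seq1\nACGT\n>seq2\nGGCC\n' > "$output_file"

expected_records=2
# Count header lines; `|| true` keeps `set -e` from aborting when the count is 0
actual_records="$(grep -c '^>' "$output_file" || true)"

if [ "$actual_records" -ne "$expected_records" ]; then
  echo "Expected $expected_records FASTA records, found $actual_records" >&2
  exit 1
fi

# Count sequence lines that contain anything other than nucleotide characters
invalid_lines="$(grep -v '^>' "$output_file" | grep -cv '^[ACGTNacgtn]*$' || true)"

if [ "$invalid_lines" -ne 0 ]; then
  echo "Found $invalid_lines sequence line(s) with unexpected characters" >&2
  exit 1
fi

echo "FASTA structure looks valid ($actual_records records)"
```

Counting with grep -c rather than stopping at the first match with grep -q avoids an early-exit interaction with pipefail, where the upstream command can be killed by a closed pipe and make the pipeline report failure.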
7. Multiple Test Scenarios
Cover several scenarios in the same script, each with clear start and completion messages:
# Test 1: Basic functionality
log "Starting TEST 1: Basic functionality"
# ... test implementation ...
log "✅ TEST 1 completed successfully"
# Test 2: Advanced options
log "Starting TEST 2: Advanced options"
# ... test implementation ...
log "✅ TEST 2 completed successfully"
# Test 3: Edge cases
log "Starting TEST 3: Edge case handling"
# ... test implementation ...
log "✅ TEST 3 completed successfully"
print_test_summary "All tests completed successfully"
Viash Testing Features
Running Tests
# Test a single component
viash test config.vsh.yaml
# Test with specific resources
viash test config.vsh.yaml --cpus 4 --memory 8GB
# Test with specific setup strategy
viash test config.vsh.yaml --setup build --verbose
# Keep temporary files for debugging
viash test config.vsh.yaml --keep true
# Test all components in parallel
viash ns test --parallel
# Test specific namespace
viash ns test -q alignment --parallel
Test Execution Flow
When running viash test, Viash automatically:
- Creates temporary directory (available as $meta_temp_dir)
- Builds the main executable
- Builds/pulls Docker image (if using Docker engine)
- Iterates over all test scripts in test_resources
- Builds each test into executable and runs it
- Cleans up temporary files (unless --keep true)
- Returns exit code 0 if all tests succeed
Meta Variables in Tests
Your test scripts automatically have access to important meta variables:
- $meta_executable - Path to the built component executable
- $meta_temp_dir - Temporary directory for test files (automatically cleaned up)
- $meta_name - Component name for logging
- $meta_resources_dir - Path to test resources
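When a test script is run directly, outside viash test, these variables are not set. Viash replaces the block between ## VIASH START and ## VIASH END when it builds the script, so placeholder values placed there only take effect during such manual runs. A sketch, in which the component name and all paths are hypothetical:

```shell
#!/bin/bash

## VIASH START
# Placeholder values for manual runs only; every value here is hypothetical.
meta_name="my_component"
meta_executable="target/executable/my_component/my_component"
meta_resources_dir="src/_utils"
meta_temp_dir="$(mktemp -d)"
## VIASH END

echo "Testing $meta_name"
echo "  executable:    $meta_executable"
echo "  resources dir: $meta_resources_dir"
echo "  temp dir:      $meta_temp_dir"
```

Pointing the placeholder meta_resources_dir at the directory that contains test_helpers.sh lets the usual source line work unchanged during a manual run.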
Multiple Test Scripts
You can add multiple test scripts to cover different scenarios:
test_resources:
  - type: bash_script
    path: test_basic.sh
  - type: bash_script
    path: test_edge_cases.sh
  - type: bash_script
    path: test_large_data.sh
  - type: file
    path: /src/_utils/test_helpers.sh
Advanced Testing Options
# Test with different container setup strategies
viash test config.vsh.yaml --setup cachedbuild # Use cached layers (faster)
viash test config.vsh.yaml --setup build # Clean build from scratch
viash test config.vsh.yaml --setup alwaysbuild # Always rebuild container
# Test with configuration modifications
viash test config.vsh.yaml -c '.engines[0].image = "ubuntu:22.04"'
# Test with debug mode for troubleshooting
viash test config.vsh.yaml --keep true --verbose
For more details, see the Viash Unit Testing Documentation.
Static Test Data
When to Use Static Test Data
Only use static test files when:
- The tool requires very specific, complex file formats that are difficult to generate
- Generating equivalent test data is impractical or overly complex
- You need real-world data to validate complex algorithms
- Test data is very small (<1KB preferred, <10KB maximum)
Guidelines for Static Test Data
If you must use static test data:
- Keep files small - Prefer <1KB, maximum <10KB
- Document the source - How was it created?
- Use minimal examples - Strip down to essential features
- Consider alternatives - Can you generate equivalent data?
# test_data/README.md
# Test data for complex_tool component
# Source: https://github.com/example/dataset
# Generated with: tool --export-sample --format minimal
# Date: 2025-01-01
# Size: 847 bytes
# Purpose: Tests complex file format parsing
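The size guideline can also be checked mechanically. A minimal sketch, assuming the static files sit in a single directory; here a temporary directory with one made-up file stands in for test_data, and the 10KB maximum is interpreted as 10240 bytes.

```shell
#!/bin/bash
set -euo pipefail

# Stand-in for a component's test_data directory (hypothetical contents)
test_data="$(mktemp -d)"
printf 'chr1\t0\t100\n' > "$test_data/sample.bed"

max_bytes=10240   # guideline maximum, interpreted here as 10 KiB
oversized=0

for file in "$test_data"/*; do
  # Skip subdirectories and the unexpanded glob of an empty directory
  [ -f "$file" ] || continue
  # wc -c prints the byte count; tr strips padding some platforms add
  size="$(wc -c < "$file" | tr -d ' ')"
  if [ "$size" -gt "$max_bytes" ]; then
    echo "Too large: $file ($size bytes)" >&2
    oversized=$((oversized + 1))
  fi
done

echo "Files over $max_bytes bytes: $oversized"
```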
Referencing Static Test Data
test_resources:
  - type: bash_script
    path: test.sh
  - type: file
    path: /src/_utils/test_helpers.sh
  - type: file
    path: test_data
# In your test script
static_data="$meta_resources_dir/test_data/sample.complex"
check_file_exists "$static_data" "static test data"
"$meta_executable" --input "$static_data" --output "$meta_temp_dir/output.txt"