# biobox 0.3.2
## NEW FUNCTIONALITY
* `fq`:
- `fq/fq_lint`: Validate FASTQ files for common issues (PR #179).
- `fq/fq_subsample`: Sample a subset of records from single or paired FASTQ files (PR #179).
## MAJOR CHANGES
* `fq_subsample`: This component has been deprecated in favour of `fq/fq_subsample`, and will be removed in biobox 0.4.0 (PR #179).
## MINOR CHANGES
* Update README (PR #177).
* Add authors to package config and update author information (PR #180).
# biobox 0.3.1
## NEW FUNCTIONALITY
* `bcl_convert`: add `force` argument (PR #171).
* `cellranger/cellranger_count`: Align fastq files using Cell Ranger count (PR #163).
## MINOR CHANGES
* Replace the deprecated use of the meta variable `functionality_name` by just `name` (PR #174).
* Bump viash to `0.9.4` (PR #175).
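
A script that has to run under both older and newer Viash builds can bridge the rename from `functionality_name` to `name`. The sketch below is illustrative only and is not code from biobox: it assumes the script receives its metadata as a `meta` dictionary, and the helper `component_name` is hypothetical.

```python
from typing import Mapping, Optional


def component_name(meta: Mapping[str, str]) -> Optional[str]:
    """Return the component name, preferring the current `name` key.

    Falls back to the deprecated `functionality_name` key so the same
    script keeps working when only the old key is present.
    """
    if "name" in meta:
        return meta["name"]
    return meta.get("functionality_name")
```

With this helper, `component_name({"functionality_name": "bcl_convert"})` and `component_name({"name": "bcl_convert"})` both return `"bcl_convert"`.
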
## DOCUMENTATION
* Update README (PR #176).
# biobox 0.3.0
## NEW FUNCTIONALITY
* `agat`:
- `agat/agat_convert_genscan2gff`: convert a genscan file into a GFF file (PR #100).
- `agat/agat_sp_add_introns`: add intron features to gtf/gff file without intron features (PR #104).
- `agat/agat_sp_filter_feature_from_kill_list`: remove features in a GFF file based on a kill list (PR #105).
- `agat/agat_sp_merge_annotations`: merge different gff annotation files in one (PR #106).
- `agat/agat_sp_statistics`: provides exhaustive statistics of a gtf/gff file (PR #107).
- `agat/agat_sq_stat_basic`: provide basic statistics of a gtf/gff file (PR #110).
* `bd_rhapsody/bd_rhapsody_sequence_analysis`: BD Rhapsody Sequence Analysis CWL pipeline (PR #96).
* `bedtools`:
- `bedtools/bedtools_bamtobed`: Converts BAM alignments to BED6 or BEDPE format (PR #109).
* `rsem/rsem_calculate_expression`: Calculate expression levels (PR #93).
* `cellranger`:
- `cellranger/cellranger_mkref`: Build a Cell Ranger-compatible reference folder from user-supplied genome FASTA and gene GTF files (PR #164).
* `rseqc`:
- `rseqc/rseqc_inner_distance`: Calculate inner distance between read pairs (PR #159).
- `rseqc/rseqc_inferexperiment`: Infer strandedness from sequencing reads (PR #158).
- `rseqc/bam_stat`: Generate statistics from a bam file (PR #155).
* `nanoplot`: Plotting tool for long read sequencing data and alignments (PR #95).
* `sgedemux`: demultiplexing sequencing data generated on Singular Genomics' sequencing instruments (PR #166).
* `bases2fasta`: demultiplexing sequencing data generated by Element Biosciences instruments (PR #167).
## BUG FIXES
* `falco`: Fix a typo in the `--reverse_complement` argument (PR #157).
* `cutadapt`: Fix the non-functional `action` parameter (PR #161).
* `bbmap_bbsplit`: Change argument type of `build` to `file` and add output argument `index` (PR #162).
* `kallisto/kallisto_index`: Fix command script to use `--threads` option (PR #162).
* `kallisto/kallisto_quant`: Change type of argument `output_dir` to `file` and add output argument `log` (PR #162).
* `rsem/rsem_calculate_expression`: Fix output handling (PR #162).
* `sortmerna`: Change type of argument `aligned` to `file`; update docker image; accept more than two reference files (PR #162).
* `umi_tools/umi_tools_extract`: Remove `umi_discard_reads` option and change `log2stderr` to input argument (PR #162).
* `star/star_genome_generate`: Fix passing of optional sjdb parameters (PR #170).
## MINOR CHANGES
* `agat_convert_bed2gff`: change type of argument `inflate_off` from `boolean_false` to `boolean_true` (PR #160).
* `cutadapt`: change type of argument `no_indels` and `no_match_adapter_wildcards` from `boolean_false` to `boolean_true` (PR #160).
* Upgrade to Viash 0.9.0.
* `bbmap_bbsplit`: Move to namespace `bbmap` (PR #162).
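
The `boolean_false` to `boolean_true` changes listed above flip what passing the flag means. As a rough analogy (an assumption for illustration, not Viash code), Python's `argparse` offers the same two behaviours: a `store_true` flag is off unless it is passed, while a `store_false` flag is on unless it is passed. The flag `--keep_wildcards` below is hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that contrasts the two boolean flag styles."""
    parser = argparse.ArgumentParser()
    # Analogous to `boolean_true`: defaults to False, passing the flag sets True.
    parser.add_argument("--no_indels", action="store_true")
    # Analogous to `boolean_false`: defaults to True, passing the flag sets False.
    parser.add_argument("--keep_wildcards", action="store_false")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--no_indels"])
    print(args.no_indels, args.keep_wildcards)
```

Callers that relied on the old default therefore need to check whether they should now pass the flag or omit it.
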
# biobox 0.2.0
## BREAKING CHANGES
* `star/star_align_reads`: Change all arguments from `--camelCase` to `--snake_case` (PR #62).
* `star/star_genome_generate`: Change all arguments from `--camelCase` to `--snake_case` (PR #62).
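
When updating existing command lines for this breaking change, a mechanical conversion can give a first guess at the new argument names. The function below is a minimal sketch under that assumption. It only inserts an underscore before each capital letter that follows a lowercase letter or digit, so the names the components actually use may differ and should be checked against each component's help output.

```python
import re

# Matches the empty position between a lowercase letter or digit
# and the uppercase letter that follows it.
_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])")


def camel_to_snake(argument: str) -> str:
    """Convert an argument such as `--genomeDir` to `--genome_dir`."""
    return _BOUNDARY.sub("_", argument).lower()


if __name__ == "__main__":
    for old in ("--genomeDir", "--readFilesIn", "--runThreadN"):
        print(old, "->", camel_to_snake(old))
```

Runs of capitals are kept together by this rule (for example, `--outSAMtype` becomes `--out_samtype`), which is one reason the output is only a starting point.
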
## NEW FUNCTIONALITY
* `star/star_align_reads`: Add star solo related arguments (PR #62).
* `bd_rhapsody/bd_rhapsody_make_reference`: Create a reference for the BD Rhapsody pipeline (PR #75).
* `umitools/umitools_dedup`: Deduplicate reads based on the mapping co-ordinate and the UMI attached to the read (PR #54).
* `seqtk`:
- `seqtk/seqtk_sample`: Subsamples sequences from FASTA/Q files (PR #68).
- `seqtk/seqtk_subseq`: Extract the sequences (complete or subsequence) from the FASTA/FASTQ files
based on a provided sequence IDs or region coordinates file (PR #85).
* `agat`:
- `agat_convert_sp_gff2gtf`: convert any GTF/GFF file into a proper GTF file (PR #76).
- `agat_convert_bed2gff`: convert bed file to gff format (PR #97).
- `agat_convert_embl2gff`: convert an EMBL file into GFF format (PR #99).
- `agat/agat_convert_sp_gff2gtf`: convert any GTF/GFF file into a proper GTF file (PR #76).
- `agat/agat_convert_bed2gff`: convert bed file to gff format (PR #97).
- `agat/agat_convert_mfannot2gff`: convert MFannot "masterfile" annotation to gff format (PR #112).
- `agat/agat_convert_embl2gff`: convert an EMBL file into GFF format (PR #99).
- `agat/agat_convert_sp_gff2tsv`: convert gtf/gff file into tabulated file (PR #102).
- `agat/agat_convert_sp_gxf2gxf`: fixes and/or standardizes any GTF/GFF file into full sorted GTF/GFF file (PR #103).
* `bedtools`:
- `bedtools/bedtools_intersect`: Allows one to screen for overlaps between two sets of genomic features (PR #94).
- `bedtools/bedtools_sort`: Sorts a feature file (bed/gff/vcf) by chromosome and other criteria (PR #98).
- `bedtools/bedtools_genomecov`: Compute the coverage of a feature file (bed/gff/vcf/bam) among a genome (PR #128).
- `bedtools/bedtools_groupby`: Summarizes a dataset column based upon common column groupings. Akin to the SQL "group by" command (PR #123).
- `bedtools/bedtools_merge`: Merges overlapping BED/GFF/VCF entries into a single interval (PR #118).
- `bedtools/bedtools_bamtofastq`: Convert BAM alignments to FASTQ files (PR #101).
- `bedtools/bedtools_bedtobam`: Converts genomic feature records (bed/gff/vcf) to BAM format (PR #111).
- `bedtools/bedtools_bed12tobed6`: Converts BED12 files to BED6 files (PR #140).
- `bedtools/bedtools_links`: Creates an HTML file with links to an instance of the UCSC Genome Browser for all features / intervals in a (bed/gff/vcf) file (PR #137).
* `qualimap/qualimap_rnaseq`: RNA-seq QC analysis using qualimap (PR #74).
* `rsem/rsem_prepare_reference`: Prepare transcript references for RSEM (PR #89).
* `bcftools`:
- `bcftools/bcftools_concat`: Concatenate or combine VCF/BCF files (PR #145).
- `bcftools/bcftools_norm`: Left-align and normalize indels, check if REF alleles match the reference, split multiallelic sites into multiple rows; recover multiallelics from multiple rows (PR #144).
- `bcftools/bcftools_annotate`: Add or remove annotations from a VCF/BCF file (PR #143).
- `bcftools/bcftools_stats`: Parses VCF or BCF and produces a txt stats file which can be plotted using plot-vcfstats (PR #142).
- `bcftools/bcftools_sort`: Sorts BCF/VCF files by position and other criteria (PR #141).
* `fastqc`: High throughput sequence quality control analysis tool (PR #92).
* `sortmerna`: Local sequence alignment tool for mapping, clustering, and filtering rRNA from
metatranscriptomic data (PR #146).
* `fq_subsample`: Sample a subset of records from single or paired FASTQ files (PR #147).
* `kallisto`:
- `kallisto_index`: Create a kallisto index (PR #149).
- `kallisto_quant`: Quantifying abundances of transcripts from RNA-Seq data, or more generally of target sequences using high-throughput sequencing reads (PR #152).
* `trimgalore`: Quality and adapter trimming for fastq files (PR #117).
## MINOR CHANGES
* `busco` components: update BUSCO to `5.7.1` (PR #72).
* Update CI to reusable workflow in `viash-io/viash-actions` (PR #86).
* Update several components in order to avoid duplicate code when using `unset` on boolean arguments (PR #133).
* Bump viash to `0.9.0-RC7` (PR #134).
## DOCUMENTATION
* Extend the contributing guidelines (PR #82):
- Update format to Viash 0.9.
- Descriptions should be formatted in markdown.
- Add defaults to descriptions, not as a default of the argument.
- Explain parameter expansion.
- Mention that the contents of the output of components in tests should be checked.
* Add authorship to existing components (PR #88).
## BUG FIXES
* `pear`: fix component not exiting with the correct exitcode when PEAR fails (PR #70).
* `cutadapt`: fix `--par_quality_cutoff_r2` argument (PR #69).
* `cutadapt`: demultiplexing is now disabled by default. It can be re-enabled by using `demultiplex_mode` (PR #69).
* `multiqc`: update multiple separator to `;` (PR #81).
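
For the `multiqc` separator change, the sketch below shows how a script might split such a value. It assumes that an argument accepting multiple values reaches the script as a single `;`-joined string. The helper `split_multiple` and the example paths are hypothetical and are not taken from the component.

```python
from typing import List


def split_multiple(value: str, sep: str = ";") -> List[str]:
    """Split a separator-joined argument value into items, dropping empty ones."""
    return [item for item in value.split(sep) if item]


if __name__ == "__main__":
    print(split_multiple("run1/fastqc;run2/fastqc"))
```
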
# biobox 0.1.0
## NEW FEATURES
* `arriba`: Detect gene fusions from RNA-seq data (PR #1).
* `fastp`: An ultra-fast all-in-one FASTQ preprocessor (PR #3).
* `busco`:
- `busco/busco_run`: Assess genome assembly and annotation completeness with single copy orthologs (PR #6).
- `busco/busco_list_datasets`: Lists available busco datasets (PR #18).
- `busco/busco_download_datasets`: Download busco datasets (PR #19).
* `cutadapt`: Remove adapter sequences from high-throughput sequencing reads (PR #7).
* `featurecounts`: Assign sequence reads to genomic features (PR #11).
* `bgzip`: Add bgzip functionality to compress and decompress files (PR #13).
* `pear`: Paired-end read merger (PR #10).
* `lofreq/call`: Call variants from a BAM file (PR #17).
* `lofreq/indelqual`: Insert indel qualities into BAM file (PR #17).
* `multiqc`: Aggregate results from bioinformatics analyses across many samples into a single report (PR #42).
* `star`:
- `star/star_align_reads`: Align reads to a reference genome (PR #22).
- `star/star_genome_generate`: Generate a genome index for STAR alignment (PR #58).
* `gffread`: Validate, filter, convert and perform other operations on GFF files (PR #29).
* `salmon`:
- `salmon/salmon_index`: Create a salmon index for the transcriptome to use Salmon in the mapping-based mode (PR #24).
- `salmon/salmon_quant`: Transcript quantification from RNA-seq data (PR #24).
* `samtools`:
- `samtools/samtools_flagstat`: Counts the number of alignments in SAM/BAM/CRAM files for each FLAG type (PR #31).
- `samtools/samtools_idxstats`: Reports alignment summary statistics for a SAM/BAM/CRAM file (PR #32).
- `samtools/samtools_index`: Index SAM/BAM/CRAM files (PR #35).
- `samtools/samtools_sort`: Sort SAM/BAM/CRAM files (PR #36).
- `samtools/samtools_stats`: Reports alignment summary statistics for a BAM file (PR #39).
- `samtools/samtools_faidx`: Indexes FASTA files to enable random access to fasta and fastq files (PR #41).
- `samtools/samtools_collate`: Shuffles and groups reads in SAM/BAM/CRAM files together by their names (PR #49).
- `samtools/samtools_view`: Views and converts SAM/BAM/CRAM files (PR #48).
- `samtools/samtools_fastq`: Converts a SAM/BAM/CRAM file to FASTQ (PR #52).
- `samtools/samtools_fasta`: Converts a SAM/BAM/CRAM file to FASTA (PR #53).
* `umi_tools`:
- `umi_tools/umi_tools_extract`: Flexible removal of UMI sequences from fastq reads (PR #71).
- `umi_tools/umi_tools_prepareforrsem`: Fix paired-end reads in name sorted BAM file to prepare for RSEM (PR #148).
* `falco`: A C++ drop-in replacement of FastQC to assess the quality of sequence read data (PR #43).
* `bedtools`:
- `bedtools_getfasta`: extract sequences from a FASTA file for each of the
intervals defined in a BED/GFF/VCF file (PR #59).
* `bbmap`:
- `bbmap_bbsplit`: Split sequencing reads by mapping them to multiple references simultaneously (PR #138).
## MINOR CHANGES
* Uniformize component metadata (PR #23).
* Update to Viash 0.8.5 (PR #25).
* Update to Viash 0.9.0-RC3 (PR #51).
* Update to Viash 0.9.0-RC6 (PR #63).
* Switch to viash-hub/toolbox actions (PR #64).
## DOCUMENTATION
* Update README (PR #64).
## BUG FIXES
* Add escaping character before leading hashtag in the description field of the config file (PR #50).
* Format URL in biobase/bcl_convert description (PR #55).