Build branch add-agat-convert-spgff2tsv with version add-agat-convert-spgff2tsv (fa5ed70)
Build pipeline: viash-hub.biobox.add-agat-convert-spgff2tsv-t5ds2
Source commit: fa5ed706e5
Source message: add tests
18
.gitignore
vendored
Normal file
@@ -0,0 +1,18 @@
*.DS_Store
*__pycache__

# IDE ignores
.idea/
.vscode/

# R specific ignores
.Rhistory
.Rproj.user
*.Rproj

# viash specific ignores
target/

# nextflow specific ignores
.nextflow*
work
135
CHANGELOG.md
Normal file
@@ -0,0 +1,135 @@
# biobox x.x.x

## BREAKING CHANGES

* `star/star_align_reads`: Change all arguments from `--camelCase` to `--snake_case` (PR #62).

* `star/star_genome_generate`: Change all arguments from `--camelCase` to `--snake_case` (PR #62).

## NEW FUNCTIONALITY

* `star/star_align_reads`: Add star solo related arguments (PR #62).

* `bd_rhapsody/bd_rhapsody_make_reference`: Create a reference for the BD Rhapsody pipeline (PR #75).

* `umitools/umitools_dedup`: Deduplicate reads based on the mapping co-ordinate and the UMI attached to the read (PR #54).

* `seqtk`:
  - `seqtk/seqtk_sample`: Subsamples sequences from FASTA/Q files (PR #68).
  - `seqtk/seqtk_subseq`: Extract the sequences (complete or subsequence) from the FASTA/FASTQ files
    based on a provided file of sequence IDs or region coordinates (PR #85).

* `agat/agat_convert_sp_gff2gtf`: Convert any GTF/GFF file into a proper GTF file (PR #76).

## MINOR CHANGES

* `busco` components: update BUSCO to `5.7.1` (PR #72).

* Update CI to reusable workflow in `viash-io/viash-actions` (PR #86).

## DOCUMENTATION

* Extend the contributing guidelines (PR #82):

  - Update format to Viash 0.9.

  - Descriptions should be formatted in markdown.

  - Add defaults to descriptions, not as a default of the argument.

  - Explain parameter expansion.

  - Mention that the contents of the output of components in tests should be checked.

* Add authorship to existing components (PR #88).

## BUG FIXES

* `pear`: fix component not exiting with the correct exit code when PEAR fails (PR #70).

* `cutadapt`: fix `--par_quality_cutoff_r2` argument (PR #69).

* `cutadapt`: demultiplexing is now disabled by default. It can be re-enabled by using `demultiplex_mode` (PR #69).

* `multiqc`: update multiple separator to `;` (PR #81).


# biobox 0.1.0

## NEW FEATURES

* `arriba`: Detect gene fusions from RNA-seq data (PR #1).

* `fastp`: An ultra-fast all-in-one FASTQ preprocessor (PR #3).

* `busco`:
  - `busco/busco_run`: Assess genome assembly and annotation completeness with single copy orthologs (PR #6).
  - `busco/busco_list_datasets`: Lists available busco datasets (PR #18).
  - `busco/busco_download_datasets`: Download busco datasets (PR #19).

* `cutadapt`: Remove adapter sequences from high-throughput sequencing reads (PR #7).

* `featurecounts`: Assign sequence reads to genomic features (PR #11).

* `bgzip`: Add bgzip functionality to compress and decompress files (PR #13).

* `pear`: Paired-end read merger (PR #10).

* `lofreq/call`: Call variants from a BAM file (PR #17).

* `lofreq/indelqual`: Insert indel qualities into BAM file (PR #17).

* `multiqc`: Aggregate results from bioinformatics analyses across many samples into a single report (PR #42).

* `star`:
  - `star/star_align_reads`: Align reads to a reference genome (PR #22).
  - `star/star_genome_generate`: Generate a genome index for STAR alignment (PR #58).

* `gffread`: Validate, filter, convert and perform other operations on GFF files (PR #29).

* `salmon`:
  - `salmon/salmon_index`: Create a salmon index for the transcriptome to use Salmon in the mapping-based mode (PR #24).
  - `salmon/salmon_quant`: Transcript quantification from RNA-seq data (PR #24).

* `samtools`:
  - `samtools/samtools_flagstat`: Counts the number of alignments in SAM/BAM/CRAM files for each FLAG type (PR #31).
  - `samtools/samtools_idxstats`: Reports alignment summary statistics for a SAM/BAM/CRAM file (PR #32).
  - `samtools/samtools_index`: Index SAM/BAM/CRAM files (PR #35).
  - `samtools/samtools_sort`: Sort SAM/BAM/CRAM files (PR #36).
  - `samtools/samtools_stats`: Reports alignment summary statistics for a BAM file (PR #39).
  - `samtools/samtools_faidx`: Indexes FASTA files to enable random access to fasta and fastq files (PR #41).
  - `samtools/samtools_collate`: Shuffles and groups reads in SAM/BAM/CRAM files together by their names (PR #42).
  - `samtools/samtools_view`: Views and converts SAM/BAM/CRAM files (PR #48).
  - `samtools/samtools_fastq`: Converts a SAM/BAM/CRAM file to FASTQ (PR #52).
  - `samtools/samtools_fasta`: Converts a SAM/BAM/CRAM file to FASTA (PR #53).

* `umi_tools`:
  - `umi_tools/umi_tools_extract`: Flexible removal of UMI sequences from fastq reads (PR #71).

* `falco`: A C++ drop-in replacement of FastQC to assess the quality of sequence read data (PR #43).

* `bedtools`:
  - `bedtools_getfasta`: extract sequences from a FASTA file for each of the
    intervals defined in a BED/GFF/VCF file (PR #59).

## MINOR CHANGES

* Uniformize component metadata (PR #23).

* Update to Viash 0.8.5 (PR #25).

* Update to Viash 0.9.0-RC3 (PR #51).

* Update to Viash 0.9.0-RC6 (PR #63).

* Switch to viash-hub/toolbox actions (PR #64).

## DOCUMENTATION

* Update README (PR #64).

## BUG FIXES

* Add escaping character before leading hashtag in the description field of the config file (PR #50).

* Format URL in biobase/bcl_convert description (PR #55).
414
CONTRIBUTING.md
Normal file
@@ -0,0 +1,414 @@

# Contributing guidelines

We encourage contributions from the community. To contribute:

1. **Fork the Repository**: Start by forking this repository to your account.
2. **Develop Your Component**: Create your Viash component, ensuring it aligns with our best practices (detailed below).
3. **Submit a Pull Request**: After testing your component, submit a pull request for review.

## Procedure for adding a component

### Step 1: Find a component to contribute

* Find a tool to contribute to this repo.

* Check whether it is already in the [Project board](https://github.com/orgs/viash-hub/projects/1).

* Check whether there is a corresponding [Snakemake wrapper](https://github.com/snakemake/snakemake-wrappers/blob/master/bio) or [nf-core module](https://github.com/nf-core/modules/tree/master/modules/nf-core) which we can use as inspiration.

* Create an issue to show that you are working on this component.


### Step 2: Add config template

Change all occurrences of `xxx` to the name of the component.

Create a file at `src/xxx/config.vsh.yaml` with contents:

```yaml
name: xxx
description: xxx
keywords: [tag1, tag2]
links:
  homepage: yyy
  documentation: yyy
  issue_tracker: yyy
  repository: yyy
references:
  doi: 12345/12345678.yz
license: MIT/Apache-2.0/GPL-3.0/...
argument_groups:
  - name: Inputs
    arguments: <...>
  - name: Outputs
    arguments: <...>
  - name: Arguments
    arguments: <...>
resources:
  - type: bash_script
    path: script.sh
test_resources:
  - type: bash_script
    path: test.sh
  - type: file
    path: test_data
engines:
  - <...>
runners:
  - type: executable
  - type: nextflow
```

### Step 3: Fill in the metadata

Fill in the relevant metadata fields in the config. Here is an example of the metadata of an existing component.

```yaml
name: arriba
description: Detect gene fusions from RNA-Seq data
keywords: [Gene fusion, RNA-Seq]
links:
  homepage: https://arriba.readthedocs.io/en/latest/
  documentation: https://arriba.readthedocs.io/en/latest/
  repository: https://github.com/suhrig/arriba
  issue_tracker: https://github.com/suhrig/arriba/issues
references:
  doi: 10.1101/gr.257246.119
  bibtex: |
    @article{
      ... a bibtex entry in case the doi is not available ...
    }
license: MIT
```

### Step 4: Find a suitable container

Google `biocontainer <name of component>` and find the container that is most suitable. Typically the link will be `https://quay.io/repository/biocontainers/xxx?tab=tags`.

If no such container is found, you can create a custom container in Step 10.


### Step 5: Create help file

To help develop the component, we store the `--help` output of the tool in a file at `src/xxx/help.txt`.

````bash
# the quoted 'EOF' keeps the shell from interpreting the backticks below
cat <<'EOF' > src/xxx/help.txt
```sh
xxx --help
```
EOF

docker run quay.io/biocontainers/xxx:tag xxx --help >> src/xxx/help.txt
````

Notes:

* This help file has no functional purpose, but it is useful for the developer to see the help output of the tool.

* Some tools might not have a `--help` argument but instead have a `-h` argument. For example, for `arriba`, the help message is obtained by running `arriba -h`:

  ```bash
  docker run quay.io/biocontainers/arriba:2.4.0--h0033a41_2 arriba -h
  ```


### Step 6: Create or fetch test data

To help develop the component, it helps to have some test data available. In most cases, we can use the test data from the Snakemake wrappers.

To make sure we can reproduce the test data in the future, we store the command to fetch the test data in a file at `src/xxx/test_data/script.sh`.

```bash
cat <<EOF > src/xxx/test_data/script.sh

# clone repo
if [ ! -d /tmp/snakemake-wrappers ]; then
  git clone --depth 1 --single-branch --branch master https://github.com/snakemake/snakemake-wrappers /tmp/snakemake-wrappers
fi

# copy test data
cp -r /tmp/snakemake-wrappers/bio/xxx/test/* src/xxx/test_data
EOF
```

The test data should be suitable for testing this component. Ensure that the test data is small enough: ideally <1KB, preferably <10KB, if need be <100KB.
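
The size limits above can be checked with a few lines of shell. This is a minimal sketch, not part of Viash or biobox: the helper name `list_large_files` is hypothetical, and it assumes the limit is given in bytes.

```bash
#!/bin/bash

# Hypothetical helper: print every file under a directory whose size in
# bytes exceeds the given limit.
list_large_files() {
  dir="$1"
  limit_bytes="$2"
  find "$dir" -type f | while IFS= read -r file; do
    # Arithmetic expansion strips the padding some `wc` implementations add.
    size=$(( $(wc -c < "$file") ))
    if [ "$size" -gt "$limit_bytes" ]; then
      echo "$file: $size bytes"
    fi
  done
}
```

Calling `list_large_files src/xxx/test_data 102400` would then list the files above 100 KiB (taking 1KB as 1024 bytes), if any.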

### Step 7: Add arguments for the input files

By looking at the help file, we add the input arguments to the config file. Here is an example of the input arguments of an existing component.

For instance, in the [arriba help file](src/arriba/help.txt), we see the following:

    Usage: arriba [-c Chimeric.out.sam] -x Aligned.out.bam \
                  -g annotation.gtf -a assembly.fa [-b blacklists.tsv] [-k known_fusions.tsv] \
                  [-t tags.tsv] [-p protein_domains.gff3] [-d structural_variants_from_WGS.tsv] \
                  -o fusions.tsv [-O fusions.discarded.tsv] \
                  [OPTIONS]

     -x FILE  File in SAM/BAM/CRAM format with main alignments as generated by STAR
              (Aligned.out.sam). Arriba extracts candidate reads from this file.

Based on this information, we can add the following input arguments to the config file.

```yaml
argument_groups:
  - name: Inputs
    arguments:
    - name: --bam
      alternatives: -x
      type: file
      description: |
        File in SAM/BAM/CRAM format with main alignments as generated by STAR
        (`Aligned.out.sam`). Arriba extracts candidate reads from this file.
      required: true
      example: Aligned.out.bam
```

Check the [documentation](https://viash.io/reference/config/functionality/arguments) for more information on the format of input arguments.

Several notes:

* Argument names should be formatted in `--snake_case`. This means arguments like `--foo-bar` should be formatted as `--foo_bar`, and short arguments like `-f` should receive a longer name like `--foo`.

* Input arguments can have `multiple: true` to allow the user to specify multiple files.

* The description should be formatted in markdown.
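
The renaming rule for hyphenated options can be illustrated with a small shell function. This is only a sketch: `to_snake_case` is a hypothetical helper, and it handles hyphenated long options only (short options such as `-f` still need a hand-picked name).

```bash
#!/bin/bash

# Hypothetical helper: turn a long option such as `--foo-bar` into `--foo_bar`.
to_snake_case() {
  name="${1#--}"                                        # drop the leading `--`
  converted=$(printf '%s\n' "$name" | sed 's/-/_/g')    # hyphens to underscores
  echo "--$converted"
}

to_snake_case --foo-bar    # prints --foo_bar
```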

### Step 8: Add arguments for the output files

By looking at the help file, we now also add output arguments to the config file.

For example, in the [arriba help file](src/arriba/help.txt), we see the following:


    Usage: arriba [-c Chimeric.out.sam] -x Aligned.out.bam \
                  -g annotation.gtf -a assembly.fa [-b blacklists.tsv] [-k known_fusions.tsv] \
                  [-t tags.tsv] [-p protein_domains.gff3] [-d structural_variants_from_WGS.tsv] \
                  -o fusions.tsv [-O fusions.discarded.tsv] \
                  [OPTIONS]

     -o FILE  Output file with fusions that have passed all filters.

     -O FILE  Output file with fusions that were discarded due to filtering.

Based on this information, we can add the following output arguments to the config file.

```yaml
argument_groups:
  - name: Outputs
    arguments:
    - name: --fusions
      alternatives: -o
      type: file
      direction: output
      description: |
        Output file with fusions that have passed all filters.
      required: true
      example: fusions.tsv
    - name: --fusions_discarded
      alternatives: -O
      type: file
      direction: output
      description: |
        Output file with fusions that were discarded due to filtering.
      required: false
      example: fusions.discarded.tsv
```

Note:

* Preferably, these outputs should not be directories but files. For example, if a tool outputs a directory `foo/` containing files `foo/bar.txt` and `foo/baz.txt`, there should be two output arguments `--bar` and `--baz` (as opposed to one output argument which outputs the whole `foo/` directory).
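
One way a runner script could map a directory output onto separate file arguments is sketched below. Everything here is an assumption made for illustration: `fake_tool` stands in for a real tool that only writes to a directory, and `par_bar` and `par_baz` stand for the variables Viash would derive from `--bar` and `--baz`.

```bash
#!/bin/bash
set -e

# Stand-in for a real tool that can only write into an output directory.
# `fake_tool` and its file names are hypothetical.
fake_tool() {
  mkdir -p "$1"
  echo "bar contents" > "$1/bar.txt"
  echo "baz contents" > "$1/baz.txt"
}

# In a real component these would be set by Viash from `--bar` and `--baz`.
work_dir=$(mktemp -d)
par_bar="$work_dir/my_bar.txt"
par_baz="$work_dir/my_baz.txt"

# Let the tool write into a scratch directory, then move each file to the
# path the user requested.
tmp_out=$(mktemp -d)
fake_tool "$tmp_out"
mv "$tmp_out/bar.txt" "$par_bar"
mv "$tmp_out/baz.txt" "$par_baz"
rm -rf "$tmp_out"
```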

### Step 9: Add arguments for the other arguments

Finally, add all other arguments to the config file. There are a few exceptions:

* Arguments related to specifying CPU and memory requirements are handled separately and should not be added to the config file.

* Arguments that only print information, such as the version (`-v`, `--version`) or the help (`-h`, `--help`), should not be added to the config file.

* If the help lists defaults, do not add them as defaults but to the description. Example: `description: <Explanation of parameter>. Default: 10.`


### Step 10: Add a Docker engine

To ensure reproducibility of components, we require that all components are run in a Docker container.

```yaml
engines:
  - type: docker
    image: quay.io/biocontainers/xxx:0.1.0--py_0
```

The container should have your tool installed, as well as `ps`.
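
Whether the required commands are present can be checked from a shell inside the container. The function below is a minimal sketch; `missing_commands` is a hypothetical helper name, not something Viash provides.

```bash
#!/bin/bash

# Hypothetical helper: print each command from the argument list that is not
# available on the PATH.
missing_commands() {
  for cmd in "$@"; do
    command -v "$cmd" > /dev/null 2>&1 || echo "$cmd"
  done
}
```

Run inside the container, `missing_commands xxx ps` prints nothing when both the tool and `ps` are installed, and otherwise prints the names that are missing.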

If you didn't find a suitable container in Step 4, you can create a custom container. For example:

```yaml
engines:
  - type: docker
    image: python:3.10
    setup:
      - type: python
        packages: numpy
```

For more information on how to do this, see the [documentation](https://viash.io/guide/component/add-dependencies.html#steps-for-creating-a-custom-docker-platform).

Here is a list of base containers we can recommend:

* Bash: [`bash`](https://hub.docker.com/_/bash), [`ubuntu`](https://hub.docker.com/_/ubuntu)
* C#: [`ghcr.io/data-intuitive/dotnet-script`](https://github.com/data-intuitive/ghcr-dotnet-script/pkgs/container/dotnet-script)
* JavaScript: [`node`](https://hub.docker.com/_/node)
* Python: [`python`](https://hub.docker.com/_/python), [`nvcr.io/nvidia/pytorch`](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch)
* R: [`eddelbuettel/r2u`](https://hub.docker.com/r/eddelbuettel/r2u), [`rocker/tidyverse`](https://hub.docker.com/r/rocker/tidyverse)
* Scala: [`sbtscala/scala-sbt`](https://hub.docker.com/r/sbtscala/scala-sbt)

### Step 11: Write a runner script

Next, create a Bash script at `src/xxx/script.sh` that runs the tool with the arguments defined in the config.

```bash
#!/bin/bash

## VIASH START
## VIASH END

# unset flags
[[ "$par_option" == "false" ]] && unset par_option

xxx \
  --input "$par_input" \
  --output "$par_output" \
  ${par_option:+--option}
```

When building a Viash component, Viash will automatically replace the `## VIASH START` and `## VIASH END` lines (and anything in between) with environment variables based on the arguments specified in the config.

As an example, this is what the Bash script for the `arriba` component looks like:

```bash
#!/bin/bash

## VIASH START
## VIASH END

# unset flags
[[ "$par_skip_duplicate_marking" == "false" ]] && unset par_skip_duplicate_marking
[[ "$par_extra_information" == "false" ]] && unset par_extra_information
[[ "$par_fill_gaps" == "false" ]] && unset par_fill_gaps

arriba \
  -x "$par_bam" \
  -a "$par_genome" \
  -g "$par_gene_annotation" \
  -o "$par_fusions" \
  ${par_known_fusions:+-k "${par_known_fusions}"} \
  ${par_blacklist:+-b "${par_blacklist}"} \
  # ...
  ${par_extra_information:+-X} \
  ${par_fill_gaps:+-I}
```

Notes:

* If your arguments can contain special characters (e.g. `$`), you can use the `@Q` quoting operator, which is described under [parameter expansion](https://www.gnu.org/software/bash/manual/html_node/Shell-Parameter-Expansion.html) in the Bash manual, to make sure you can use the string as input. Example: `-x ${par_bam@Q}`.

* Optional arguments can be passed to the command conditionally using Bash [parameter expansion](https://www.gnu.org/software/bash/manual/html_node/Shell-Parameter-Expansion.html). For example: `${par_known_fusions:+-k ${par_known_fusions@Q}}`

* If your tool allows for multiple inputs using a separator other than `;` (which is the default Viash multiple separator), you can substitute these values with a command like: `par_disable_filters=$(echo "$par_disable_filters" | tr ';' ',')`.
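
The effect of the `${var:+...}` expansion and the separator substitution can be seen by collecting the arguments instead of running a tool. This is a sketch under stated assumptions: the variable names mirror the arriba example, the values are made up, and `-f` is assumed to be the tool's option for a comma-separated list.

```bash
#!/bin/bash

# Values Viash would normally provide; the values are made up for illustration.
par_known_fusions="known fusions.tsv"   # optional file, path contains a space
par_fill_gaps="false"                   # boolean flag that was not set
par_disable_filters="homologs;mismatches"

# A flag set to "false" is unset so that `${var:+...}` expands to nothing.
[ "$par_fill_gaps" = "false" ] && unset par_fill_gaps

# Viash joins multiple values with `;`, the tool is assumed to expect `,`.
par_disable_filters=$(echo "$par_disable_filters" | tr ';' ',')

# Collect the arguments as positional parameters so they can be inspected.
set -- \
  ${par_known_fusions:+-k "$par_known_fusions"} \
  ${par_disable_filters:+-f "$par_disable_filters"} \
  ${par_fill_gaps:+-I}

arg_count=$#
args_text=$(printf '%s\n' "$@")
echo "$arg_count arguments:"
echo "$args_text"
```

The unset flag contributes no argument at all, and the path with a space stays a single argument because the inner expansion is quoted.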


### Step 12: Create test script

If the unit test requires test resources, these should be provided in the `test_resources` section of the component.

```yaml
test_resources:
  - type: bash_script
    path: test.sh
  - type: file
    path: test_data
```

Create a test script at `src/xxx/test.sh`. It should run the component (available as `$meta_executable`) with the test data, check whether the output is as expected, and exit with a non-zero exit code if it is not. For example:

```bash
#!/bin/bash

set -e

## VIASH START
## VIASH END

#############################################
# helper functions
assert_file_exists() {
  [ -f "$1" ] || { echo "File '$1' does not exist" && exit 1; }
}
assert_file_doesnt_exist() {
  [ ! -f "$1" ] || { echo "File '$1' exists but shouldn't" && exit 1; }
}
assert_file_empty() {
  [ ! -s "$1" ] || { echo "File '$1' is not empty but should be" && exit 1; }
}
assert_file_not_empty() {
  [ -s "$1" ] || { echo "File '$1' is empty but shouldn't be" && exit 1; }
}
assert_file_contains() {
  grep -q "$2" "$1" || { echo "File '$1' does not contain '$2'" && exit 1; }
}
# the negated form returns 0 when there is no match, so `set -e` does not abort
assert_file_not_contains() {
  ! grep -q "$2" "$1" || { echo "File '$1' contains '$2' but shouldn't" && exit 1; }
}
assert_file_contains_regex() {
  grep -q -E "$2" "$1" || { echo "File '$1' does not contain '$2'" && exit 1; }
}
assert_file_not_contains_regex() {
  ! grep -q -E "$2" "$1" || { echo "File '$1' contains '$2' but shouldn't" && exit 1; }
}
#############################################

echo "> Run $meta_name with test data"
"$meta_executable" \
  --input "$meta_resources_dir/test_data/reads_R1.fastq" \
  --output "output.txt" \
  --option

echo ">> Check if output exists"
assert_file_exists "output.txt"

echo ">> Check if output is not empty"
assert_file_not_empty "output.txt"

echo ">> Check if output is correct"
assert_file_contains "output.txt" "some expected output"

echo "> All tests succeeded!"
```

Notes:

* Always check the contents of the output file. If the output is not deterministic, you can use regular expressions to check the output.

* If possible, generate your own test data instead of copying it from an external resource.
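
Both notes can be combined in one small test, sketched below. The FASTQ reads are made up, and the `printf` that writes `log.txt` is only a stand-in for a component whose output contains a timestamp; a real test would call `"$meta_executable"` at that point.

```bash
#!/bin/bash
set -e

work_dir=$(mktemp -d)

# Generate a two-read FASTQ file instead of downloading one.
cat > "$work_dir/reads.fastq" <<'EOF'
@read1
ACGTACGT
+
IIIIIIII
@read2
TTGCAAGC
+
IIIIIIII
EOF

# Stand-in for running the component: writes a log with a timestamp, which
# differs between runs.
read_count=$(( $(wc -l < "$work_dir/reads.fastq") / 4 ))
printf 'reads processed: %s\nfinished at: %s\n' "$read_count" "$(date +%H:%M:%S)" \
  > "$work_dir/log.txt"

# The deterministic part is matched literally, the timestamp by pattern.
grep -q 'reads processed: 2' "$work_dir/log.txt" \
  || { echo "Unexpected read count"; exit 1; }
grep -q -E '^finished at: [0-9]{2}:[0-9]{2}:[0-9]{2}$' "$work_dir/log.txt" \
  || { echo "Timestamp line missing or malformed"; exit 1; }

echo "> All checks passed"
```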

### Step 13: Create a `/var/software_versions.txt` file

For the sake of transparency and reproducibility, we require that the versions of the software used in the component are documented.

For now, this is managed by creating a file `/var/software_versions.txt` in the `setup` section of the Docker engine.

```yaml
engines:
  - type: docker
    image: quay.io/biocontainers/xxx:0.1.0--py_0
    setup:
      - type: docker
        # note: /var/software_versions.txt should contain:
        # arriba: "2.4.0"
        run: |
          echo "xxx: \"0.1.0\"" > /var/software_versions.txt
```
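
The expected line format can be tried out locally before it goes into the `run` section. This is a sketch only: `add_version_entry` is a hypothetical helper, a temporary file stands in for `/var/software_versions.txt`, and the second entry is made up to show how more than one tool could be listed.

```bash
#!/bin/bash

# Hypothetical helper: append one `name: "version"` line to a versions file.
add_version_entry() {
  tool_name="$1"
  tool_version="$2"
  out_file="$3"
  printf '%s: "%s"\n' "$tool_name" "$tool_version" >> "$out_file"
}

# A temporary file stands in for /var/software_versions.txt.
versions_file=$(mktemp)
add_version_entry "xxx" "0.1.0" "$versions_file"
add_version_entry "other_tool" "1.2.3" "$versions_file"   # made-up second entry
cat "$versions_file"
```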
21
LICENSE
Normal file
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2024 Data Intuitive

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
72
README.md
Normal file
@@ -0,0 +1,72 @@


# 🌱📦 biobox

[ViashHub](https://web.viash-hub.com/packages/biobox)
[GitHub](https://github.com/viash-hub/biobox)
[GitHub License](https://github.com/viash-hub/biobox/blob/main/LICENSE)
[GitHub Issues](https://github.com/viash-hub/biobox/issues)
[Viash version](https://viash.io)

A collection of bioinformatics tools for working with sequence data.

## Objectives

- **Reusability**: Facilitating the use of components across various
  projects and contexts.
- **Reproducibility**: Ensuring that components are reproducible and can
  be easily shared.
- **Best Practices**: Adhering to established standards in software
  development and bioinformatics.

## Contributing

We encourage contributions from the community. To contribute:

1. **Fork the Repository**: Start by forking this repository to your
   account.
2. **Develop Your Component**: Create your Viash component, ensuring it
   aligns with our best practices (detailed below).
3. **Submit a Pull Request**: After testing your component, submit a
   pull request for review.

## Contribution Guidelines

The contribution guidelines describe which steps you should follow to
contribute a component to this repository.

1. Find a component to contribute
2. Add config template
3. Fill in the metadata
4. Find a suitable container
5. Create help file
6. Create or fetch test data
7. Add arguments for the input files
8. Add arguments for the output files
9. Add arguments for the other arguments
10. Add a Docker engine
11. Write a runner script
12. Create test script
13. Create a `/var/software_versions.txt` file

See the
[CONTRIBUTING](https://github.com/viash-hub/biobox/blob/main/CONTRIBUTING.md)
file for more details.

## Support and Community

For support, questions, or to join our community:

- **Issues**: Submit questions or issues via the [GitHub issue
  tracker](https://github.com/viash-hub/biobox/issues).
- **Discussions**: Join our discussions via [GitHub
  Discussions](https://github.com/viash-hub/biobox/discussions).

## License

This repository is licensed under an MIT license. See the
[LICENSE](https://github.com/viash-hub/biobox/blob/main/LICENSE) file
for details.
62
README.qmd
Normal file
@@ -0,0 +1,62 @@
---
format: gfm
---
```{r setup, include=FALSE}
project <- yaml::read_yaml("_viash.yaml")
license <- paste0(project$links$repository, "/blob/main/LICENSE")
contributing <- paste0(project$links$repository, "/blob/main/CONTRIBUTING.md")
```
# 🌱📦 `r project$name`

[ViashHub](https://web.viash-hub.com/packages/`r project$name`)
[GitHub](`r project$links$repository`)
[GitHub License](`r license`)
[GitHub Issues](`r project$links$issue_tracker`)
[Viash version](https://viash.io)

`r project$description`

## Objectives

- **Reusability**: Facilitating the use of components across various projects and contexts.
- **Reproducibility**: Ensuring that components are reproducible and can be easily shared.
- **Best Practices**: Adhering to established standards in software development and bioinformatics.

## Contributing

We encourage contributions from the community. To contribute:

1. **Fork the Repository**: Start by forking this repository to your account.
2. **Develop Your Component**: Create your Viash component, ensuring it aligns with our best practices (detailed below).
3. **Submit a Pull Request**: After testing your component, submit a pull request for review.

## Contribution Guidelines

The contribution guidelines describe which steps you should follow to contribute a component to this repository.

```{r echo=FALSE}
lines <- readr::read_lines("CONTRIBUTING.md")

index_start <- grep("^### Step [0-9]*:", lines)

index_end <- c(index_start[-1] - 1, length(lines))

name <- gsub("^### Step [0-9]*: *", "", lines[index_start])

knitr::asis_output(
  paste(paste0(" 1. ", name, "\n"), collapse = "")
)
```

See the [CONTRIBUTING](`r contributing`) file for more details.


## Support and Community

For support, questions, or to join our community:

- **Issues**: Submit questions or issues via the [GitHub issue tracker](`r project$links$issue_tracker`).
- **Discussions**: Join our discussions via [GitHub Discussions](`r project$links$repository`/discussions).

## License
This repository is licensed under an MIT license. See the [LICENSE](`r license`) file for details.
13
_viash.yaml
Normal file
@@ -0,0 +1,13 @@
name: biobox
description: |
  A collection of bioinformatics tools for working with sequence data.
license: MIT
keywords: [bioinformatics, modules, sequencing]
links:
  issue_tracker: https://github.com/viash-hub/biobox/issues
  repository: https://github.com/viash-hub/biobox

viash_version: 0.9.0-RC6

config_mods: |
  .requirements.commands := ['ps']
3
main.nf
Normal file
@@ -0,0 +1,3 @@
workflow {
  print("This is a dummy placeholder for pipeline execution. Please use the corresponding nf files for running pipelines.")
}
6
nextflow.config
Normal file
@@ -0,0 +1,6 @@
manifest {
  name = "biobox"
  version = "add-agat-convert-spgff2tsv"
  defaultBranch = "main"
  nextflowVersion = "!>=20.12.1-edge"
}
14
src/_authors/angela_o_pisco.yaml
Normal file
14
src/_authors/angela_o_pisco.yaml
Normal file
@@ -0,0 +1,14 @@
|
||||
name: Angela Oliveira Pisco
|
||||
info:
|
||||
role: Contributor
|
||||
links:
|
||||
github: aopisco
|
||||
orcid: "0000-0003-0142-2355"
|
||||
linkedin: aopisco
|
||||
organizations:
|
||||
- name: Insitro
|
||||
href: https://insitro.com
|
||||
role: Director of Computational Biology
|
||||
- name: Open Problems
|
||||
href: https://openproblems.bio
|
||||
role: Core Member
|
||||
10
src/_authors/dorien_roosen.yaml
Normal file
10
src/_authors/dorien_roosen.yaml
Normal file
@@ -0,0 +1,10 @@
|
||||
name: Dorien Roosen
|
||||
info:
|
||||
links:
|
||||
email: dorien@data-intuitive.com
|
||||
github: dorien-er
|
||||
linkedin: dorien-roosen
|
||||
organizations:
|
||||
- name: Data Intuitive
|
||||
href: https://www.data-intuitive.com
|
||||
role: Data Scientist
|
||||
11
src/_authors/dries_schaumont.yaml
Normal file
11
src/_authors/dries_schaumont.yaml
Normal file
@@ -0,0 +1,11 @@
|
||||
name: Dries Schaumont
|
||||
info:
|
||||
links:
|
||||
email: dries@data-intuitive.com
|
||||
github: DriesSchaumont
|
||||
orcid: "0000-0002-4389-0440"
|
||||
linkedin: dries-schaumont
|
||||
organizations:
|
||||
- name: Data Intuitive
|
||||
href: https://www.data-intuitive.com
|
||||
role: Data Scientist
|
||||
10
src/_authors/emma_rousseau.yaml
Normal file
10
src/_authors/emma_rousseau.yaml
Normal file
@@ -0,0 +1,10 @@
name: Emma Rousseau
info:
  links:
    email: emma@data-intuitive.com
    github: emmarousseau
    linkedin: emmarousseau1
  organizations:
    - name: Data Intuitive
      href: https://www.data-intuitive.com
      role: Bioinformatician
10
src/_authors/jakub_majercik.yaml
Normal file
@@ -0,0 +1,10 @@
name: Jakub Majercik
info:
  links:
    email: jakub@data-intuitive.com
    github: jakubmajercik
    linkedin: jakubmajercik
  organizations:
    - name: Data Intuitive
      href: https://www.data-intuitive.com
      role: Bioinformatics Engineer
14
src/_authors/kai_waldrant.yaml
Normal file
@@ -0,0 +1,14 @@
name: Kai Waldrant
info:
  links:
    email: kai@data-intuitive.com
    github: KaiWaldrant
    orcid: "0009-0003-8555-1361"
    linkedin: kaiwaldrant
  organizations:
    - name: Data Intuitive
      href: https://www.data-intuitive.com
      role: Bioinformatician
    - name: Open Problems
      href: https://openproblems.bio
      role: Contributor
10
src/_authors/leila_paquay.yaml
Normal file
@@ -0,0 +1,10 @@
name: Leïla Paquay
info:
  links:
    email: leila@data-intuitive.com
    github: Leila011
    linkedin: leilapaquay
  organizations:
    - name: Data Intuitive
      href: https://www.data-intuitive.com
      role: Software Developer
14
src/_authors/robrecht_cannoodt.yaml
Normal file
@@ -0,0 +1,14 @@
name: Robrecht Cannoodt
info:
  links:
    email: robrecht@data-intuitive.com
    github: rcannood
    orcid: "0000-0003-3641-729X"
    linkedin: robrechtcannoodt
  organizations:
    - name: Data Intuitive
      href: https://www.data-intuitive.com
      role: Data Science Engineer
    - name: Open Problems
      href: https://openproblems.bio
      role: Core Member
10
src/_authors/sai_nirmayi_yasa.yaml
Normal file
@@ -0,0 +1,10 @@
name: Sai Nirmayi Yasa
info:
  links:
    email: nirmayi@data-intuitive.com
    github: sainirmayi
    linkedin: sai-nirmayi-yasa
  organizations:
    - name: Data Intuitive
      href: https://www.data-intuitive.com
      role: Junior Bioinformatics Researcher
10
src/_authors/theodoro_gasperin.yaml
Normal file
@@ -0,0 +1,10 @@
name: Theodoro Gasperin Terra Camargo
info:
  links:
    email: theodorogtc@gmail.com
    github: tgaspe
    linkedin: theodoro-gasperin-terra-camargo
  organizations:
    - name: Data Intuitive
      href: https://www.data-intuitive.com
      role: Bioinformatician
9
src/_authors/toni_verbeiren.yaml
Normal file
@@ -0,0 +1,9 @@
name: Toni Verbeiren
info:
  links:
    github: tverbeiren
    linkedin: verbeiren
  organizations:
    - name: Data Intuitive
      href: https://www.data-intuitive.com
      role: Data Scientist and CEO
5
src/_authors/weiwei_schultz.yaml
Normal file
@@ -0,0 +1,5 @@
name: Weiwei Schultz
info:
  organizations:
    - name: Janssen R&D US
      role: Associate Director Data Sciences
94
src/agat/agat_convert_sp_gff2gtf/config.vsh.yaml
Normal file
@@ -0,0 +1,94 @@
name: agat_convert_sp_gff2gtf
namespace: agat
description: |
  The script aims to convert any GTF/GFF file into a proper GTF file. Full
  information about the format can be found here:
  https://agat.readthedocs.io/en/latest/gxf.html You can choose among 7
  different GTF types (1, 2, 2.1, 2.2, 2.5, 3 or relax). Depending on the
  version selected, the script will filter out the features that are not
  accepted. For GTF2.5 and 3, every level1 feature (e.g. nc_gene,
  pseudogene) will be converted into a gene feature and every level2 feature
  (e.g. mRNA, ncRNA) will be converted into a transcript feature. Using the
  "relax" option you will produce a GTF-like output keeping all original
  feature types (3rd column). No modification will occur, e.g. mRNA to
  transcript.

  To be fully GTF compliant, all features must have a gene_id and a transcript_id
  attribute. The gene_id is a unique identifier for the genomic source of
  the transcript, which is used to group transcripts into genes. The
  transcript_id is a unique identifier for the predicted transcript, which
  is used to group features into transcripts.
keywords: [gene annotations, GTF conversion]
links:
  homepage: https://github.com/NBISweden/AGAT
  documentation: https://agat.readthedocs.io/
  issue_tracker: https://github.com/NBISweden/AGAT/issues
  repository: https://github.com/NBISweden/AGAT
references:
  doi: 10.5281/zenodo.3552717
license: GPL-3.0
authors:
  - __merge__: /src/_authors/leila_paquay.yaml
    roles: [ author, maintainer ]

argument_groups:
  - name: Inputs
    arguments:
      - name: --gff
        alternatives: [-i]
        description: Input GFF/GTF file that will be read
        type: file
        required: true
        direction: input
        example: input.gff
  - name: Outputs
    arguments:
      - name: --output
        alternatives: [-o, --out, --outfile, --gtf]
        description: Output GTF file. If no output file is specified, the output will be written to STDOUT.
        type: file
        direction: output
        required: true
        example: output.gtf
  - name: Arguments
    arguments:
      - name: --gtf_version
        description: |
          Version of the GTF output (1,2,2.1,2.2,2.5,3 or relax). Default value from AGAT config file (relax for the default config). The command-line option takes priority over the config file.

          * relax: all feature types are accepted.
          * GTF3 (9 feature types accepted): gene, transcript, exon, CDS, Selenocysteine, start_codon, stop_codon, three_prime_utr and five_prime_utr.
          * GTF2.5 (8 feature types accepted): gene, transcript, exon, CDS, UTR, start_codon, stop_codon, Selenocysteine.
          * GTF2.2 (9 feature types accepted): CDS, start_codon, stop_codon, 5UTR, 3UTR, inter, inter_CNS, intron_CNS and exon.
          * GTF2.1 (6 feature types accepted): CDS, start_codon, stop_codon, exon, 5UTR, 3UTR.
          * GTF2 (4 feature types accepted): CDS, start_codon, stop_codon, exon.
          * GTF1 (5 feature types accepted): CDS, start_codon, stop_codon, exon, intron.
        type: string
        choices: [relax, "1", "2", "2.1", "2.2", "2.5", "3"]
        required: false
        example: "3"
      - name: --config
        alternatives: [-c]
        description: |
          Input AGAT config file. By default, AGAT uses the agat_config.yaml file in the working directory, if any; otherwise it uses the original agat_config.yaml shipped with AGAT. To get a local copy of agat_config.yaml, run: "agat config --expose". The --config option lets you use your own AGAT config file (located elsewhere or named differently).
        type: file
        required: false
        example: custom_agat_config.yaml
resources:
  - type: bash_script
    path: script.sh
test_resources:
  - type: bash_script
    path: test.sh
  - type: file
    path: test_data
engines:
  - type: docker
    image: quay.io/biocontainers/agat:1.4.0--pl5321hdfd78af_0
    setup:
      - type: docker
        run: |
          agat --version | sed 's/AGAT\s\(.*\)/agat: "\1"/' > /var/software_versions.txt
runners:
  - type: executable
  - type: nextflow
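The Docker setup step in this config records the AGAT version by rewriting the output of `agat --version` with `sed`. Below is a minimal sketch of that rewrite applied to a sample string. It assumes the tool prints a line of the form `AGAT v1.4.0` (the exact output format is an assumption, not captured from a real run) and that GNU sed is available, since `\s` is a GNU extension. `format_version` is a hypothetical helper name used only for illustration.

```sh
#!/bin/sh
# Hypothetical helper: apply the same sed expression used in the setup step.
format_version() {
  sed 's/AGAT\s\(.*\)/agat: "\1"/'
}

# Assumed sample of what `agat --version` prints; not taken from a real run.
echo "AGAT v1.4.0" | format_version
```

The result is a single YAML-style `key: "value"` line, which is the shape written to `/var/software_versions.txt` in the config.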
102
src/agat/agat_convert_sp_gff2gtf/help.txt
Normal file
@@ -0,0 +1,102 @@
|
||||
```sh
|
||||
agat_convert_sp_gff2gtf.pl --help
|
||||
```
|
||||
------------------------------------------------------------------------------
|
||||
| Another GFF Analysis Toolkit (AGAT) - Version: v1.4.0 |
|
||||
| https://github.com/NBISweden/AGAT |
|
||||
| National Bioinformatics Infrastructure Sweden (NBIS) - www.nbis.se |
|
||||
------------------------------------------------------------------------------
|
||||
|
||||
|
||||
Name:
|
||||
agat_convert_sp_gff2gtf.pl
|
||||
|
||||
Description:
|
||||
The script aims to convert any GTF/GFF file into a proper GTF file. Full
|
||||
information about the format can be found here:
|
||||
https://agat.readthedocs.io/en/latest/gxf.html You can choose among 7
|
||||
different GTF types (1, 2, 2.1, 2.2, 2.5, 3 or relax). Depending the
|
||||
version selected the script will filter out the features that are not
|
||||
accepted. For GTF2.5 and 3, every level1 feature (e.g nc_gene
|
||||
pseudogene) will be converted into gene feature and every level2 feature
|
||||
(e.g mRNA ncRNA) will be converted into transcript feature. Using the
|
||||
"relax" option you will produce a GTF-like output keeping all original
|
||||
feature types (3rd column). No modification will occur e.g. mRNA to
|
||||
transcript.
|
||||
|
||||
To be fully GTF compliant all feature have a gene_id and a transcript_id
|
||||
attribute. The gene_id is unique identifier for the genomic source of
|
||||
the transcript, which is used to group transcripts into genes. The
|
||||
transcript_id is a unique identifier for the predicted transcript, which
|
||||
is used to group features into transcripts.
|
||||
|
||||
Usage:
|
||||
agat_convert_sp_gff2gtf.pl --gff infile.gff [ -o outfile ]
|
||||
agat_convert_sp_gff2gtf -h
|
||||
|
||||
Options:
|
||||
--gff, --gtf or -i
|
||||
Input GFF/GTF file that will be read
|
||||
|
||||
--gtf_version version of the GTF output (1,2,2.1,2.2,2.5,3 or relax).
|
||||
Default value from AGAT config file (relax for the default config). The
|
||||
script option has the higher priority.
|
||||
relax: all feature types are accepted.
|
||||
|
||||
GTF3 (9 feature types accepted): gene, transcript, exon, CDS,
|
||||
Selenocysteine, start_codon, stop_codon, three_prime_utr and
|
||||
five_prime_utr
|
||||
|
||||
GTF2.5 (8 feature types accepted): gene, transcript, exon, CDS,
|
||||
UTR, start_codon, stop_codon, Selenocysteine
|
||||
|
||||
GTF2.2 (9 feature types accepted): CDS, start_codon, stop_codon,
|
||||
5UTR, 3UTR, inter, inter_CNS, intron_CNS and exon
|
||||
|
||||
GTF2.1 (6 feature types accepted): CDS, start_codon, stop_codon,
|
||||
exon, 5UTR, 3UTR
|
||||
|
||||
GTF2 (4 feature types accepted): CDS, start_codon, stop_codon,
|
||||
exon
|
||||
|
||||
GTF1 (5 feature types accepted): CDS, start_codon, stop_codon,
|
||||
exon, intron
|
||||
|
||||
-o , --output , --out , --outfile or --gtf
|
||||
Output GTF file. If no output file is specified, the output will
|
||||
be written to STDOUT.
|
||||
|
||||
-c or --config
|
||||
String - Input agat config file. By default AGAT takes as input
|
||||
agat_config.yaml file from the working directory if any,
|
||||
otherwise it takes the orignal agat_config.yaml shipped with
|
||||
AGAT. To get the agat_config.yaml locally type: "agat config
|
||||
--expose". The --config option gives you the possibility to use
|
||||
your own AGAT config file (located elsewhere or named
|
||||
differently).
|
||||
|
||||
-h or --help
|
||||
Display this helpful text.
|
||||
|
||||
Feedback:
|
||||
Did you find a bug?:
|
||||
Do not hesitate to report bugs to help us keep track of the bugs and
|
||||
their resolution. Please use the GitHub issue tracking system available
|
||||
at this address:
|
||||
|
||||
https://github.com/NBISweden/AGAT/issues
|
||||
|
||||
Ensure that the bug was not already reported by searching under Issues.
|
||||
If you're unable to find an (open) issue addressing the problem, open a new one.
|
||||
Try as much as possible to include in the issue when relevant:
|
||||
- a clear description,
|
||||
- as much relevant information as possible,
|
||||
- the command used,
|
||||
- a data sample,
|
||||
- an explanation of the expected behaviour that is not occurring.
|
||||
|
||||
Do you want to contribute?:
|
||||
You are very welcome, visit this address for the Contributing
|
||||
guidelines:
|
||||
https://github.com/NBISweden/AGAT/blob/master/CONTRIBUTING.md
|
||||
|
||||
10
src/agat/agat_convert_sp_gff2gtf/script.sh
Normal file
@@ -0,0 +1,10 @@
#!/bin/bash

## VIASH START
## VIASH END

agat_convert_sp_gff2gtf.pl \
  -i "$par_gff" \
  -o "$par_output" \
  ${par_gtf_version:+--gtf_version "${par_gtf_version}"} \
  ${par_config:+--config "${par_config}"}
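The wrapper script for `agat_convert_sp_gff2gtf` relies on the `${var:+word}` expansion so that optional flags are passed only when the corresponding `par_*` variable is set and non-empty. The following sketch shows the effect in isolation. `show_args` and the file names `in.gff` and `out.gtf` are hypothetical and exist only for this illustration; no AGAT command is run.

```sh
#!/bin/sh
# Hypothetical helper: print the arguments the wrapper would pass, one per line.
show_args() {
  par_gtf_version="$1"
  par_config="$2"
  # Each ${var:+...} expands to nothing when the variable is empty or unset.
  set -- -i "in.gff" -o "out.gtf" \
    ${par_gtf_version:+--gtf_version "${par_gtf_version}"} \
    ${par_config:+--config "${par_config}"}
  printf '%s\n' "$@"
}

show_args "" ""       # only -i and -o are emitted
show_args "2.5" ""    # --gtf_version 2.5 is appended
```

Because the inner `"${var}"` is quoted, a value containing spaces still reaches the tool as a single argument.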
37
src/agat/agat_convert_sp_gff2gtf/test.sh
Normal file
@@ -0,0 +1,37 @@
#!/bin/bash

## VIASH START
## VIASH END

test_dir="${meta_resources_dir}/test_data"

echo "> Run $meta_name with test data"
"$meta_executable" \
  --gff "$test_dir/0_test.gff" \
  --output "output.gtf"

echo ">> Checking output"
[ ! -f "output.gtf" ] && echo "Output file output.gtf does not exist" && exit 1

echo ">> Check if output is empty"
[ ! -s "output.gtf" ] && echo "Output file output.gtf is empty" && exit 1

echo ">> Check if the conversion resulted in the right GTF format"
idGFF=$(head -n 2 "$test_dir/0_test.gff" | grep -o 'ID=[^;]*' | cut -d '=' -f 2-)
expectedGTF="gene_id \"$idGFF\"; ID \"$idGFF\";"
extractedGTF=$(head -n 3 "output.gtf" | grep -o 'gene_id "[^"]*"; ID "[^"]*";')
[ "$extractedGTF" != "$expectedGTF" ] && echo "Output file output.gtf does not have the right format" && exit 1

rm output.gtf

echo "> Run $meta_name with test data and GTF version 2.5"
"$meta_executable" \
  --gff "$test_dir/0_test.gff" \
  --output "output.gtf" \
  --gtf_version "2.5"

echo ">> Check if the output file header displays the right GTF version"
grep -q "##gtf-version 2.5" "output.gtf"
[ $? -ne 0 ] && echo "Output file output.gtf header does not display the right GTF version" && exit 1

echo "> Test successful"
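The format check in this test derives the expected GTF attributes from the first `ID=` attribute of the input GFF. As a small illustration of that extraction step, the sketch below runs the same `head | grep | cut` pipeline on a two-line sample that mirrors the start of `0_test.gff`. `first_gff_id` is a hypothetical helper name, and the sample is built inline with tab separators rather than read from the test data.

```sh
#!/bin/sh
# Hypothetical helper: pull the first ID attribute out of GFF3 text on stdin.
first_gff_id() {
  head -n 2 | grep -o 'ID=[^;]*' | cut -d '=' -f 2-
}

# Two-line sample mirroring the start of 0_test.gff (header plus gene feature).
sample=$(printf '##gff-version 3\nscaffold625\tmaker\tgene\t337818\t343277\t.\t+\t.\tID=CLUHARG00000005458;Name=TUBB3_2\n')

id=$(printf '%s\n' "$sample" | first_gff_id)
# The attribute pair the test expects to find in the converted GTF.
echo "gene_id \"$id\"; ID \"$id\";"
```

For this sample the extracted identifier is `CLUHARG00000005458`, so the test looks for `gene_id "CLUHARG00000005458"; ID "CLUHARG00000005458";` near the top of the output.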
36
src/agat/agat_convert_sp_gff2gtf/test_data/0_test.gff
Normal file
@@ -0,0 +1,36 @@
|
||||
##gff-version 3
|
||||
scaffold625 maker gene 337818 343277 . + . ID=CLUHARG00000005458;Name=TUBB3_2
|
||||
scaffold625 maker mRNA 337818 343277 . + . ID=CLUHART00000008717;Parent=CLUHARG00000005458
|
||||
scaffold625 maker exon 337818 337971 . + . ID=CLUHART00000008717:exon:1404;Parent=CLUHART00000008717
|
||||
scaffold625 maker exon 340733 340841 . + . ID=CLUHART00000008717:exon:1405;Parent=CLUHART00000008717
|
||||
scaffold625 maker exon 341518 341628 . + . ID=CLUHART00000008717:exon:1406;Parent=CLUHART00000008717
|
||||
scaffold625 maker exon 341964 343277 . + . ID=CLUHART00000008717:exon:1407;Parent=CLUHART00000008717
|
||||
scaffold625 maker CDS 337915 337971 . + 0 ID=CLUHART00000008717:cds;Parent=CLUHART00000008717
|
||||
scaffold625 maker CDS 340733 340841 . + 0 ID=CLUHART00000008717:cds;Parent=CLUHART00000008717
|
||||
scaffold625 maker CDS 341518 341628 . + 2 ID=CLUHART00000008717:cds;Parent=CLUHART00000008717
|
||||
scaffold625 maker CDS 341964 343033 . + 2 ID=CLUHART00000008717:cds;Parent=CLUHART00000008717
|
||||
scaffold625 maker five_prime_UTR 337818 337914 . + . ID=CLUHART00000008717:five_prime_utr;Parent=CLUHART00000008717
|
||||
scaffold625 maker three_prime_UTR 343034 343277 . + . ID=CLUHART00000008717:three_prime_utr;Parent=CLUHART00000008717
|
||||
scaffold789 maker gene 558184 564780 . + . ID=CLUHARG00000003852;Name=PF11_0240
|
||||
scaffold789 maker mRNA 558184 564780 . + . ID=CLUHART00000006146;Parent=CLUHARG00000003852
|
||||
scaffold789 maker exon 558184 560123 . + . ID=CLUHART00000006146:exon:995;Parent=CLUHART00000006146
|
||||
scaffold789 maker exon 561401 561519 . + . ID=CLUHART00000006146:exon:996;Parent=CLUHART00000006146
|
||||
scaffold789 maker exon 564171 564235 . + . ID=CLUHART00000006146:exon:997;Parent=CLUHART00000006146
|
||||
scaffold789 maker exon 564372 564780 . + . ID=CLUHART00000006146:exon:998;Parent=CLUHART00000006146
|
||||
scaffold789 maker CDS 558191 560123 . + 0 ID=CLUHART00000006146:cds;Parent=CLUHART00000006146
|
||||
scaffold789 maker CDS 561401 561519 . + 2 ID=CLUHART00000006146:cds;Parent=CLUHART00000006146
|
||||
scaffold789 maker CDS 564171 564235 . + 0 ID=CLUHART00000006146:cds;Parent=CLUHART00000006146
|
||||
scaffold789 maker CDS 564372 564588 . + 1 ID=CLUHART00000006146:cds;Parent=CLUHART00000006146
|
||||
scaffold789 maker five_prime_UTR 558184 558190 . + . ID=CLUHART00000006146:five_prime_utr;Parent=CLUHART00000006146
|
||||
scaffold789 maker three_prime_UTR 564589 564780 . + . ID=CLUHART00000006146:three_prime_utr;Parent=CLUHART00000006146
|
||||
scaffold789 maker mRNA 558184 564780 . + . ID=CLUHART00000006147;Parent=CLUHARG00000003852
|
||||
scaffold789 maker exon 558184 560123 . + . ID=CLUHART00000006147:exon:997;Parent=CLUHART00000006147
|
||||
scaffold789 maker exon 561401 561519 . + . ID=CLUHART00000006147:exon:998;Parent=CLUHART00000006147
|
||||
scaffold789 maker exon 562057 562121 . + . ID=CLUHART00000006147:exon:999;Parent=CLUHART00000006147
|
||||
scaffold789 maker exon 564372 564780 . + . ID=CLUHART00000006147:exon:1000;Parent=CLUHART00000006147
|
||||
scaffold789 maker CDS 558191 560123 . + 0 ID=CLUHART00000006147:cds;Parent=CLUHART00000006147
|
||||
scaffold789 maker CDS 561401 561519 . + 2 ID=CLUHART00000006147:cds;Parent=CLUHART00000006147
|
||||
scaffold789 maker CDS 562057 562121 . + 0 ID=CLUHART00000006147:cds;Parent=CLUHART00000006147
|
||||
scaffold789 maker CDS 564372 564588 . + 1 ID=CLUHART00000006147:cds;Parent=CLUHART00000006147
|
||||
scaffold789 maker five_prime_UTR 558184 558190 . + . ID=CLUHART00000006147:five_prime_utr;Parent=CLUHART00000006147
|
||||
scaffold789 maker three_prime_UTR 564589 564780 . + . ID=CLUHART00000006147:three_prime_utr;Parent=CLUHART00000006147
|
||||
9
src/agat/agat_convert_sp_gff2gtf/test_data/script.sh
Executable file
@@ -0,0 +1,9 @@
#!/bin/bash

# clone repo
if [ ! -d /tmp/agat_source ]; then
  git clone --depth 1 --single-branch --branch master https://github.com/NBISweden/AGAT /tmp/agat_source
fi

# copy test data
cp -r /tmp/agat_source/t/gff_syntax/in/0_test.gff src/agat/agat_convert_sp_gff2gtf/test_data
70
src/agat/agat_convert_sp_gff2tsv/config.vsh.yaml
Normal file
@@ -0,0 +1,70 @@
name: agat_convert_sp_gff2tsv
namespace: agat
description: |
  The script converts a GTF/GFF file into a tabulated (tab-separated) file.
  Attribute tags from the 9th column become column titles.
keywords: [gene annotations, GFF conversion]
links:
  homepage: https://github.com/NBISweden/AGAT
  documentation: https://agat.readthedocs.io/en/latest/tools/agat_convert_sp_gff2tsv.html
  issue_tracker: https://github.com/NBISweden/AGAT/issues
  repository: https://github.com/NBISweden/AGAT
references:
  doi: 10.5281/zenodo.3552717
license: GPL-3.0
authors:
  - __merge__: /src/_authors/leila_paquay.yaml
    roles: [ author, maintainer ]

argument_groups:
  - name: Inputs
    arguments:
      - name: --gff
        alternatives: [-f]
        description: Input GTF/GFF file.
        type: file
        required: true
        direction: input
        example: input.gff
  - name: Outputs
    arguments:
      - name: --output
        alternatives: [-o, --out, --outfile]
        description: Output TSV file. If no output file is specified, the output will be written to STDOUT.
        type: file
        direction: output
        required: true
        example: output.tsv
  - name: Arguments
    arguments:
      - name: --config
        alternatives: [-c]
        description: |
          Input AGAT config file. By default, AGAT uses the
          agat_config.yaml file in the working directory, if any;
          otherwise it uses the original agat_config.yaml shipped with
          AGAT. To get a local copy of agat_config.yaml, run: "agat config
          --expose". The --config option lets you use
          your own AGAT config file (located elsewhere or named
          differently).
        type: file
        required: false
        example: custom_agat_config.yaml
resources:
  - type: bash_script
    path: script.sh
test_resources:
  - type: bash_script
    path: test.sh
  - type: file
    path: test_data
engines:
  - type: docker
    image: quay.io/biocontainers/agat:1.4.0--pl5321hdfd78af_0
    setup:
      - type: docker
        run: |
          agat --version | sed 's/AGAT\s\(.*\)/agat: "\1"/' > /var/software_versions.txt
runners:
  - type: executable
  - type: nextflow
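The component description says that attribute tags from the 9th column become column titles. As a rough illustration of that idea only, the sketch below lists the distinct tags found in column 9 of a small tab-separated GFF3 sample. This is not AGAT's implementation, and the column order or naming in real AGAT output may differ. `list_attribute_tags` is a hypothetical helper, and the two sample features are abridged from `1.gff`.

```sh
#!/bin/sh
# Hypothetical helper: list the distinct attribute tags found in column 9
# of tab-separated GFF3 input, in order of first appearance.
list_attribute_tags() {
  awk -F '\t' '
    /^#/ { next }                      # skip directives and comments
    NF >= 9 {
      n = split($9, pairs, ";")
      for (i = 1; i <= n; i++) {
        split(pairs[i], kv, "=")
        if (kv[1] != "" && !(kv[1] in seen)) {
          seen[kv[1]] = 1
          print kv[1]
        }
      }
    }'
}

# Abridged sample: one gene and one mRNA feature with different attribute sets.
printf '1\tirgsp\tgene\t2983\t10815\t.\t+\t.\tID=gene:Os01g0100100;biotype=protein_coding\n1\tirgsp\tmRNA\t2983\t10815\t.\t+\t.\tID=transcript:Os01t0100100-01;Parent=gene:Os01g0100100\n' | list_attribute_tags
```

For this sample the tags are `ID`, `biotype` and `Parent`. A feature that lacks one of these tags would need an empty cell in the corresponding column of a tabulated output.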
63
src/agat/agat_convert_sp_gff2tsv/help.txt
Normal file
@@ -0,0 +1,63 @@
|
||||
```sh
|
||||
agat_convert_sp_gff2tsv.pl --help
|
||||
```
|
||||
|
||||
------------------------------------------------------------------------------
|
||||
| Another GFF Analysis Toolkit (AGAT) - Version: v1.4.0 |
|
||||
| https://github.com/NBISweden/AGAT |
|
||||
| National Bioinformatics Infrastructure Sweden (NBIS) - www.nbis.se |
|
||||
------------------------------------------------------------------------------
|
||||
|
||||
|
||||
Name:
|
||||
agat_convert_sp_gff2tsv.pl
|
||||
|
||||
Description:
|
||||
The script aims to convert gtf/gff file into tabulated file. Attribute's
|
||||
tags from the 9th column become column titles.
|
||||
|
||||
Usage:
|
||||
agat_convert_sp_gff2tsv.pl -gff file.gff [ -o outfile ]
|
||||
agat_convert_sp_gff2tsv.pl --help
|
||||
|
||||
Options:
|
||||
--gff or -f
|
||||
Input GTF/GFF file.
|
||||
|
||||
-o , --output , --out or --outfile
|
||||
Output GFF file. If no output file is specified, the output will
|
||||
be written to STDOUT.
|
||||
|
||||
-c or --config
|
||||
String - Input agat config file. By default AGAT takes as input
|
||||
agat_config.yaml file from the working directory if any,
|
||||
otherwise it takes the orignal agat_config.yaml shipped with
|
||||
AGAT. To get the agat_config.yaml locally type: "agat config
|
||||
--expose". The --config option gives you the possibility to use
|
||||
your own AGAT config file (located elsewhere or named
|
||||
differently).
|
||||
|
||||
-h or --help
|
||||
Display this helpful text.
|
||||
|
||||
Feedback:
|
||||
Did you find a bug?:
|
||||
Do not hesitate to report bugs to help us keep track of the bugs and
|
||||
their resolution. Please use the GitHub issue tracking system available
|
||||
at this address:
|
||||
|
||||
https://github.com/NBISweden/AGAT/issues
|
||||
|
||||
Ensure that the bug was not already reported by searching under Issues.
|
||||
If you're unable to find an (open) issue addressing the problem, open a new one.
|
||||
Try as much as possible to include in the issue when relevant:
|
||||
- a clear description,
|
||||
- as much relevant information as possible,
|
||||
- the command used,
|
||||
- a data sample,
|
||||
- an explanation of the expected behaviour that is not occurring.
|
||||
|
||||
Do you want to contribute?:
|
||||
You are very welcome, visit this address for the Contributing
|
||||
guidelines:
|
||||
https://github.com/NBISweden/AGAT/blob/master/CONTRIBUTING.md
|
||||
9
src/agat/agat_convert_sp_gff2tsv/script.sh
Normal file
@@ -0,0 +1,9 @@
#!/bin/bash

## VIASH START
## VIASH END

agat_convert_sp_gff2tsv.pl \
  -f "$par_gff" \
  -o "$par_output" \
  ${par_config:+--config "${par_config}"}
27
src/agat/agat_convert_sp_gff2tsv/test.sh
Normal file
@@ -0,0 +1,27 @@
#!/bin/bash

## VIASH START
## VIASH END

test_dir="${meta_resources_dir}/test_data"
out_dir="${meta_resources_dir}/out_data"

echo "> Run $meta_name with test data"
"$meta_executable" \
  --gff "$test_dir/1.gff" \
  --output "$out_dir/output.gff"

echo ">> Checking output"
[ ! -f "$out_dir/output.gff" ] && echo "Output file output.gff does not exist" && exit 1

echo ">> Check if output is empty"
[ ! -s "$out_dir/output.gff" ] && echo "Output file output.gff is empty" && exit 1

echo ">> Check if output matches expected output"
diff "$out_dir/output.gff" "$test_dir/agat_convert_sp_gff2tsv_1.tsv"
if [ $? -ne 0 ]; then
  echo "Output file output.gff does not match expected output"
  exit 1
fi

echo "> Test successful"
942
src/agat/agat_convert_sp_gff2tsv/test_data/1.gff
Normal file
@@ -0,0 +1,942 @@
|
||||
##gff-version 3
|
||||
##sequence-region 1 1 43270923
|
||||
#!genome-build RAP-DB IRGSP-1.0
|
||||
#!genome-version IRGSP-1.0
|
||||
#!genome-date 2015-10
|
||||
#!genome-build-accession GCA_001433935.1
|
||||
1 RAP-DB chromosome 1 43270923 . . . ID=chromosome:1;Alias=Chr1,AP014957.1,NC_029256.1
|
||||
###
|
||||
1 irgsp repeat_region 2000 2100 . + . ID=fakeRepeat1
|
||||
###
|
||||
1 irgsp gene 2983 10815 . + . ID=gene:Os01g0100100;biotype=protein_coding;description=RabGAP/TBC domain containing protein. (Os01t0100100-01);gene_id=Os01g0100100;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 2983 10815 . + . ID=transcript:Os01t0100100-01;Parent=gene:Os01g0100100;biotype=protein_coding;transcript_id=Os01t0100100-01
|
||||
1 irgsp exon 2983 3268 . + . Parent=transcript:Os01t0100100-01;Name=Os01t0100100-01.exon1;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0100100-01.exon1;rank=1
|
||||
1 irgsp five_prime_UTR 2983 3268 . + . Parent=transcript:Os01t0100100-01
|
||||
1 irgsp five_prime_UTR 3354 3448 . + . Parent=transcript:Os01t0100100-01
|
||||
1 irgsp exon 3354 3616 . + . Parent=transcript:Os01t0100100-01;Name=Os01t0100100-01.exon2;constitutive=1;ensembl_end_phase=0;ensembl_phase=-1;exon_id=Os01t0100100-01.exon2;rank=2
|
||||
1 irgsp CDS 3449 3616 . + 0 ID=CDS:Os01t0100100-01;Parent=transcript:Os01t0100100-01;protein_id=Os01t0100100-01
|
||||
1 irgsp exon 4357 4455 . + . Parent=transcript:Os01t0100100-01;Name=Os01t0100100-01.exon3;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0100100-01.exon3;rank=3
|
||||
1 irgsp CDS 4357 4455 . + 0 ID=CDS:Os01t0100100-01;Parent=transcript:Os01t0100100-01;protein_id=Os01t0100100-01
|
||||
1 irgsp exon 5457 5560 . + . Parent=transcript:Os01t0100100-01;Name=Os01t0100100-01.exon4;constitutive=1;ensembl_end_phase=2;ensembl_phase=0;exon_id=Os01t0100100-01.exon4;rank=4
|
||||
1 irgsp CDS 5457 5560 . + 0 ID=CDS:Os01t0100100-01;Parent=transcript:Os01t0100100-01;protein_id=Os01t0100100-01
|
||||
1 irgsp exon 7136 7944 . + . Parent=transcript:Os01t0100100-01;Name=Os01t0100100-01.exon5;constitutive=1;ensembl_end_phase=1;ensembl_phase=2;exon_id=Os01t0100100-01.exon5;rank=5
|
||||
1 irgsp CDS 7136 7944 . + 1 ID=CDS:Os01t0100100-01;Parent=transcript:Os01t0100100-01;protein_id=Os01t0100100-01
|
||||
1 irgsp exon 8028 8150 . + . Parent=transcript:Os01t0100100-01;Name=Os01t0100100-01.exon6;constitutive=1;ensembl_end_phase=1;ensembl_phase=1;exon_id=Os01t0100100-01.exon6;rank=6
|
||||
1 irgsp CDS 8028 8150 . + 2 ID=CDS:Os01t0100100-01;Parent=transcript:Os01t0100100-01;protein_id=Os01t0100100-01
|
||||
1 irgsp exon 8232 8320 . + . Parent=transcript:Os01t0100100-01;Name=Os01t0100100-01.exon7;constitutive=1;ensembl_end_phase=0;ensembl_phase=1;exon_id=Os01t0100100-01.exon7;rank=7
|
||||
1 irgsp CDS 8232 8320 . + 2 ID=CDS:Os01t0100100-01;Parent=transcript:Os01t0100100-01;protein_id=Os01t0100100-01
|
||||
1 irgsp exon 8408 8608 . + . Parent=transcript:Os01t0100100-01;Name=Os01t0100100-01.exon8;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0100100-01.exon8;rank=8
|
||||
1 irgsp CDS 8408 8608 . + 0 ID=CDS:Os01t0100100-01;Parent=transcript:Os01t0100100-01;protein_id=Os01t0100100-01
|
||||
1 irgsp exon 9210 9615 . + . Parent=transcript:Os01t0100100-01;Name=Os01t0100100-01.exon9;constitutive=1;ensembl_end_phase=1;ensembl_phase=0;exon_id=Os01t0100100-01.exon9;rank=9
|
||||
1 irgsp CDS 9210 9615 . + 0 ID=CDS:Os01t0100100-01;Parent=transcript:Os01t0100100-01;protein_id=Os01t0100100-01
|
||||
1 irgsp exon 10102 10187 . + . Parent=transcript:Os01t0100100-01;Name=Os01t0100100-01.exon10;constitutive=1;ensembl_end_phase=0;ensembl_phase=1;exon_id=Os01t0100100-01.exon10;rank=10
|
||||
1 irgsp CDS 10102 10187 . + 2 ID=CDS:Os01t0100100-01;Parent=transcript:Os01t0100100-01;protein_id=Os01t0100100-01
|
||||
1 irgsp CDS 10274 10297 . + 0 ID=CDS:Os01t0100100-01;Parent=transcript:Os01t0100100-01;protein_id=Os01t0100100-01
|
||||
1 irgsp exon 10274 10430 . + . Parent=transcript:Os01t0100100-01;Name=Os01t0100100-01.exon11;constitutive=1;ensembl_end_phase=-1;ensembl_phase=0;exon_id=Os01t0100100-01.exon11;rank=11
|
||||
1 irgsp three_prime_UTR 10298 10430 . + . Parent=transcript:Os01t0100100-01
|
||||
1 irgsp exon 10504 10815 . + . Parent=transcript:Os01t0100100-01;Name=Os01t0100100-01.exon12;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0100100-01.exon12;rank=12
|
||||
1 irgsp three_prime_UTR 10504 10815 . + . Parent=transcript:Os01t0100100-01
|
||||
###
|
||||
1 irgsp gene 11218 12435 . + . ID=gene:Os01g0100200;biotype=protein_coding;description=Conserved hypothetical protein. (Os01t0100200-01);gene_id=Os01g0100200;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 11218 12435 . + . ID=transcript:Os01t0100200-01;Parent=gene:Os01g0100200;biotype=protein_coding;transcript_id=Os01t0100200-01
|
||||
1 irgsp five_prime_UTR 11218 11797 . + . Parent=transcript:Os01t0100200-01
|
||||
1 irgsp exon 11218 12060 . + . Parent=transcript:Os01t0100200-01;Name=Os01t0100200-01.exon1;constitutive=1;ensembl_end_phase=2;ensembl_phase=-1;exon_id=Os01t0100200-01.exon1;rank=1
|
||||
1 irgsp CDS 11798 12060 . + 0 ID=CDS:Os01t0100200-01;Parent=transcript:Os01t0100200-01;protein_id=Os01t0100200-01
|
||||
1 irgsp CDS 12152 12317 . + 1 ID=CDS:Os01t0100200-01;Parent=transcript:Os01t0100200-01;protein_id=Os01t0100200-01
|
||||
1 irgsp exon 12152 12435 . + . Parent=transcript:Os01t0100200-01;Name=Os01t0100200-01.exon2;constitutive=1;ensembl_end_phase=-1;ensembl_phase=2;exon_id=Os01t0100200-01.exon2;rank=2
|
||||
1 irgsp three_prime_UTR 12318 12435 . + . Parent=transcript:Os01t0100200-01
|
||||
###
|
||||
1 irgsp gene 11372 12284 . - . ID=gene:Os01g0100300;biotype=protein_coding;description=Cytochrome P450 domain containing protein. (Os01t0100300-00);gene_id=Os01g0100300;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 11372 12284 . - . ID=transcript:Os01t0100300-00;Parent=gene:Os01g0100300;biotype=protein_coding;transcript_id=Os01t0100300-00
|
||||
1 irgsp exon 11372 12042 . - . Parent=transcript:Os01t0100300-00;Name=Os01t0100300-00.exon2;constitutive=1;ensembl_end_phase=0;ensembl_phase=1;exon_id=Os01t0100300-00.exon2;rank=2
|
||||
1 irgsp CDS 11372 12042 . - 2 ID=CDS:Os01t0100300-00;Parent=transcript:Os01t0100300-00;protein_id=Os01t0100300-00
|
||||
1 irgsp exon 12146 12284 . - . Parent=transcript:Os01t0100300-00;Name=Os01t0100300-00.exon1;constitutive=1;ensembl_end_phase=1;ensembl_phase=0;exon_id=Os01t0100300-00.exon1;rank=1
|
||||
1 irgsp CDS 12146 12284 . - 0 ID=CDS:Os01t0100300-00;Parent=transcript:Os01t0100300-00;protein_id=Os01t0100300-00
|
||||
###
|
||||
1 irgsp gene 12721 15685 . + . ID=gene:Os01g0100400;biotype=protein_coding;description=Similar to Pectinesterase-like protein. (Os01t0100400-01);gene_id=Os01g0100400;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 12721 15685 . + . ID=transcript:Os01t0100400-01;Parent=gene:Os01g0100400;biotype=protein_coding;transcript_id=Os01t0100400-01
|
||||
1 irgsp five_prime_UTR 12721 12773 . + . Parent=transcript:Os01t0100400-01
|
||||
1 irgsp exon 12721 13813 . + . Parent=transcript:Os01t0100400-01;Name=Os01t0100400-01.exon1;constitutive=1;ensembl_end_phase=2;ensembl_phase=-1;exon_id=Os01t0100400-01.exon1;rank=1
|
||||
1 irgsp CDS 12774 13813 . + 0 ID=CDS:Os01t0100400-01;Parent=transcript:Os01t0100400-01;protein_id=Os01t0100400-01
|
||||
1 irgsp exon 13906 14271 . + . Parent=transcript:Os01t0100400-01;Name=Os01t0100400-01.exon2;constitutive=1;ensembl_end_phase=2;ensembl_phase=2;exon_id=Os01t0100400-01.exon2;rank=2
|
||||
1 irgsp CDS 13906 14271 . + 1 ID=CDS:Os01t0100400-01;Parent=transcript:Os01t0100400-01;protein_id=Os01t0100400-01
|
||||
1 irgsp exon 14359 14437 . + . Parent=transcript:Os01t0100400-01;Name=Os01t0100400-01.exon3;constitutive=1;ensembl_end_phase=0;ensembl_phase=2;exon_id=Os01t0100400-01.exon3;rank=3
|
||||
1 irgsp CDS 14359 14437 . + 1 ID=CDS:Os01t0100400-01;Parent=transcript:Os01t0100400-01;protein_id=Os01t0100400-01
|
||||
1 irgsp exon 14969 15171 . + . Parent=transcript:Os01t0100400-01;Name=Os01t0100400-01.exon4;constitutive=1;ensembl_end_phase=2;ensembl_phase=0;exon_id=Os01t0100400-01.exon4;rank=4
|
||||
1 irgsp CDS 14969 15171 . + 0 ID=CDS:Os01t0100400-01;Parent=transcript:Os01t0100400-01;protein_id=Os01t0100400-01
|
||||
1 irgsp CDS 15266 15359 . + 1 ID=CDS:Os01t0100400-01;Parent=transcript:Os01t0100400-01;protein_id=Os01t0100400-01
|
||||
1 irgsp exon 15266 15685 . + . Parent=transcript:Os01t0100400-01;Name=Os01t0100400-01.exon5;constitutive=1;ensembl_end_phase=-1;ensembl_phase=2;exon_id=Os01t0100400-01.exon5;rank=5
|
||||
1 irgsp three_prime_UTR 15360 15685 . + . Parent=transcript:Os01t0100400-01
|
||||
###
|
||||
1 irgsp gene 12808 13978 . - . ID=gene:Os01g0100466;biotype=protein_coding;description=Hypothetical protein. (Os01t0100466-00);gene_id=Os01g0100466;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 12808 13978 . - . ID=transcript:Os01t0100466-00;Parent=gene:Os01g0100466;biotype=protein_coding;transcript_id=Os01t0100466-00
|
||||
1 irgsp three_prime_UTR 12808 12868 . - . Parent=transcript:Os01t0100466-00
|
||||
1 irgsp exon 12808 13782 . - . Parent=transcript:Os01t0100466-00;Name=Os01t0100466-00.exon2;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0100466-00.exon2;rank=2
|
||||
1 irgsp CDS 12869 13102 . - 0 ID=CDS:Os01t0100466-00;Parent=transcript:Os01t0100466-00;protein_id=Os01t0100466-00
|
||||
1 irgsp five_prime_UTR 13103 13782 . - . Parent=transcript:Os01t0100466-00
|
||||
1 irgsp exon 13880 13978 . - . Parent=transcript:Os01t0100466-00;Name=Os01t0100466-00.exon1;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0100466-00.exon1;rank=1
|
||||
1 irgsp five_prime_UTR 13880 13978 . - . Parent=transcript:Os01t0100466-00
|
||||
###
|
||||
1 irgsp gene 16399 20144 . + . ID=gene:Os01g0100500;biotype=protein_coding;description=Immunoglobulin-like domain containing protein. (Os01t0100500-01);gene_id=Os01g0100500;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 16399 20144 . + . ID=transcript:Os01t0100500-01;Parent=gene:Os01g0100500;biotype=protein_coding;transcript_id=Os01t0100500-01
|
||||
1 irgsp five_prime_UTR 16399 16598 . + . Parent=transcript:Os01t0100500-01
|
||||
1 irgsp exon 16399 16976 . + . Parent=transcript:Os01t0100500-01;Name=Os01t0100500-01.exon1;constitutive=1;ensembl_end_phase=0;ensembl_phase=-1;exon_id=Os01t0100500-01.exon1;rank=1
|
||||
1 irgsp CDS 16599 16976 . + 0 ID=CDS:Os01t0100500-01;Parent=transcript:Os01t0100500-01;protein_id=Os01t0100500-01
|
||||
1 irgsp exon 17383 17474 . + . Parent=transcript:Os01t0100500-01;Name=Os01t0100500-01.exon2;constitutive=1;ensembl_end_phase=2;ensembl_phase=0;exon_id=Os01t0100500-01.exon2;rank=2
|
||||
1 irgsp CDS 17383 17474 . + 0 ID=CDS:Os01t0100500-01;Parent=transcript:Os01t0100500-01;protein_id=Os01t0100500-01
|
||||
1 irgsp exon 17558 18258 . + . Parent=transcript:Os01t0100500-01;Name=Os01t0100500-01.exon3;constitutive=1;ensembl_end_phase=1;ensembl_phase=2;exon_id=Os01t0100500-01.exon3;rank=3
|
||||
1 irgsp CDS 17558 18258 . + 1 ID=CDS:Os01t0100500-01;Parent=transcript:Os01t0100500-01;protein_id=Os01t0100500-01
|
||||
1 irgsp exon 18501 18571 . + . Parent=transcript:Os01t0100500-01;Name=Os01t0100500-01.exon4;constitutive=1;ensembl_end_phase=0;ensembl_phase=1;exon_id=Os01t0100500-01.exon4;rank=4
|
||||
1 irgsp CDS 18501 18571 . + 2 ID=CDS:Os01t0100500-01;Parent=transcript:Os01t0100500-01;protein_id=Os01t0100500-01
|
||||
1 irgsp exon 18968 19057 . + . Parent=transcript:Os01t0100500-01;Name=Os01t0100500-01.exon5;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0100500-01.exon5;rank=5
|
||||
1 irgsp CDS 18968 19057 . + 0 ID=CDS:Os01t0100500-01;Parent=transcript:Os01t0100500-01;protein_id=Os01t0100500-01
|
||||
1 irgsp exon 19142 19321 . + . Parent=transcript:Os01t0100500-01;Name=Os01t0100500-01.exon6;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0100500-01.exon6;rank=6
|
||||
1 irgsp CDS 19142 19321 . + 0 ID=CDS:Os01t0100500-01;Parent=transcript:Os01t0100500-01;protein_id=Os01t0100500-01
|
||||
1 irgsp CDS 19531 19593 . + 0 ID=CDS:Os01t0100500-01;Parent=transcript:Os01t0100500-01;protein_id=Os01t0100500-01
|
||||
1 irgsp exon 19531 19629 . + . Parent=transcript:Os01t0100500-01;Name=Os01t0100500-01.exon7;constitutive=1;ensembl_end_phase=-1;ensembl_phase=0;exon_id=Os01t0100500-01.exon7;rank=7
|
||||
1 irgsp three_prime_UTR 19594 19629 . + . Parent=transcript:Os01t0100500-01
|
||||
1 irgsp exon 19734 20144 . + . Parent=transcript:Os01t0100500-01;Name=Os01t0100500-01.exon8;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0100500-01.exon8;rank=8
|
||||
1 irgsp three_prime_UTR 19734 20144 . + . Parent=transcript:Os01t0100500-01
|
||||
###
|
||||
1 irgsp gene 22841 26892 . + . ID=gene:Os01g0100600;biotype=protein_coding;description=Single-stranded nucleic acid binding R3H domain containing protein. (Os01t0100600-01);gene_id=Os01g0100600;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 22841 26892 . + . ID=transcript:Os01t0100600-01;Parent=gene:Os01g0100600;biotype=protein_coding;transcript_id=Os01t0100600-01
|
||||
1 irgsp five_prime_UTR 22841 23231 . + . Parent=transcript:Os01t0100600-01
|
||||
1 irgsp exon 22841 23281 . + . Parent=transcript:Os01t0100600-01;Name=Os01t0100600-01.exon1;constitutive=1;ensembl_end_phase=2;ensembl_phase=-1;exon_id=Os01t0100600-01.exon1;rank=1
|
||||
1 irgsp CDS 23232 23281 . + 0 ID=CDS:Os01t0100600-01;Parent=transcript:Os01t0100600-01;protein_id=Os01t0100600-01
|
||||
1 irgsp exon 23572 23847 . + . Parent=transcript:Os01t0100600-01;Name=Os01t0100600-01.exon2;constitutive=1;ensembl_end_phase=2;ensembl_phase=2;exon_id=Os01t0100600-01.exon2;rank=2
|
||||
1 irgsp CDS 23572 23847 . + 1 ID=CDS:Os01t0100600-01;Parent=transcript:Os01t0100600-01;protein_id=Os01t0100600-01
|
||||
1 irgsp exon 23962 24033 . + . Parent=transcript:Os01t0100600-01;Name=Os01t0100600-01.exon3;constitutive=1;ensembl_end_phase=2;ensembl_phase=2;exon_id=Os01t0100600-01.exon3;rank=3
|
||||
1 irgsp CDS 23962 24033 . + 1 ID=CDS:Os01t0100600-01;Parent=transcript:Os01t0100600-01;protein_id=Os01t0100600-01
|
||||
1 irgsp exon 24492 24577 . + . Parent=transcript:Os01t0100600-01;Name=Os01t0100600-01.exon4;constitutive=1;ensembl_end_phase=1;ensembl_phase=2;exon_id=Os01t0100600-01.exon4;rank=4
|
||||
1 irgsp CDS 24492 24577 . + 1 ID=CDS:Os01t0100600-01;Parent=transcript:Os01t0100600-01;protein_id=Os01t0100600-01
|
||||
1 irgsp exon 25445 25519 . + . Parent=transcript:Os01t0100600-01;Name=Os01t0100600-01.exon5;constitutive=1;ensembl_end_phase=1;ensembl_phase=1;exon_id=Os01t0100600-01.exon5;rank=5
|
||||
1 irgsp CDS 25445 25519 . + 2 ID=CDS:Os01t0100600-01;Parent=transcript:Os01t0100600-01;protein_id=Os01t0100600-01
|
||||
1 irgsp CDS 25883 26391 . + 2 ID=CDS:Os01t0100600-01;Parent=transcript:Os01t0100600-01;protein_id=Os01t0100600-01
|
||||
1 irgsp exon 25883 26892 . + . Parent=transcript:Os01t0100600-01;Name=Os01t0100600-01.exon6;constitutive=1;ensembl_end_phase=-1;ensembl_phase=1;exon_id=Os01t0100600-01.exon6;rank=6
|
||||
1 irgsp three_prime_UTR 26392 26892 . + . Parent=transcript:Os01t0100600-01
|
||||
###
|
||||
1 irgsp gene 25861 26424 . - . ID=gene:Os01g0100650;biotype=protein_coding;description=Hypothetical gene. (Os01t0100650-00);gene_id=Os01g0100650;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 25861 26424 . - . ID=transcript:Os01t0100650-00;Parent=gene:Os01g0100650;biotype=protein_coding;transcript_id=Os01t0100650-00
|
||||
1 irgsp three_prime_UTR 25861 26039 . - . Parent=transcript:Os01t0100650-00
|
||||
1 irgsp exon 25861 26424 . - . Parent=transcript:Os01t0100650-00;Name=Os01t0100650-00.exon1;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0100650-00.exon1;rank=1
|
||||
1 irgsp CDS 26040 26423 . - 0 ID=CDS:Os01t0100650-00;Parent=transcript:Os01t0100650-00;protein_id=Os01t0100650-00
|
||||
1 irgsp five_prime_UTR 26424 26424 . - . Parent=transcript:Os01t0100650-00
|
||||
###
|
||||
1 irgsp gene 27143 28644 . + . ID=gene:Os01g0100700;biotype=protein_coding;description=Similar to 40S ribosomal protein S5-1. (Os01t0100700-01);gene_id=Os01g0100700;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 27143 28644 . + . ID=transcript:Os01t0100700-01;Parent=gene:Os01g0100700;biotype=protein_coding;transcript_id=Os01t0100700-01
|
||||
1 irgsp five_prime_UTR 27143 27220 . + . Parent=transcript:Os01t0100700-01
|
||||
1 irgsp exon 27143 27292 . + . Parent=transcript:Os01t0100700-01;Name=Os01t0100700-01.exon1;constitutive=1;ensembl_end_phase=0;ensembl_phase=-1;exon_id=Os01t0100700-01.exon1;rank=1
|
||||
1 irgsp CDS 27221 27292 . + 0 ID=CDS:Os01t0100700-01;Parent=transcript:Os01t0100700-01;protein_id=Os01t0100700-01
|
||||
1 irgsp exon 27370 27641 . + . Parent=transcript:Os01t0100700-01;Name=Os01t0100700-01.exon2;constitutive=1;ensembl_end_phase=2;ensembl_phase=0;exon_id=Os01t0100700-01.exon2;rank=2
|
||||
1 irgsp CDS 27370 27641 . + 0 ID=CDS:Os01t0100700-01;Parent=transcript:Os01t0100700-01;protein_id=Os01t0100700-01
|
||||
1 irgsp exon 28090 28293 . + . Parent=transcript:Os01t0100700-01;Name=Os01t0100700-01.exon3;constitutive=1;ensembl_end_phase=2;ensembl_phase=2;exon_id=Os01t0100700-01.exon3;rank=3
|
||||
1 irgsp CDS 28090 28293 . + 1 ID=CDS:Os01t0100700-01;Parent=transcript:Os01t0100700-01;protein_id=Os01t0100700-01
|
||||
1 irgsp CDS 28365 28419 . + 1 ID=CDS:Os01t0100700-01;Parent=transcript:Os01t0100700-01;protein_id=Os01t0100700-01
|
||||
1 irgsp exon 28365 28644 . + . Parent=transcript:Os01t0100700-01;Name=Os01t0100700-01.exon4;constitutive=1;ensembl_end_phase=-1;ensembl_phase=2;exon_id=Os01t0100700-01.exon4;rank=4
|
||||
1 irgsp three_prime_UTR 28420 28644 . + . Parent=transcript:Os01t0100700-01
|
||||
###
|
||||
1 irgsp gene 29818 34453 . + . ID=gene:Os01g0100800;biotype=protein_coding;description=Protein of unknown function DUF1664 family protein. (Os01t0100800-01);gene_id=Os01g0100800;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 29818 34453 . + . ID=transcript:Os01t0100800-01;Parent=gene:Os01g0100800;biotype=protein_coding;transcript_id=Os01t0100800-01
|
||||
1 irgsp five_prime_UTR 29818 29939 . + . Parent=transcript:Os01t0100800-01
|
||||
1 irgsp exon 29818 29976 . + . Parent=transcript:Os01t0100800-01;Name=Os01t0100800-01.exon1;constitutive=1;ensembl_end_phase=1;ensembl_phase=-1;exon_id=Os01t0100800-01.exon1;rank=1
|
||||
1 irgsp CDS 29940 29976 . + 0 ID=CDS:Os01t0100800-01;Parent=transcript:Os01t0100800-01;protein_id=Os01t0100800-01
|
||||
1 irgsp exon 30146 30228 . + . Parent=transcript:Os01t0100800-01;Name=Os01t0100800-01.exon2;constitutive=1;ensembl_end_phase=0;ensembl_phase=1;exon_id=Os01t0100800-01.exon2;rank=2
|
||||
1 irgsp CDS 30146 30228 . + 2 ID=CDS:Os01t0100800-01;Parent=transcript:Os01t0100800-01;protein_id=Os01t0100800-01
|
||||
1 irgsp exon 30735 30806 . + . Parent=transcript:Os01t0100800-01;Name=Os01t0100800-01.exon3;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0100800-01.exon3;rank=3
|
||||
1 irgsp CDS 30735 30806 . + 0 ID=CDS:Os01t0100800-01;Parent=transcript:Os01t0100800-01;protein_id=Os01t0100800-01
|
||||
1 irgsp exon 30885 30963 . + . Parent=transcript:Os01t0100800-01;Name=Os01t0100800-01.exon4;constitutive=1;ensembl_end_phase=1;ensembl_phase=0;exon_id=Os01t0100800-01.exon4;rank=4
|
||||
1 irgsp CDS 30885 30963 . + 0 ID=CDS:Os01t0100800-01;Parent=transcript:Os01t0100800-01;protein_id=Os01t0100800-01
|
||||
1 irgsp exon 31258 31325 . + . Parent=transcript:Os01t0100800-01;Name=Os01t0100800-01.exon5;constitutive=1;ensembl_end_phase=0;ensembl_phase=1;exon_id=Os01t0100800-01.exon5;rank=5
|
||||
1 irgsp CDS 31258 31325 . + 2 ID=CDS:Os01t0100800-01;Parent=transcript:Os01t0100800-01;protein_id=Os01t0100800-01
|
||||
1 irgsp exon 31505 31606 . + . Parent=transcript:Os01t0100800-01;Name=Os01t0100800-01.exon6;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0100800-01.exon6;rank=6
|
||||
1 irgsp CDS 31505 31606 . + 0 ID=CDS:Os01t0100800-01;Parent=transcript:Os01t0100800-01;protein_id=Os01t0100800-01
|
||||
1 irgsp exon 32377 32466 . + . Parent=transcript:Os01t0100800-01;Name=Os01t0100800-01.exon7;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0100800-01.exon7;rank=7
|
||||
1 irgsp CDS 32377 32466 . + 0 ID=CDS:Os01t0100800-01;Parent=transcript:Os01t0100800-01;protein_id=Os01t0100800-01
|
||||
1 irgsp exon 32542 32616 . + . Parent=transcript:Os01t0100800-01;Name=Os01t0100800-01.exon8;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0100800-01.exon8;rank=8
|
||||
1 irgsp CDS 32542 32616 . + 0 ID=CDS:Os01t0100800-01;Parent=transcript:Os01t0100800-01;protein_id=Os01t0100800-01
|
||||
1 irgsp exon 32712 32744 . + . Parent=transcript:Os01t0100800-01;Name=Os01t0100800-01.exon9;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0100800-01.exon9;rank=9
|
||||
1 irgsp CDS 32712 32744 . + 0 ID=CDS:Os01t0100800-01;Parent=transcript:Os01t0100800-01;protein_id=Os01t0100800-01
|
||||
1 irgsp exon 32828 32905 . + . Parent=transcript:Os01t0100800-01;Name=Os01t0100800-01.exon10;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0100800-01.exon10;rank=10
|
||||
1 irgsp CDS 32828 32905 . + 0 ID=CDS:Os01t0100800-01;Parent=transcript:Os01t0100800-01;protein_id=Os01t0100800-01
|
||||
1 irgsp exon 33274 33330 . + . Parent=transcript:Os01t0100800-01;Name=Os01t0100800-01.exon11;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0100800-01.exon11;rank=11
|
||||
1 irgsp CDS 33274 33330 . + 0 ID=CDS:Os01t0100800-01;Parent=transcript:Os01t0100800-01;protein_id=Os01t0100800-01
|
||||
1 irgsp exon 33400 33471 . + . Parent=transcript:Os01t0100800-01;Name=Os01t0100800-01.exon12;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0100800-01.exon12;rank=12
|
||||
1 irgsp CDS 33400 33471 . + 0 ID=CDS:Os01t0100800-01;Parent=transcript:Os01t0100800-01;protein_id=Os01t0100800-01
|
||||
1 irgsp exon 33543 33617 . + . Parent=transcript:Os01t0100800-01;Name=Os01t0100800-01.exon13;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0100800-01.exon13;rank=13
|
||||
1 irgsp CDS 33543 33617 . + 0 ID=CDS:Os01t0100800-01;Parent=transcript:Os01t0100800-01;protein_id=Os01t0100800-01
|
||||
1 irgsp CDS 33975 34124 . + 0 ID=CDS:Os01t0100800-01;Parent=transcript:Os01t0100800-01;protein_id=Os01t0100800-01
|
||||
1 irgsp exon 33975 34453 . + . Parent=transcript:Os01t0100800-01;Name=Os01t0100800-01.exon14;constitutive=1;ensembl_end_phase=-1;ensembl_phase=0;exon_id=Os01t0100800-01.exon14;rank=14
|
||||
1 irgsp three_prime_UTR 34125 34453 . + . Parent=transcript:Os01t0100800-01
|
||||
###
|
||||
1 irgsp gene 35623 41136 . + . ID=gene:Os01g0100900;Name=SPHINGOSINE-1-PHOSPHATE LYASE 1%2C Sphingosine-1-Phoshpate Lyase 1;biotype=protein_coding;description=Sphingosine-1-phosphate lyase%2C Disease resistance response (Os01t0100900-01);gene_id=Os01g0100900;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 35623 41136 . + . ID=transcript:Os01t0100900-01;Parent=gene:Os01g0100900;biotype=protein_coding;transcript_id=Os01t0100900-01
|
||||
1 irgsp five_prime_UTR 35623 35742 . + . Parent=transcript:Os01t0100900-01
|
||||
1 irgsp exon 35623 35939 . + . Parent=transcript:Os01t0100900-01;Name=Os01t0100900-01.exon1;constitutive=1;ensembl_end_phase=2;ensembl_phase=-1;exon_id=Os01t0100900-01.exon1;rank=1
|
||||
1 irgsp CDS 35743 35939 . + 0 ID=CDS:Os01t0100900-01;Parent=transcript:Os01t0100900-01;protein_id=Os01t0100900-01
|
||||
1 irgsp exon 36027 36072 . + . Parent=transcript:Os01t0100900-01;Name=Os01t0100900-01.exon2;constitutive=1;ensembl_end_phase=0;ensembl_phase=2;exon_id=Os01t0100900-01.exon2;rank=2
|
||||
1 irgsp CDS 36027 36072 . + 1 ID=CDS:Os01t0100900-01;Parent=transcript:Os01t0100900-01;protein_id=Os01t0100900-01
|
||||
1 irgsp exon 36517 36668 . + . Parent=transcript:Os01t0100900-01;Name=Os01t0100900-01.exon3;constitutive=1;ensembl_end_phase=2;ensembl_phase=0;exon_id=Os01t0100900-01.exon3;rank=3
|
||||
1 irgsp CDS 36517 36668 . + 0 ID=CDS:Os01t0100900-01;Parent=transcript:Os01t0100900-01;protein_id=Os01t0100900-01
|
||||
1 irgsp exon 36818 36877 . + . Parent=transcript:Os01t0100900-01;Name=Os01t0100900-01.exon4;constitutive=1;ensembl_end_phase=2;ensembl_phase=2;exon_id=Os01t0100900-01.exon4;rank=4
|
||||
1 irgsp CDS 36818 36877 . + 1 ID=CDS:Os01t0100900-01;Parent=transcript:Os01t0100900-01;protein_id=Os01t0100900-01
|
||||
1 irgsp exon 37594 37818 . + . Parent=transcript:Os01t0100900-01;Name=Os01t0100900-01.exon5;constitutive=1;ensembl_end_phase=2;ensembl_phase=2;exon_id=Os01t0100900-01.exon5;rank=5
|
||||
1 irgsp CDS 37594 37818 . + 1 ID=CDS:Os01t0100900-01;Parent=transcript:Os01t0100900-01;protein_id=Os01t0100900-01
|
||||
1 irgsp exon 37892 38033 . + . Parent=transcript:Os01t0100900-01;Name=Os01t0100900-01.exon6;constitutive=1;ensembl_end_phase=0;ensembl_phase=2;exon_id=Os01t0100900-01.exon6;rank=6
|
||||
1 irgsp CDS 37892 38033 . + 1 ID=CDS:Os01t0100900-01;Parent=transcript:Os01t0100900-01;protein_id=Os01t0100900-01
|
||||
1 irgsp exon 38276 38326 . + . Parent=transcript:Os01t0100900-01;Name=Os01t0100900-01.exon7;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0100900-01.exon7;rank=7
|
||||
1 irgsp CDS 38276 38326 . + 0 ID=CDS:Os01t0100900-01;Parent=transcript:Os01t0100900-01;protein_id=Os01t0100900-01
|
||||
1 irgsp exon 38434 38525 . + . Parent=transcript:Os01t0100900-01;Name=Os01t0100900-01.exon8;constitutive=1;ensembl_end_phase=2;ensembl_phase=0;exon_id=Os01t0100900-01.exon8;rank=8
|
||||
1 irgsp CDS 38434 38525 . + 0 ID=CDS:Os01t0100900-01;Parent=transcript:Os01t0100900-01;protein_id=Os01t0100900-01
|
||||
1 irgsp exon 39319 39445 . + . Parent=transcript:Os01t0100900-01;Name=Os01t0100900-01.exon9;constitutive=1;ensembl_end_phase=0;ensembl_phase=2;exon_id=Os01t0100900-01.exon9;rank=9
|
||||
1 irgsp CDS 39319 39445 . + 1 ID=CDS:Os01t0100900-01;Parent=transcript:Os01t0100900-01;protein_id=Os01t0100900-01
|
||||
1 irgsp exon 39553 39568 . + . Parent=transcript:Os01t0100900-01;Name=Os01t0100900-01.exon10;constitutive=1;ensembl_end_phase=1;ensembl_phase=0;exon_id=Os01t0100900-01.exon10;rank=10
|
||||
1 irgsp CDS 39553 39568 . + 0 ID=CDS:Os01t0100900-01;Parent=transcript:Os01t0100900-01;protein_id=Os01t0100900-01
|
||||
1 irgsp exon 39939 40046 . + . Parent=transcript:Os01t0100900-01;Name=Os01t0100900-01.exon11;constitutive=1;ensembl_end_phase=1;ensembl_phase=1;exon_id=Os01t0100900-01.exon11;rank=11
|
||||
1 irgsp CDS 39939 40046 . + 2 ID=CDS:Os01t0100900-01;Parent=transcript:Os01t0100900-01;protein_id=Os01t0100900-01
|
||||
1 irgsp exon 40135 40189 . + . Parent=transcript:Os01t0100900-01;Name=Os01t0100900-01.exon12;constitutive=1;ensembl_end_phase=2;ensembl_phase=1;exon_id=Os01t0100900-01.exon12;rank=12
|
||||
1 irgsp CDS 40135 40189 . + 2 ID=CDS:Os01t0100900-01;Parent=transcript:Os01t0100900-01;protein_id=Os01t0100900-01
|
||||
1 irgsp exon 40456 40602 . + . Parent=transcript:Os01t0100900-01;Name=Os01t0100900-01.exon13;constitutive=1;ensembl_end_phase=2;ensembl_phase=2;exon_id=Os01t0100900-01.exon13;rank=13
|
||||
1 irgsp CDS 40456 40602 . + 1 ID=CDS:Os01t0100900-01;Parent=transcript:Os01t0100900-01;protein_id=Os01t0100900-01
|
||||
1 irgsp exon 40703 40781 . + . Parent=transcript:Os01t0100900-01;Name=Os01t0100900-01.exon14;constitutive=1;ensembl_end_phase=0;ensembl_phase=2;exon_id=Os01t0100900-01.exon14;rank=14
|
||||
1 irgsp CDS 40703 40781 . + 1 ID=CDS:Os01t0100900-01;Parent=transcript:Os01t0100900-01;protein_id=Os01t0100900-01
|
||||
1 irgsp CDS 40885 41007 . + 0 ID=CDS:Os01t0100900-01;Parent=transcript:Os01t0100900-01;protein_id=Os01t0100900-01
|
||||
1 irgsp exon 40885 41136 . + . Parent=transcript:Os01t0100900-01;Name=Os01t0100900-01.exon15;constitutive=1;ensembl_end_phase=-1;ensembl_phase=0;exon_id=Os01t0100900-01.exon15;rank=15
|
||||
1 irgsp three_prime_UTR 41008 41136 . + . Parent=transcript:Os01t0100900-01
|
||||
###
|
||||
1 irgsp gene 58658 61090 . + . ID=gene:Os01g0101150;biotype=protein_coding;description=Hypothetical conserved gene. (Os01t0101150-00);gene_id=Os01g0101150;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 58658 61090 . + . ID=transcript:Os01t0101150-00;Parent=gene:Os01g0101150;biotype=protein_coding;transcript_id=Os01t0101150-00
|
||||
1 irgsp exon 58658 61090 . + . Parent=transcript:Os01t0101150-00;Name=Os01t0101150-00.exon1;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0101150-00.exon1;rank=1
|
||||
1 irgsp CDS 58658 61090 . + 0 ID=CDS:Os01t0101150-00;Parent=transcript:Os01t0101150-00;protein_id=Os01t0101150-00
|
||||
###
|
||||
1 irgsp gene 62060 65537 . + . ID=gene:Os01g0101200;biotype=protein_coding;description=2%2C3-diketo-5-methylthio-1-phosphopentane phosphatase domain containing protein. (Os01t0101200-01)%3B2%2C3-diketo-5-methylthio-1-phosphopentane phosphatase domain containing protein. (Os01t0101200-02);gene_id=Os01g0101200;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 62060 63576 . + . ID=transcript:Os01t0101200-01;Parent=gene:Os01g0101200;biotype=protein_coding;transcript_id=Os01t0101200-01
|
||||
1 irgsp five_prime_UTR 62060 62103 . + . Parent=transcript:Os01t0101200-01
|
||||
1 irgsp exon 62060 62295 . + . Parent=transcript:Os01t0101200-01;Name=Os01t0101200-01.exon1;constitutive=0;ensembl_end_phase=0;ensembl_phase=-1;exon_id=Os01t0101200-01.exon1;rank=1
|
||||
1 irgsp CDS 62104 62295 . + 0 ID=CDS:Os01t0101200-01;Parent=transcript:Os01t0101200-01;protein_id=Os01t0101200-01
|
||||
1 irgsp exon 62385 62905 . + . Parent=transcript:Os01t0101200-01;Name=Os01t0101200-02.exon2;constitutive=1;ensembl_end_phase=2;ensembl_phase=0;exon_id=Os01t0101200-02.exon2;rank=2
|
||||
1 irgsp CDS 62385 62905 . + 0 ID=CDS:Os01t0101200-01;Parent=transcript:Os01t0101200-01;protein_id=Os01t0101200-01
|
||||
1 irgsp exon 62996 63114 . + . Parent=transcript:Os01t0101200-01;Name=Os01t0101200-02.exon3;constitutive=1;ensembl_end_phase=1;ensembl_phase=2;exon_id=Os01t0101200-02.exon3;rank=3
|
||||
1 irgsp CDS 62996 63114 . + 1 ID=CDS:Os01t0101200-01;Parent=transcript:Os01t0101200-01;protein_id=Os01t0101200-01
|
||||
1 irgsp CDS 63248 63345 . + 2 ID=CDS:Os01t0101200-01;Parent=transcript:Os01t0101200-01;protein_id=Os01t0101200-01
|
||||
1 irgsp exon 63248 63576 . + . Parent=transcript:Os01t0101200-01;Name=Os01t0101200-01.exon4;constitutive=0;ensembl_end_phase=-1;ensembl_phase=1;exon_id=Os01t0101200-01.exon4;rank=4
|
||||
1 irgsp three_prime_UTR 63346 63576 . + . Parent=transcript:Os01t0101200-01
|
||||
1 irgsp mRNA 62112 65537 . + . ID=transcript:Os01t0101200-02;Parent=gene:Os01g0101200;biotype=protein_coding;transcript_id=Os01t0101200-02
|
||||
1 irgsp five_prime_UTR 62112 62112 . + . Parent=transcript:Os01t0101200-02
|
||||
1 irgsp exon 62112 62295 . + . Parent=transcript:Os01t0101200-02;Name=Os01t0101200-02.exon1;constitutive=0;ensembl_end_phase=0;ensembl_phase=-1;exon_id=Os01t0101200-02.exon1;rank=1
|
||||
1 irgsp CDS 62113 62295 . + 0 ID=CDS:Os01t0101200-02;Parent=transcript:Os01t0101200-02;protein_id=Os01t0101200-02
|
||||
1 irgsp exon 62385 62905 . + . Parent=transcript:Os01t0101200-02;Name=Os01t0101200-02.exon2;constitutive=1;ensembl_end_phase=2;ensembl_phase=0;exon_id=Os01t0101200-02.exon2;rank=2
|
||||
1 irgsp CDS 62385 62905 . + 0 ID=CDS:Os01t0101200-02;Parent=transcript:Os01t0101200-02;protein_id=Os01t0101200-02
|
||||
1 irgsp exon 62996 63114 . + . Parent=transcript:Os01t0101200-02;Name=Os01t0101200-02.exon3;constitutive=1;ensembl_end_phase=1;ensembl_phase=2;exon_id=Os01t0101200-02.exon3;rank=3
|
||||
1 irgsp CDS 62996 63114 . + 1 ID=CDS:Os01t0101200-02;Parent=transcript:Os01t0101200-02;protein_id=Os01t0101200-02
|
||||
1 irgsp CDS 63248 63345 . + 2 ID=CDS:Os01t0101200-02;Parent=transcript:Os01t0101200-02;protein_id=Os01t0101200-02
|
||||
1 irgsp exon 63248 65537 . + . Parent=transcript:Os01t0101200-02;Name=Os01t0101200-02.exon4;constitutive=0;ensembl_end_phase=-1;ensembl_phase=1;exon_id=Os01t0101200-02.exon4;rank=4
|
||||
1 irgsp three_prime_UTR 63346 65537 . + . Parent=transcript:Os01t0101200-02
|
||||
###
|
||||
1 irgsp gene 63350 66302 . - . ID=gene:Os01g0101300;biotype=protein_coding;description=Similar to MRNA%2C partial cds%2C clone: RAFL22-26-L17. (Fragment). (Os01t0101300-01);gene_id=Os01g0101300;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 63350 66302 . - . ID=transcript:Os01t0101300-01;Parent=gene:Os01g0101300;biotype=protein_coding;transcript_id=Os01t0101300-01
|
||||
1 irgsp three_prime_UTR 63350 63669 . - . Parent=transcript:Os01t0101300-01
|
||||
1 irgsp exon 63350 63783 . - . Parent=transcript:Os01t0101300-01;Name=Os01t0101300-01.exon7;constitutive=1;ensembl_end_phase=-1;ensembl_phase=0;exon_id=Os01t0101300-01.exon7;rank=7
|
||||
1 irgsp CDS 63670 63783 . - 0 ID=CDS:Os01t0101300-01;Parent=transcript:Os01t0101300-01;protein_id=Os01t0101300-01
|
||||
1 irgsp exon 63877 64020 . - . Parent=transcript:Os01t0101300-01;Name=Os01t0101300-01.exon6;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0101300-01.exon6;rank=6
|
||||
1 irgsp CDS 63877 64020 . - 0 ID=CDS:Os01t0101300-01;Parent=transcript:Os01t0101300-01;protein_id=Os01t0101300-01
|
||||
1 irgsp exon 64339 64431 . - . Parent=transcript:Os01t0101300-01;Name=Os01t0101300-01.exon5;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0101300-01.exon5;rank=5
|
||||
1 irgsp CDS 64339 64431 . - 0 ID=CDS:Os01t0101300-01;Parent=transcript:Os01t0101300-01;protein_id=Os01t0101300-01
|
||||
1 irgsp exon 64665 64779 . - . Parent=transcript:Os01t0101300-01;Name=Os01t0101300-01.exon4;constitutive=1;ensembl_end_phase=0;ensembl_phase=2;exon_id=Os01t0101300-01.exon4;rank=4
|
||||
1 irgsp CDS 64665 64779 . - 1 ID=CDS:Os01t0101300-01;Parent=transcript:Os01t0101300-01;protein_id=Os01t0101300-01
|
||||
1 irgsp exon 64902 65152 . - . Parent=transcript:Os01t0101300-01;Name=Os01t0101300-01.exon3;constitutive=1;ensembl_end_phase=2;ensembl_phase=0;exon_id=Os01t0101300-01.exon3;rank=3
|
||||
1 irgsp CDS 64902 65152 . - 0 ID=CDS:Os01t0101300-01;Parent=transcript:Os01t0101300-01;protein_id=Os01t0101300-01
|
||||
1 irgsp exon 65248 65431 . - . Parent=transcript:Os01t0101300-01;Name=Os01t0101300-01.exon2;constitutive=1;ensembl_end_phase=0;ensembl_phase=2;exon_id=Os01t0101300-01.exon2;rank=2
|
||||
1 irgsp CDS 65248 65431 . - 1 ID=CDS:Os01t0101300-01;Parent=transcript:Os01t0101300-01;protein_id=Os01t0101300-01
|
||||
1 irgsp CDS 65628 65950 . - 0 ID=CDS:Os01t0101300-01;Parent=transcript:Os01t0101300-01;protein_id=Os01t0101300-01
|
||||
1 irgsp exon 65628 66302 . - . Parent=transcript:Os01t0101300-01;Name=Os01t0101300-01.exon1;constitutive=1;ensembl_end_phase=2;ensembl_phase=-1;exon_id=Os01t0101300-01.exon1;rank=1
|
||||
1 irgsp five_prime_UTR 65951 66302 . - . Parent=transcript:Os01t0101300-01
|
||||
###
|
||||
1 irgsp gene 72816 78349 . + . ID=gene:Os01g0101600;biotype=protein_coding;description=Immunoglobulin-like fold domain containing protein. (Os01t0101600-01)%3BImmunoglobulin-like fold domain containing protein. (Os01t0101600-02)%3BHypothetical conserved gene. (Os01t0101600-03);gene_id=Os01g0101600;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 72816 78349 . + . ID=transcript:Os01t0101600-01;Parent=gene:Os01g0101600;biotype=protein_coding;transcript_id=Os01t0101600-01
|
||||
1 irgsp five_prime_UTR 72816 72902 . + . Parent=transcript:Os01t0101600-01
|
||||
1 irgsp exon 72816 73935 . + . Parent=transcript:Os01t0101600-01;Name=Os01t0101600-01.exon1;constitutive=0;ensembl_end_phase=1;ensembl_phase=-1;exon_id=Os01t0101600-01.exon1;rank=1
|
||||
1 irgsp CDS 72903 73935 . + 0 ID=CDS:Os01t0101600-01;Parent=transcript:Os01t0101600-01;protein_id=Os01t0101600-01
|
||||
1 irgsp exon 74468 74981 . + . Parent=transcript:Os01t0101600-01;Name=Os01t0101600-02.exon2;constitutive=0;ensembl_end_phase=2;ensembl_phase=1;exon_id=Os01t0101600-02.exon2;rank=2
|
||||
1 irgsp CDS 74468 74981 . + 2 ID=CDS:Os01t0101600-01;Parent=transcript:Os01t0101600-01;protein_id=Os01t0101600-01
|
||||
1 irgsp CDS 75619 77008 . + 1 ID=CDS:Os01t0101600-01;Parent=transcript:Os01t0101600-01;protein_id=Os01t0101600-01
|
||||
1 irgsp exon 75619 77205 . + . Parent=transcript:Os01t0101600-01;Name=Os01t0101600-01.exon3;constitutive=0;ensembl_end_phase=-1;ensembl_phase=2;exon_id=Os01t0101600-01.exon3;rank=3
|
||||
1 irgsp three_prime_UTR 77009 77205 . + . Parent=transcript:Os01t0101600-01
|
||||
1 irgsp exon 77333 78349 . + . Parent=transcript:Os01t0101600-01;Name=Os01t0101600-01.exon4;constitutive=0;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0101600-01.exon4;rank=4
|
||||
1 irgsp three_prime_UTR 77333 78349 . + . Parent=transcript:Os01t0101600-01
|
||||
1 irgsp mRNA 72823 77699 . + . ID=transcript:Os01t0101600-02;Parent=gene:Os01g0101600;biotype=protein_coding;transcript_id=Os01t0101600-02
|
||||
1 irgsp five_prime_UTR 72823 72902 . + . Parent=transcript:Os01t0101600-02
|
||||
1 irgsp exon 72823 73935 . + . Parent=transcript:Os01t0101600-02;Name=Os01t0101600-02.exon1;constitutive=0;ensembl_end_phase=1;ensembl_phase=-1;exon_id=Os01t0101600-02.exon1;rank=1
|
||||
1 irgsp CDS 72903 73935 . + 0 ID=CDS:Os01t0101600-02;Parent=transcript:Os01t0101600-02;protein_id=Os01t0101600-02
|
||||
1 irgsp exon 74468 74981 . + . Parent=transcript:Os01t0101600-02;Name=Os01t0101600-02.exon2;constitutive=0;ensembl_end_phase=2;ensembl_phase=1;exon_id=Os01t0101600-02.exon2;rank=2
|
||||
1 irgsp CDS 74468 74981 . + 2 ID=CDS:Os01t0101600-02;Parent=transcript:Os01t0101600-02;protein_id=Os01t0101600-02
|
||||
1 irgsp CDS 75619 77008 . + 1 ID=CDS:Os01t0101600-02;Parent=transcript:Os01t0101600-02;protein_id=Os01t0101600-02
|
||||
1 irgsp exon 75619 77699 . + . Parent=transcript:Os01t0101600-02;Name=Os01t0101600-02.exon3;constitutive=0;ensembl_end_phase=-1;ensembl_phase=2;exon_id=Os01t0101600-02.exon3;rank=3
|
||||
1 irgsp three_prime_UTR 77009 77699 . + . Parent=transcript:Os01t0101600-02
|
||||
1 irgsp mRNA 75942 77699 . + . ID=transcript:Os01t0101600-03;Parent=gene:Os01g0101600;biotype=protein_coding;transcript_id=Os01t0101600-03
|
||||
1 irgsp five_prime_UTR 75942 75943 . + . Parent=transcript:Os01t0101600-03
|
||||
1 irgsp exon 75942 77699 . + . Parent=transcript:Os01t0101600-03;Name=Os01t0101600-03.exon1;constitutive=0;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0101600-03.exon1;rank=1
|
||||
1 irgsp CDS 75944 77008 . + 0 ID=CDS:Os01t0101600-03;Parent=transcript:Os01t0101600-03;protein_id=Os01t0101600-03
|
||||
1 irgsp three_prime_UTR 77009 77699 . + . Parent=transcript:Os01t0101600-03
|
||||
###
|
||||
1 irgsp gene 82426 84095 . + . ID=gene:Os01g0101700;Name=DnaJ domain protein C1%2C rice DJC26 homolog;biotype=protein_coding;description=Similar to chaperone protein dnaJ 20. (Os01t0101700-00);gene_id=Os01g0101700;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 82426 84095 . + . ID=transcript:Os01t0101700-00;Parent=gene:Os01g0101700;biotype=protein_coding;transcript_id=Os01t0101700-00
|
||||
1 irgsp five_prime_UTR 82426 82506 . + . Parent=transcript:Os01t0101700-00
|
||||
1 irgsp exon 82426 82932 . + . Parent=transcript:Os01t0101700-00;Name=Os01t0101700-00.exon1;constitutive=1;ensembl_end_phase=0;ensembl_phase=-1;exon_id=Os01t0101700-00.exon1;rank=1
|
||||
1 irgsp CDS 82507 82932 . + 0 ID=CDS:Os01t0101700-00;Parent=transcript:Os01t0101700-00;protein_id=Os01t0101700-00
|
||||
1 irgsp CDS 83724 83864 . + 0 ID=CDS:Os01t0101700-00;Parent=transcript:Os01t0101700-00;protein_id=Os01t0101700-00
|
||||
1 irgsp exon 83724 84095 . + . Parent=transcript:Os01t0101700-00;Name=Os01t0101700-00.exon2;constitutive=1;ensembl_end_phase=-1;ensembl_phase=0;exon_id=Os01t0101700-00.exon2;rank=2
|
||||
1 irgsp three_prime_UTR 83865 84095 . + . Parent=transcript:Os01t0101700-00
|
||||
###
|
||||
1 irgsp gene 85337 88844 . + . ID=gene:Os01g0101800;biotype=protein_coding;description=Conserved hypothetical protein. (Os01t0101800-01);gene_id=Os01g0101800;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 85337 88844 . + . ID=transcript:Os01t0101800-01;Parent=gene:Os01g0101800;biotype=protein_coding;transcript_id=Os01t0101800-01
|
||||
1 irgsp five_prime_UTR 85337 85378 . + . Parent=transcript:Os01t0101800-01
|
||||
1 irgsp exon 85337 85600 . + . Parent=transcript:Os01t0101800-01;Name=Os01t0101800-01.exon1;constitutive=1;ensembl_end_phase=0;ensembl_phase=-1;exon_id=Os01t0101800-01.exon1;rank=1
|
||||
1 irgsp CDS 85379 85600 . + 0 ID=CDS:Os01t0101800-01;Parent=transcript:Os01t0101800-01;protein_id=Os01t0101800-01
|
||||
1 irgsp exon 85737 85830 . + . Parent=transcript:Os01t0101800-01;Name=Os01t0101800-01.exon2;constitutive=1;ensembl_end_phase=1;ensembl_phase=0;exon_id=Os01t0101800-01.exon2;rank=2
|
||||
1 irgsp CDS 85737 85830 . + 0 ID=CDS:Os01t0101800-01;Parent=transcript:Os01t0101800-01;protein_id=Os01t0101800-01
|
||||
1 irgsp exon 85935 86086 . + . Parent=transcript:Os01t0101800-01;Name=Os01t0101800-01.exon3;constitutive=1;ensembl_end_phase=0;ensembl_phase=1;exon_id=Os01t0101800-01.exon3;rank=3
|
||||
1 irgsp CDS 85935 86086 . + 2 ID=CDS:Os01t0101800-01;Parent=transcript:Os01t0101800-01;protein_id=Os01t0101800-01
|
||||
1 irgsp exon 86212 86299 . + . Parent=transcript:Os01t0101800-01;Name=Os01t0101800-01.exon4;constitutive=1;ensembl_end_phase=1;ensembl_phase=0;exon_id=Os01t0101800-01.exon4;rank=4
|
||||
1 irgsp CDS 86212 86299 . + 0 ID=CDS:Os01t0101800-01;Parent=transcript:Os01t0101800-01;protein_id=Os01t0101800-01
|
||||
1 irgsp exon 86399 87681 . + . Parent=transcript:Os01t0101800-01;Name=Os01t0101800-01.exon5;constitutive=1;ensembl_end_phase=0;ensembl_phase=1;exon_id=Os01t0101800-01.exon5;rank=5
|
||||
1 irgsp CDS 86399 87681 . + 2 ID=CDS:Os01t0101800-01;Parent=transcript:Os01t0101800-01;protein_id=Os01t0101800-01
|
||||
1 irgsp exon 88291 88398 . + . Parent=transcript:Os01t0101800-01;Name=Os01t0101800-01.exon6;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0101800-01.exon6;rank=6
|
||||
1 irgsp CDS 88291 88398 . + 0 ID=CDS:Os01t0101800-01;Parent=transcript:Os01t0101800-01;protein_id=Os01t0101800-01
|
||||
1 irgsp CDS 88500 88583 . + 0 ID=CDS:Os01t0101800-01;Parent=transcript:Os01t0101800-01;protein_id=Os01t0101800-01
|
||||
1 irgsp exon 88500 88844 . + . Parent=transcript:Os01t0101800-01;Name=Os01t0101800-01.exon7;constitutive=1;ensembl_end_phase=-1;ensembl_phase=0;exon_id=Os01t0101800-01.exon7;rank=7
|
||||
1 irgsp three_prime_UTR 88584 88844 . + . Parent=transcript:Os01t0101800-01
|
||||
###
|
||||
1 irgsp gene 86211 88583 . - . ID=gene:Os01g0101850;biotype=protein_coding;description=Hypothetical protein. (Os01t0101850-00);gene_id=Os01g0101850;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 86211 88583 . - . ID=transcript:Os01t0101850-00;Parent=gene:Os01g0101850;biotype=protein_coding;transcript_id=Os01t0101850-00
|
||||
1 irgsp exon 86211 86277 . - . Parent=transcript:Os01t0101850-00;Name=Os01t0101850-00.exon4;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0101850-00.exon4;rank=4
|
||||
1 irgsp three_prime_UTR 86211 86277 . - . Parent=transcript:Os01t0101850-00
|
||||
1 irgsp three_prime_UTR 86384 87326 . - . Parent=transcript:Os01t0101850-00
|
||||
1 irgsp exon 86384 87694 . - . Parent=transcript:Os01t0101850-00;Name=Os01t0101850-00.exon3;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0101850-00.exon3;rank=3
|
||||
1 irgsp CDS 87327 87662 . - 0 ID=CDS:Os01t0101850-00;Parent=transcript:Os01t0101850-00;protein_id=Os01t0101850-00
|
||||
1 irgsp five_prime_UTR 87663 87694 . - . Parent=transcript:Os01t0101850-00
|
||||
1 irgsp exon 88308 88396 . - . Parent=transcript:Os01t0101850-00;Name=Os01t0101850-00.exon2;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0101850-00.exon2;rank=2
|
||||
1 irgsp five_prime_UTR 88308 88396 . - . Parent=transcript:Os01t0101850-00
|
||||
1 irgsp exon 88496 88583 . - . Parent=transcript:Os01t0101850-00;Name=Os01t0101850-00.exon1;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0101850-00.exon1;rank=1
|
||||
1 irgsp five_prime_UTR 88496 88583 . - . Parent=transcript:Os01t0101850-00
|
||||
###
|
||||
1 irgsp gene 88883 89228 . - . ID=gene:Os01g0101900;biotype=protein_coding;description=Similar to OSIGBa0075F02.3 protein. (Os01t0101900-00);gene_id=Os01g0101900;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 88883 89228 . - . ID=transcript:Os01t0101900-00;Parent=gene:Os01g0101900;biotype=protein_coding;transcript_id=Os01t0101900-00
|
||||
1 irgsp three_prime_UTR 88883 88985 . - . Parent=transcript:Os01t0101900-00
|
||||
1 irgsp exon 88883 89228 . - . Parent=transcript:Os01t0101900-00;Name=Os01t0101900-00.exon1;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0101900-00.exon1;rank=1
|
||||
1 irgsp CDS 88986 89204 . - 0 ID=CDS:Os01t0101900-00;Parent=transcript:Os01t0101900-00;protein_id=Os01t0101900-00
|
||||
1 irgsp five_prime_UTR 89205 89228 . - . Parent=transcript:Os01t0101900-00
|
||||
###
|
||||
1 irgsp gene 89763 91465 . - . ID=gene:Os01g0102000;Name=NON-SPECIFIC PHOSPHOLIPASE C5;biotype=protein_coding;description=Phosphoesterase family protein. (Os01t0102000-01);gene_id=Os01g0102000;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 89763 91465 . - . ID=transcript:Os01t0102000-01;Parent=gene:Os01g0102000;biotype=protein_coding;transcript_id=Os01t0102000-01
|
||||
1 irgsp three_prime_UTR 89763 89824 . - . Parent=transcript:Os01t0102000-01
|
||||
1 irgsp exon 89763 91465 . - . Parent=transcript:Os01t0102000-01;Name=Os01t0102000-01.exon1;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0102000-01.exon1;rank=1
|
||||
1 irgsp CDS 89825 91411 . - 0 ID=CDS:Os01t0102000-01;Parent=transcript:Os01t0102000-01;protein_id=Os01t0102000-01
|
||||
1 irgsp five_prime_UTR 91412 91465 . - . Parent=transcript:Os01t0102000-01
|
||||
###
|
||||
1 irgsp gene 134300 135439 . + . ID=gene:Os01g0102300;Name=OsTLP27;biotype=protein_coding;description=Thylakoid lumen protein%2C Photosynthesis and chloroplast development (Os01t0102300-01);gene_id=Os01g0102300;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 134300 135439 . + . ID=transcript:Os01t0102300-01;Parent=gene:Os01g0102300;biotype=protein_coding;transcript_id=Os01t0102300-01
|
||||
1 irgsp five_prime_UTR 134300 134310 . + . Parent=transcript:Os01t0102300-01
|
||||
1 irgsp exon 134300 134615 . + . Parent=transcript:Os01t0102300-01;Name=Os01t0102300-01.exon1;constitutive=1;ensembl_end_phase=2;ensembl_phase=-1;exon_id=Os01t0102300-01.exon1;rank=1
|
||||
1 irgsp CDS 134311 134615 . + 0 ID=CDS:Os01t0102300-01;Parent=transcript:Os01t0102300-01;protein_id=Os01t0102300-01
|
||||
1 irgsp exon 134698 134824 . + . Parent=transcript:Os01t0102300-01;Name=Os01t0102300-01.exon2;constitutive=1;ensembl_end_phase=0;ensembl_phase=2;exon_id=Os01t0102300-01.exon2;rank=2
|
||||
1 irgsp CDS 134698 134824 . + 1 ID=CDS:Os01t0102300-01;Parent=transcript:Os01t0102300-01;protein_id=Os01t0102300-01
|
||||
1 irgsp CDS 134912 135253 . + 0 ID=CDS:Os01t0102300-01;Parent=transcript:Os01t0102300-01;protein_id=Os01t0102300-01
|
||||
1 irgsp exon 134912 135439 . + . Parent=transcript:Os01t0102300-01;Name=Os01t0102300-01.exon3;constitutive=1;ensembl_end_phase=-1;ensembl_phase=0;exon_id=Os01t0102300-01.exon3;rank=3
|
||||
1 irgsp three_prime_UTR 135254 135439 . + . Parent=transcript:Os01t0102300-01
|
||||
###
|
||||
1 irgsp gene 139826 141555 . + . ID=gene:Os01g0102400;Name=HAP5H SUBUNIT OF CCAAT-BOX BINDING COMPLEX;biotype=protein_coding;description=Histone-fold domain containing protein. (Os01t0102400-01);gene_id=Os01g0102400;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 139826 141555 . + . ID=transcript:Os01t0102400-01;Parent=gene:Os01g0102400;biotype=protein_coding;transcript_id=Os01t0102400-01
|
||||
1 irgsp exon 139826 139906 . + . Parent=transcript:Os01t0102400-01;Name=Os01t0102400-01.exon1;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0102400-01.exon1;rank=1
|
||||
1 irgsp five_prime_UTR 139826 139906 . + . Parent=transcript:Os01t0102400-01
|
||||
1 irgsp five_prime_UTR 140120 140149 . + . Parent=transcript:Os01t0102400-01
|
||||
1 irgsp exon 140120 141555 . + . Parent=transcript:Os01t0102400-01;Name=Os01t0102400-01.exon2;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0102400-01.exon2;rank=2
|
||||
1 irgsp CDS 140150 141415 . + 0 ID=CDS:Os01t0102400-01;Parent=transcript:Os01t0102400-01;protein_id=Os01t0102400-01
|
||||
1 irgsp three_prime_UTR 141416 141555 . + . Parent=transcript:Os01t0102400-01
|
||||
###
|
||||
1 irgsp gene 141959 144554 . + . ID=gene:Os01g0102500;biotype=protein_coding;description=Conserved hypothetical protein. (Os01t0102500-01);gene_id=Os01g0102500;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 141959 144554 . + . ID=transcript:Os01t0102500-01;Parent=gene:Os01g0102500;biotype=protein_coding;transcript_id=Os01t0102500-01
|
||||
1 irgsp five_prime_UTR 141959 142083 . + . Parent=transcript:Os01t0102500-01
|
||||
1 irgsp exon 141959 142631 . + . Parent=transcript:Os01t0102500-01;Name=Os01t0102500-01.exon1;constitutive=1;ensembl_end_phase=2;ensembl_phase=-1;exon_id=Os01t0102500-01.exon1;rank=1
|
||||
1 irgsp CDS 142084 142631 . + 0 ID=CDS:Os01t0102500-01;Parent=transcript:Os01t0102500-01;protein_id=Os01t0102500-01
|
||||
1 irgsp exon 143191 143431 . + . Parent=transcript:Os01t0102500-01;Name=Os01t0102500-01.exon2;constitutive=1;ensembl_end_phase=0;ensembl_phase=2;exon_id=Os01t0102500-01.exon2;rank=2
|
||||
1 irgsp CDS 143191 143431 . + 1 ID=CDS:Os01t0102500-01;Parent=transcript:Os01t0102500-01;protein_id=Os01t0102500-01
|
||||
1 irgsp exon 143563 143680 . + . Parent=transcript:Os01t0102500-01;Name=Os01t0102500-01.exon3;constitutive=1;ensembl_end_phase=1;ensembl_phase=0;exon_id=Os01t0102500-01.exon3;rank=3
|
||||
1 irgsp CDS 143563 143680 . + 0 ID=CDS:Os01t0102500-01;Parent=transcript:Os01t0102500-01;protein_id=Os01t0102500-01
|
||||
1 irgsp CDS 143817 143908 . + 2 ID=CDS:Os01t0102500-01;Parent=transcript:Os01t0102500-01;protein_id=Os01t0102500-01
|
||||
1 irgsp exon 143817 144554 . + . Parent=transcript:Os01t0102500-01;Name=Os01t0102500-01.exon4;constitutive=1;ensembl_end_phase=-1;ensembl_phase=1;exon_id=Os01t0102500-01.exon4;rank=4
|
||||
1 irgsp three_prime_UTR 143909 144554 . + . Parent=transcript:Os01t0102500-01
|
||||
###
|
||||
1 irgsp gene 145603 147847 . + . ID=gene:Os01g0102600;Name=Shikimate kinase 4;biotype=protein_coding;description=Shikimate kinase domain containing protein. (Os01t0102600-01)%3BSimilar to shikimate kinase family protein. (Os01t0102600-02);gene_id=Os01g0102600;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 145603 147847 . + . ID=transcript:Os01t0102600-01;Parent=gene:Os01g0102600;biotype=protein_coding;transcript_id=Os01t0102600-01
|
||||
1 irgsp five_prime_UTR 145603 145644 . + . Parent=transcript:Os01t0102600-01
|
||||
1 irgsp exon 145603 145786 . + . Parent=transcript:Os01t0102600-01;Name=Os01t0102600-01.exon1;constitutive=0;ensembl_end_phase=1;ensembl_phase=-1;exon_id=Os01t0102600-01.exon1;rank=1
|
||||
1 irgsp CDS 145645 145786 . + 0 ID=CDS:Os01t0102600-01;Parent=transcript:Os01t0102600-01;protein_id=Os01t0102600-01
|
||||
1 irgsp exon 145905 145951 . + . Parent=transcript:Os01t0102600-01;Name=Os01t0102600-01.exon2;constitutive=0;ensembl_end_phase=0;ensembl_phase=1;exon_id=Os01t0102600-01.exon2;rank=2
|
||||
1 irgsp CDS 145905 145951 . + 2 ID=CDS:Os01t0102600-01;Parent=transcript:Os01t0102600-01;protein_id=Os01t0102600-01
|
||||
1 irgsp exon 146028 146082 . + . Parent=transcript:Os01t0102600-01;Name=Os01t0102600-01.exon3;constitutive=0;ensembl_end_phase=1;ensembl_phase=0;exon_id=Os01t0102600-01.exon3;rank=3
|
||||
1 irgsp CDS 146028 146082 . + 0 ID=CDS:Os01t0102600-01;Parent=transcript:Os01t0102600-01;protein_id=Os01t0102600-01
|
||||
1 irgsp exon 146179 146339 . + . Parent=transcript:Os01t0102600-01;Name=Os01t0102600-01.exon4;constitutive=0;ensembl_end_phase=0;ensembl_phase=1;exon_id=Os01t0102600-01.exon4;rank=4
|
||||
1 irgsp CDS 146179 146339 . + 2 ID=CDS:Os01t0102600-01;Parent=transcript:Os01t0102600-01;protein_id=Os01t0102600-01
|
||||
1 irgsp exon 146450 146532 . + . Parent=transcript:Os01t0102600-01;Name=Os01t0102600-01.exon5;constitutive=0;ensembl_end_phase=2;ensembl_phase=0;exon_id=Os01t0102600-01.exon5;rank=5
|
||||
1 irgsp CDS 146450 146532 . + 0 ID=CDS:Os01t0102600-01;Parent=transcript:Os01t0102600-01;protein_id=Os01t0102600-01
|
||||
1 irgsp exon 146611 146719 . + . Parent=transcript:Os01t0102600-01;Name=Os01t0102600-01.exon6;constitutive=0;ensembl_end_phase=0;ensembl_phase=2;exon_id=Os01t0102600-01.exon6;rank=6
|
||||
1 irgsp CDS 146611 146719 . + 1 ID=CDS:Os01t0102600-01;Parent=transcript:Os01t0102600-01;protein_id=Os01t0102600-01
|
||||
1 irgsp exon 147106 147184 . + . Parent=transcript:Os01t0102600-01;Name=Os01t0102600-01.exon7;constitutive=0;ensembl_end_phase=1;ensembl_phase=0;exon_id=Os01t0102600-01.exon7;rank=7
|
||||
1 irgsp CDS 147106 147184 . + 0 ID=CDS:Os01t0102600-01;Parent=transcript:Os01t0102600-01;protein_id=Os01t0102600-01
|
||||
1 irgsp exon 147311 147375 . + . Parent=transcript:Os01t0102600-01;Name=Os01t0102600-02.exon2;constitutive=1;ensembl_end_phase=0;ensembl_phase=1;exon_id=Os01t0102600-02.exon2;rank=8
|
||||
1 irgsp CDS 147311 147375 . + 2 ID=CDS:Os01t0102600-01;Parent=transcript:Os01t0102600-01;protein_id=Os01t0102600-01
|
||||
1 irgsp CDS 147507 147575 . + 0 ID=CDS:Os01t0102600-01;Parent=transcript:Os01t0102600-01;protein_id=Os01t0102600-01
|
||||
1 irgsp exon 147507 147847 . + . Parent=transcript:Os01t0102600-01;Name=Os01t0102600-01.exon9;constitutive=0;ensembl_end_phase=-1;ensembl_phase=0;exon_id=Os01t0102600-01.exon9;rank=9
|
||||
1 irgsp three_prime_UTR 147576 147847 . + . Parent=transcript:Os01t0102600-01
|
||||
1 irgsp mRNA 147104 147805 . + . ID=transcript:Os01t0102600-02;Parent=gene:Os01g0102600;biotype=protein_coding;transcript_id=Os01t0102600-02
|
||||
1 irgsp five_prime_UTR 147104 147105 . + . Parent=transcript:Os01t0102600-02
|
||||
1 irgsp exon 147104 147184 . + . Parent=transcript:Os01t0102600-02;Name=Os01t0102600-02.exon1;constitutive=0;ensembl_end_phase=1;ensembl_phase=-1;exon_id=Os01t0102600-02.exon1;rank=1
|
||||
1 irgsp CDS 147106 147184 . + 0 ID=CDS:Os01t0102600-02;Parent=transcript:Os01t0102600-02;protein_id=Os01t0102600-02
|
||||
1 irgsp exon 147311 147375 . + . Parent=transcript:Os01t0102600-02;Name=Os01t0102600-02.exon2;constitutive=1;ensembl_end_phase=0;ensembl_phase=1;exon_id=Os01t0102600-02.exon2;rank=2
|
||||
1 irgsp CDS 147311 147375 . + 2 ID=CDS:Os01t0102600-02;Parent=transcript:Os01t0102600-02;protein_id=Os01t0102600-02
|
||||
1 irgsp CDS 147507 147575 . + 0 ID=CDS:Os01t0102600-02;Parent=transcript:Os01t0102600-02;protein_id=Os01t0102600-02
|
||||
1 irgsp exon 147507 147805 . + . Parent=transcript:Os01t0102600-02;Name=Os01t0102600-02.exon3;constitutive=0;ensembl_end_phase=-1;ensembl_phase=0;exon_id=Os01t0102600-02.exon3;rank=3
|
||||
1 irgsp three_prime_UTR 147576 147805 . + . Parent=transcript:Os01t0102600-02
|
||||
###
|
||||
1 irgsp gene 148085 150568 . + . ID=gene:Os01g0102700;biotype=protein_coding;description=Translocon-associated beta family protein. (Os01t0102700-01);gene_id=Os01g0102700;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 148085 150568 . + . ID=transcript:Os01t0102700-01;Parent=gene:Os01g0102700;biotype=protein_coding;transcript_id=Os01t0102700-01
|
||||
1 irgsp five_prime_UTR 148085 148146 . + . Parent=transcript:Os01t0102700-01
|
||||
1 irgsp exon 148085 148313 . + . Parent=transcript:Os01t0102700-01;Name=Os01t0102700-01.exon1;constitutive=1;ensembl_end_phase=2;ensembl_phase=-1;exon_id=Os01t0102700-01.exon1;rank=1
|
||||
1 irgsp CDS 148147 148313 . + 0 ID=CDS:Os01t0102700-01;Parent=transcript:Os01t0102700-01;protein_id=Os01t0102700-01
|
||||
1 irgsp exon 149450 149548 . + . Parent=transcript:Os01t0102700-01;Name=Os01t0102700-01.exon2;constitutive=1;ensembl_end_phase=2;ensembl_phase=2;exon_id=Os01t0102700-01.exon2;rank=2
|
||||
1 irgsp CDS 149450 149548 . + 1 ID=CDS:Os01t0102700-01;Parent=transcript:Os01t0102700-01;protein_id=Os01t0102700-01
|
||||
1 irgsp exon 149634 149742 . + . Parent=transcript:Os01t0102700-01;Name=Os01t0102700-01.exon3;constitutive=1;ensembl_end_phase=0;ensembl_phase=2;exon_id=Os01t0102700-01.exon3;rank=3
|
||||
1 irgsp CDS 149634 149742 . + 1 ID=CDS:Os01t0102700-01;Parent=transcript:Os01t0102700-01;protein_id=Os01t0102700-01
|
||||
1 irgsp exon 149856 149931 . + . Parent=transcript:Os01t0102700-01;Name=Os01t0102700-01.exon4;constitutive=1;ensembl_end_phase=1;ensembl_phase=0;exon_id=Os01t0102700-01.exon4;rank=4
|
||||
1 irgsp CDS 149856 149931 . + 0 ID=CDS:Os01t0102700-01;Parent=transcript:Os01t0102700-01;protein_id=Os01t0102700-01
|
||||
1 irgsp CDS 150152 150318 . + 2 ID=CDS:Os01t0102700-01;Parent=transcript:Os01t0102700-01;protein_id=Os01t0102700-01
|
||||
1 irgsp exon 150152 150568 . + . Parent=transcript:Os01t0102700-01;Name=Os01t0102700-01.exon5;constitutive=1;ensembl_end_phase=-1;ensembl_phase=1;exon_id=Os01t0102700-01.exon5;rank=5
|
||||
1 irgsp three_prime_UTR 150319 150568 . + . Parent=transcript:Os01t0102700-01
|
||||
###
|
||||
1 irgsp gene 152853 156449 . + . ID=gene:Os01g0102800;Name=Cockayne syndrome WD-repeat protein;biotype=protein_coding;description=Similar to chromatin remodeling complex subunit. (Os01t0102800-01);gene_id=Os01g0102800;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 152853 156449 . + . ID=transcript:Os01t0102800-01;Parent=gene:Os01g0102800;biotype=protein_coding;transcript_id=Os01t0102800-01
|
||||
1 irgsp five_prime_UTR 152853 152853 . + . Parent=transcript:Os01t0102800-01
|
||||
1 irgsp exon 152853 153025 . + . Parent=transcript:Os01t0102800-01;Name=Os01t0102800-01.exon1;constitutive=1;ensembl_end_phase=1;ensembl_phase=-1;exon_id=Os01t0102800-01.exon1;rank=1
|
||||
1 irgsp CDS 152854 153025 . + 0 ID=CDS:Os01t0102800-01;Parent=transcript:Os01t0102800-01;protein_id=Os01t0102800-01
|
||||
1 irgsp exon 153178 154646 . + . Parent=transcript:Os01t0102800-01;Name=Os01t0102800-01.exon2;constitutive=1;ensembl_end_phase=0;ensembl_phase=1;exon_id=Os01t0102800-01.exon2;rank=2
|
||||
1 irgsp CDS 153178 154646 . + 2 ID=CDS:Os01t0102800-01;Parent=transcript:Os01t0102800-01;protein_id=Os01t0102800-01
|
||||
1 irgsp exon 155010 155450 . + . Parent=transcript:Os01t0102800-01;Name=Os01t0102800-01.exon3;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0102800-01.exon3;rank=3
|
||||
1 irgsp CDS 155010 155450 . + 0 ID=CDS:Os01t0102800-01;Parent=transcript:Os01t0102800-01;protein_id=Os01t0102800-01
|
||||
1 irgsp CDS 155543 156214 . + 0 ID=CDS:Os01t0102800-01;Parent=transcript:Os01t0102800-01;protein_id=Os01t0102800-01
|
||||
1 irgsp exon 155543 156449 . + . Parent=transcript:Os01t0102800-01;Name=Os01t0102800-01.exon4;constitutive=1;ensembl_end_phase=-1;ensembl_phase=0;exon_id=Os01t0102800-01.exon4;rank=4
|
||||
1 irgsp three_prime_UTR 156215 156449 . + . Parent=transcript:Os01t0102800-01
|
||||
###
|
||||
1 irgsp gene 164577 168921 . + . ID=gene:Os01g0102850;biotype=protein_coding;description=Similar to nitrilase 2. (Os01t0102850-00);gene_id=Os01g0102850;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 164577 168921 . + . ID=transcript:Os01t0102850-00;Parent=gene:Os01g0102850;biotype=protein_coding;transcript_id=Os01t0102850-00
|
||||
1 irgsp exon 164577 164905 . + . Parent=transcript:Os01t0102850-00;Name=Os01t0102850-00.exon1;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0102850-00.exon1;rank=1
|
||||
1 irgsp five_prime_UTR 164577 164905 . + . Parent=transcript:Os01t0102850-00
|
||||
1 irgsp five_prime_UTR 168499 168804 . + . Parent=transcript:Os01t0102850-00
|
||||
1 irgsp exon 168499 168921 . + . Parent=transcript:Os01t0102850-00;Name=Os01t0102850-00.exon2;constitutive=1;ensembl_end_phase=0;ensembl_phase=-1;exon_id=Os01t0102850-00.exon2;rank=2
|
||||
1 irgsp CDS 168805 168921 . + 0 ID=CDS:Os01t0102850-00;Parent=transcript:Os01t0102850-00;protein_id=Os01t0102850-00
|
||||
###
|
||||
1 irgsp gene 169390 170316 . - . ID=gene:Os01g0102900;Name=LIGHT-REGULATED GENE 1;biotype=protein_coding;description=Light-regulated protein%2C Regulation of light-dependent attachment of LEAF-TYPE FERREDOXIN-NADP+ OXIDOREDUCTASE (LFNR) to the thylakoid membrane (Os01t0102900-01);gene_id=Os01g0102900;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 169390 170316 . - . ID=transcript:Os01t0102900-01;Parent=gene:Os01g0102900;biotype=protein_coding;transcript_id=Os01t0102900-01
|
||||
1 irgsp three_prime_UTR 169390 169598 . - . Parent=transcript:Os01t0102900-01
|
||||
1 irgsp exon 169390 169656 . - . Parent=transcript:Os01t0102900-01;Name=Os01t0102900-01.exon3;constitutive=1;ensembl_end_phase=-1;ensembl_phase=2;exon_id=Os01t0102900-01.exon3;rank=3
|
||||
1 irgsp CDS 169599 169656 . - 1 ID=CDS:Os01t0102900-01;Parent=transcript:Os01t0102900-01;protein_id=Os01t0102900-01
|
||||
1 irgsp exon 169751 169909 . - . Parent=transcript:Os01t0102900-01;Name=Os01t0102900-01.exon2;constitutive=1;ensembl_end_phase=2;ensembl_phase=2;exon_id=Os01t0102900-01.exon2;rank=2
|
||||
1 irgsp CDS 169751 169909 . - 1 ID=CDS:Os01t0102900-01;Parent=transcript:Os01t0102900-01;protein_id=Os01t0102900-01
|
||||
1 irgsp CDS 170091 170260 . - 0 ID=CDS:Os01t0102900-01;Parent=transcript:Os01t0102900-01;protein_id=Os01t0102900-01
|
||||
1 irgsp exon 170091 170316 . - . Parent=transcript:Os01t0102900-01;Name=Os01t0102900-01.exon1;constitutive=1;ensembl_end_phase=2;ensembl_phase=-1;exon_id=Os01t0102900-01.exon1;rank=1
|
||||
1 irgsp five_prime_UTR 170261 170316 . - . Parent=transcript:Os01t0102900-01
|
||||
###
|
||||
1 irgsp gene 170798 173144 . - . ID=gene:Os01g0103000;biotype=protein_coding;description=Snf7 family protein. (Os01t0103000-01);gene_id=Os01g0103000;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 170798 173144 . - . ID=transcript:Os01t0103000-01;Parent=gene:Os01g0103000;biotype=protein_coding;transcript_id=Os01t0103000-01
|
||||
1 irgsp three_prime_UTR 170798 171044 . - . Parent=transcript:Os01t0103000-01
|
||||
1 irgsp exon 170798 171095 . - . Parent=transcript:Os01t0103000-01;Name=Os01t0103000-01.exon7;constitutive=1;ensembl_end_phase=-1;ensembl_phase=0;exon_id=Os01t0103000-01.exon7;rank=7
|
||||
1 irgsp CDS 171045 171095 . - 0 ID=CDS:Os01t0103000-01;Parent=transcript:Os01t0103000-01;protein_id=Os01t0103000-01
|
||||
1 irgsp exon 171406 171554 . - . Parent=transcript:Os01t0103000-01;Name=Os01t0103000-01.exon6;constitutive=1;ensembl_end_phase=0;ensembl_phase=1;exon_id=Os01t0103000-01.exon6;rank=6
|
||||
1 irgsp CDS 171406 171554 . - 2 ID=CDS:Os01t0103000-01;Parent=transcript:Os01t0103000-01;protein_id=Os01t0103000-01
|
||||
1 irgsp exon 171764 171875 . - . Parent=transcript:Os01t0103000-01;Name=Os01t0103000-01.exon5;constitutive=1;ensembl_end_phase=1;ensembl_phase=0;exon_id=Os01t0103000-01.exon5;rank=5
|
||||
1 irgsp CDS 171764 171875 . - 0 ID=CDS:Os01t0103000-01;Parent=transcript:Os01t0103000-01;protein_id=Os01t0103000-01
|
||||
1 irgsp exon 172398 172469 . - . Parent=transcript:Os01t0103000-01;Name=Os01t0103000-01.exon4;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0103000-01.exon4;rank=4
|
||||
1 irgsp CDS 172398 172469 . - 0 ID=CDS:Os01t0103000-01;Parent=transcript:Os01t0103000-01;protein_id=Os01t0103000-01
|
||||
1 irgsp exon 172578 172671 . - . Parent=transcript:Os01t0103000-01;Name=Os01t0103000-01.exon3;constitutive=1;ensembl_end_phase=0;ensembl_phase=2;exon_id=Os01t0103000-01.exon3;rank=3
|
||||
1 irgsp CDS 172578 172671 . - 1 ID=CDS:Os01t0103000-01;Parent=transcript:Os01t0103000-01;protein_id=Os01t0103000-01
|
||||
1 irgsp exon 172770 172921 . - . Parent=transcript:Os01t0103000-01;Name=Os01t0103000-01.exon2;constitutive=1;ensembl_end_phase=2;ensembl_phase=0;exon_id=Os01t0103000-01.exon2;rank=2
|
||||
1 irgsp CDS 172770 172921 . - 0 ID=CDS:Os01t0103000-01;Parent=transcript:Os01t0103000-01;protein_id=Os01t0103000-01
|
||||
1 irgsp CDS 173004 173072 . - 0 ID=CDS:Os01t0103000-01;Parent=transcript:Os01t0103000-01;protein_id=Os01t0103000-01
|
||||
1 irgsp exon 173004 173144 . - . Parent=transcript:Os01t0103000-01;Name=Os01t0103000-01.exon1;constitutive=1;ensembl_end_phase=0;ensembl_phase=-1;exon_id=Os01t0103000-01.exon1;rank=1
|
||||
1 irgsp five_prime_UTR 173073 173144 . - . Parent=transcript:Os01t0103000-01
|
||||
###
|
||||
1 irgsp gene 178607 180575 . + . ID=gene:Os01g0103100;biotype=protein_coding;description=TGF-beta receptor%2C type I/II extracellular region family protein. (Os01t0103100-01)%3BSimilar to predicted protein. (Os01t0103100-02);gene_id=Os01g0103100;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 178607 180548 . + . ID=transcript:Os01t0103100-01;Parent=gene:Os01g0103100;biotype=protein_coding;transcript_id=Os01t0103100-01
|
||||
1 irgsp five_prime_UTR 178607 178641 . + . Parent=transcript:Os01t0103100-01
|
||||
1 irgsp exon 178607 180548 . + . Parent=transcript:Os01t0103100-01;Name=Os01t0103100-01.exon1;constitutive=0;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0103100-01.exon1;rank=1
|
||||
1 irgsp CDS 178642 180462 . + 0 ID=CDS:Os01t0103100-01;Parent=transcript:Os01t0103100-01;protein_id=Os01t0103100-01
|
||||
1 irgsp three_prime_UTR 180463 180548 . + . Parent=transcript:Os01t0103100-01
|
||||
1 irgsp mRNA 178652 180575 . + . ID=transcript:Os01t0103100-02;Parent=gene:Os01g0103100;biotype=protein_coding;transcript_id=Os01t0103100-02
|
||||
1 irgsp five_prime_UTR 178652 178677 . + . Parent=transcript:Os01t0103100-02
|
||||
1 irgsp exon 178652 180575 . + . Parent=transcript:Os01t0103100-02;Name=Os01t0103100-02.exon1;constitutive=0;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0103100-02.exon1;rank=1
|
||||
1 irgsp CDS 178678 180462 . + 0 ID=CDS:Os01t0103100-02;Parent=transcript:Os01t0103100-02;protein_id=Os01t0103100-02
|
||||
1 irgsp three_prime_UTR 180463 180575 . + . Parent=transcript:Os01t0103100-02
|
||||
###
|
||||
1 irgsp gene 178815 180433 . - . ID=gene:Os01g0103075;biotype=protein_coding;description=Hypothetical protein. (Os01t0103075-00);gene_id=Os01g0103075;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 178815 180433 . - . ID=transcript:Os01t0103075-00;Parent=gene:Os01g0103075;biotype=protein_coding;transcript_id=Os01t0103075-00
|
||||
1 irgsp three_prime_UTR 178815 179511 . - . Parent=transcript:Os01t0103075-00
|
||||
1 irgsp exon 178815 180433 . - . Parent=transcript:Os01t0103075-00;Name=Os01t0103075-00.exon1;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0103075-00.exon1;rank=1
|
||||
1 irgsp CDS 179512 180054 . - 0 ID=CDS:Os01t0103075-00;Parent=transcript:Os01t0103075-00;protein_id=Os01t0103075-00
|
||||
1 irgsp five_prime_UTR 180055 180433 . - . Parent=transcript:Os01t0103075-00
|
||||
###
|
||||
1 Ensembl_Plants ncRNA_gene 182074 182154 . + . ID=gene:ENSRNA049442722;Name=tRNA-Leu;biotype=tRNA;description=tRNA-Leu for anticodon AAG;gene_id=ENSRNA049442722;logic_name=trnascan_gene
|
||||
1 Ensembl_Plants tRNA 182074 182154 . + . ID=transcript:ENSRNA049442722-T1;Parent=gene:ENSRNA049442722;biotype=tRNA;transcript_id=ENSRNA049442722-T1
|
||||
1 Ensembl_Plants exon 182074 182154 . + . Parent=transcript:ENSRNA049442722-T1;Name=ENSRNA049442722-E1;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=ENSRNA049442722-E1;rank=1
|
||||
###
|
||||
1 irgsp gene 185189 185828 . - . ID=gene:Os01g0103400;biotype=protein_coding;description=Hypothetical gene. (Os01t0103400-01);gene_id=Os01g0103400;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 185189 185828 . - . ID=transcript:Os01t0103400-01;Parent=gene:Os01g0103400;biotype=protein_coding;transcript_id=Os01t0103400-01
|
||||
1 irgsp three_prime_UTR 185189 185434 . - . Parent=transcript:Os01t0103400-01
|
||||
1 irgsp exon 185189 185828 . - . Parent=transcript:Os01t0103400-01;Name=Os01t0103400-01.exon1;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0103400-01.exon1;rank=1
|
||||
1 irgsp CDS 185435 185827 . - 0 ID=CDS:Os01t0103400-01;Parent=transcript:Os01t0103400-01;protein_id=Os01t0103400-01
|
||||
1 irgsp five_prime_UTR 185828 185828 . - . Parent=transcript:Os01t0103400-01
|
||||
###
|
||||
1 irgsp repeat_region 186000 186100 . + . ID=fakeRepeat2
|
||||
###
|
||||
1 irgsp gene 186250 190904 . - . ID=gene:Os01g0103600;biotype=protein_coding;description=Similar to sterol-8%2C7-isomerase. (Os01t0103600-01)%3BEmopamil-binding family protein. (Os01t0103600-02);gene_id=Os01g0103600;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 186250 190262 . - . ID=transcript:Os01t0103600-02;Parent=gene:Os01g0103600;biotype=protein_coding;transcript_id=Os01t0103600-02
|
||||
1 irgsp three_prime_UTR 186250 186515 . - . Parent=transcript:Os01t0103600-02
|
||||
1 irgsp exon 186250 186771 . - . Parent=transcript:Os01t0103600-02;Name=Os01t0103600-02.exon4;constitutive=0;ensembl_end_phase=-1;ensembl_phase=2;exon_id=Os01t0103600-02.exon4;rank=4
|
||||
1 irgsp CDS 186516 186771 . - 1 ID=CDS:Os01t0103600-02;Parent=transcript:Os01t0103600-02;protein_id=Os01t0103600-02
|
||||
1 irgsp exon 189607 189715 . - . Parent=transcript:Os01t0103600-02;Name=Os01t0103600-02.exon3;constitutive=0;ensembl_end_phase=2;ensembl_phase=1;exon_id=Os01t0103600-02.exon3;rank=3
|
||||
1 irgsp CDS 189607 189715 . - 2 ID=CDS:Os01t0103600-02;Parent=transcript:Os01t0103600-02;protein_id=Os01t0103600-02
|
||||
1 irgsp exon 189841 189990 . - . Parent=transcript:Os01t0103600-02;Name=Os01t0103600-02.exon2;constitutive=1;ensembl_end_phase=1;ensembl_phase=1;exon_id=Os01t0103600-02.exon2;rank=2
|
||||
1 irgsp CDS 189841 189990 . - 2 ID=CDS:Os01t0103600-02;Parent=transcript:Os01t0103600-02;protein_id=Os01t0103600-02
|
||||
1 irgsp CDS 190087 190231 . - 0 ID=CDS:Os01t0103600-02;Parent=transcript:Os01t0103600-02;protein_id=Os01t0103600-02
|
||||
1 irgsp exon 190087 190262 . - . Parent=transcript:Os01t0103600-02;Name=Os01t0103600-02.exon1;constitutive=0;ensembl_end_phase=1;ensembl_phase=-1;exon_id=Os01t0103600-02.exon1;rank=1
|
||||
1 irgsp five_prime_UTR 190232 190262 . - . Parent=transcript:Os01t0103600-02
|
||||
1 irgsp mRNA 187345 190904 . - . ID=transcript:Os01t0103600-01;Parent=gene:Os01g0103600;biotype=protein_coding;transcript_id=Os01t0103600-01
|
||||
1 irgsp three_prime_UTR 187345 189395 . - . Parent=transcript:Os01t0103600-01
|
||||
1 irgsp exon 187345 189715 . - . Parent=transcript:Os01t0103600-01;Name=Os01t0103600-01.exon3;constitutive=0;ensembl_end_phase=-1;ensembl_phase=1;exon_id=Os01t0103600-01.exon3;rank=3
|
||||
1 irgsp CDS 189396 189715 . - 2 ID=CDS:Os01t0103600-01;Parent=transcript:Os01t0103600-01;protein_id=Os01t0103600-01
|
||||
1 irgsp exon 189841 189990 . - . Parent=transcript:Os01t0103600-01;Name=Os01t0103600-02.exon2;constitutive=1;ensembl_end_phase=1;ensembl_phase=1;exon_id=Os01t0103600-02.exon2;rank=2
|
||||
1 irgsp CDS 189841 189990 . - 2 ID=CDS:Os01t0103600-01;Parent=transcript:Os01t0103600-01;protein_id=Os01t0103600-01
|
||||
1 irgsp CDS 190087 190231 . - 0 ID=CDS:Os01t0103600-01;Parent=transcript:Os01t0103600-01;protein_id=Os01t0103600-01
|
||||
1 irgsp exon 190087 190904 . - . Parent=transcript:Os01t0103600-01;Name=Os01t0103600-01.exon1;constitutive=0;ensembl_end_phase=1;ensembl_phase=-1;exon_id=Os01t0103600-01.exon1;rank=1
|
||||
1 irgsp five_prime_UTR 190232 190904 . - . Parent=transcript:Os01t0103600-01
|
||||
###
|
||||
1 irgsp gene 187545 188586 . + . ID=gene:Os01g0103650;biotype=protein_coding;description=Hypothetical gene. (Os01t0103650-00);gene_id=Os01g0103650;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 187545 188586 . + . ID=transcript:Os01t0103650-00;Parent=gene:Os01g0103650;biotype=protein_coding;transcript_id=Os01t0103650-00
|
||||
1 irgsp five_prime_UTR 187545 187546 . + . Parent=transcript:Os01t0103650-00
|
||||
1 irgsp exon 187545 188020 . + . Parent=transcript:Os01t0103650-00;Name=Os01t0103650-00.exon1;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0103650-00.exon1;rank=1
|
||||
1 irgsp CDS 187547 187768 . + 0 ID=CDS:Os01t0103650-00;Parent=transcript:Os01t0103650-00;protein_id=Os01t0103650-00
|
||||
1 irgsp three_prime_UTR 187769 188020 . + . Parent=transcript:Os01t0103650-00
|
||||
1 irgsp exon 188060 188385 . + . Parent=transcript:Os01t0103650-00;Name=Os01t0103650-00.exon2;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0103650-00.exon2;rank=2
|
||||
1 irgsp three_prime_UTR 188060 188385 . + . Parent=transcript:Os01t0103650-00
|
||||
1 irgsp exon 188455 188586 . + . Parent=transcript:Os01t0103650-00;Name=Os01t0103650-00.exon3;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0103650-00.exon3;rank=3
|
||||
1 irgsp three_prime_UTR 188455 188586 . + . Parent=transcript:Os01t0103650-00
|
||||
###
|
||||
1 irgsp gene 191037 196287 . + . ID=gene:Os01g0103700;biotype=protein_coding;description=Conserved hypothetical protein. (Os01t0103700-01);gene_id=Os01g0103700;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 191037 196287 . + . ID=transcript:Os01t0103700-01;Parent=gene:Os01g0103700;biotype=protein_coding;transcript_id=Os01t0103700-01
|
||||
1 irgsp exon 191037 191161 . + . Parent=transcript:Os01t0103700-01;Name=Os01t0103700-01.exon1;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0103700-01.exon1;rank=1
|
||||
1 irgsp five_prime_UTR 191037 191161 . + . Parent=transcript:Os01t0103700-01
|
||||
1 irgsp five_prime_UTR 191625 191693 . + . Parent=transcript:Os01t0103700-01
|
||||
1 irgsp exon 191625 191705 . + . Parent=transcript:Os01t0103700-01;Name=Os01t0103700-01.exon2;constitutive=1;ensembl_end_phase=0;ensembl_phase=-1;exon_id=Os01t0103700-01.exon2;rank=2
|
||||
1 irgsp CDS 191694 191705 . + 0 ID=CDS:Os01t0103700-01;Parent=transcript:Os01t0103700-01;protein_id=Os01t0103700-01
|
||||
1 irgsp exon 192399 192506 . + . Parent=transcript:Os01t0103700-01;Name=Os01t0103700-01.exon3;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0103700-01.exon3;rank=3
|
||||
1 irgsp CDS 192399 192506 . + 0 ID=CDS:Os01t0103700-01;Parent=transcript:Os01t0103700-01;protein_id=Os01t0103700-01
|
||||
1 irgsp exon 192958 193161 . + . Parent=transcript:Os01t0103700-01;Name=Os01t0103700-01.exon4;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0103700-01.exon4;rank=4
|
||||
1 irgsp CDS 192958 193161 . + 0 ID=CDS:Os01t0103700-01;Parent=transcript:Os01t0103700-01;protein_id=Os01t0103700-01
|
||||
1 irgsp exon 193248 193356 . + . Parent=transcript:Os01t0103700-01;Name=Os01t0103700-01.exon5;constitutive=1;ensembl_end_phase=1;ensembl_phase=0;exon_id=Os01t0103700-01.exon5;rank=5
|
||||
1 irgsp CDS 193248 193356 . + 0 ID=CDS:Os01t0103700-01;Parent=transcript:Os01t0103700-01;protein_id=Os01t0103700-01
|
||||
1 irgsp CDS 193434 193507 . + 2 ID=CDS:Os01t0103700-01;Parent=transcript:Os01t0103700-01;protein_id=Os01t0103700-01
|
||||
1 irgsp exon 193434 196287 . + . Parent=transcript:Os01t0103700-01;Name=Os01t0103700-01.exon6;constitutive=1;ensembl_end_phase=-1;ensembl_phase=1;exon_id=Os01t0103700-01.exon6;rank=6
|
||||
1 irgsp three_prime_UTR 193508 196287 . + . Parent=transcript:Os01t0103700-01
|
||||
###
|
||||
1 irgsp gene 197647 200803 . + . ID=gene:Os01g0103800;Name=OsDW1-01g;biotype=protein_coding;description=Conserved hypothetical protein. (Os01t0103800-01);gene_id=Os01g0103800;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 197647 200803 . + . ID=transcript:Os01t0103800-01;Parent=gene:Os01g0103800;biotype=protein_coding;transcript_id=Os01t0103800-01
|
||||
1 irgsp exon 197647 197838 . + . Parent=transcript:Os01t0103800-01;Name=Os01t0103800-01.exon1;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0103800-01.exon1;rank=1
|
||||
1 irgsp five_prime_UTR 197647 197838 . + . Parent=transcript:Os01t0103800-01
|
||||
1 irgsp five_prime_UTR 198034 198129 . + . Parent=transcript:Os01t0103800-01
|
||||
1 irgsp exon 198034 198225 . + . Parent=transcript:Os01t0103800-01;Name=Os01t0103800-01.exon2;constitutive=1;ensembl_end_phase=0;ensembl_phase=-1;exon_id=Os01t0103800-01.exon2;rank=2
|
||||
1 irgsp CDS 198130 198225 . + 0 ID=CDS:Os01t0103800-01;Parent=transcript:Os01t0103800-01;protein_id=Os01t0103800-01
|
||||
1 irgsp exon 198830 200036 . + . Parent=transcript:Os01t0103800-01;Name=Os01t0103800-01.exon3;constitutive=1;ensembl_end_phase=1;ensembl_phase=0;exon_id=Os01t0103800-01.exon3;rank=3
|
||||
1 irgsp CDS 198830 200036 . + 0 ID=CDS:Os01t0103800-01;Parent=transcript:Os01t0103800-01;protein_id=Os01t0103800-01
|
||||
1 irgsp CDS 200253 200479 . + 2 ID=CDS:Os01t0103800-01;Parent=transcript:Os01t0103800-01;protein_id=Os01t0103800-01
|
||||
1 irgsp exon 200253 200803 . + . Parent=transcript:Os01t0103800-01;Name=Os01t0103800-01.exon4;constitutive=1;ensembl_end_phase=-1;ensembl_phase=1;exon_id=Os01t0103800-01.exon4;rank=4
|
||||
1 irgsp three_prime_UTR 200480 200803 . + . Parent=transcript:Os01t0103800-01
|
||||
###
|
||||
1 irgsp gene 201944 206202 . + . ID=gene:Os01g0103900;biotype=protein_coding;description=Polynucleotidyl transferase%2C Ribonuclease H fold domain containing protein. (Os01t0103900-01);gene_id=Os01g0103900;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 201944 206202 . + . ID=transcript:Os01t0103900-01;Parent=gene:Os01g0103900;biotype=protein_coding;transcript_id=Os01t0103900-01
|
||||
1 irgsp five_prime_UTR 201944 202041 . + . Parent=transcript:Os01t0103900-01
|
||||
1 irgsp exon 201944 202110 . + . Parent=transcript:Os01t0103900-01;Name=Os01t0103900-01.exon1;constitutive=1;ensembl_end_phase=0;ensembl_phase=-1;exon_id=Os01t0103900-01.exon1;rank=1
|
||||
1 irgsp CDS 202042 202110 . + 0 ID=CDS:Os01t0103900-01;Parent=transcript:Os01t0103900-01;protein_id=Os01t0103900-01
|
||||
1 irgsp exon 202252 202359 . + . Parent=transcript:Os01t0103900-01;Name=Os01t0103900-01.exon2;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0103900-01.exon2;rank=2
|
||||
1 irgsp CDS 202252 202359 . + 0 ID=CDS:Os01t0103900-01;Parent=transcript:Os01t0103900-01;protein_id=Os01t0103900-01
|
||||
1 irgsp exon 203007 203127 . + . Parent=transcript:Os01t0103900-01;Name=Os01t0103900-01.exon3;constitutive=1;ensembl_end_phase=1;ensembl_phase=0;exon_id=Os01t0103900-01.exon3;rank=3
|
||||
1 irgsp CDS 203007 203127 . + 0 ID=CDS:Os01t0103900-01;Parent=transcript:Os01t0103900-01;protein_id=Os01t0103900-01
|
||||
1 irgsp exon 203302 203429 . + . Parent=transcript:Os01t0103900-01;Name=Os01t0103900-01.exon4;constitutive=1;ensembl_end_phase=0;ensembl_phase=1;exon_id=Os01t0103900-01.exon4;rank=4
|
||||
1 irgsp CDS 203302 203429 . + 2 ID=CDS:Os01t0103900-01;Parent=transcript:Os01t0103900-01;protein_id=Os01t0103900-01
|
||||
1 irgsp exon 203511 203658 . + . Parent=transcript:Os01t0103900-01;Name=Os01t0103900-01.exon5;constitutive=1;ensembl_end_phase=1;ensembl_phase=0;exon_id=Os01t0103900-01.exon5;rank=5
|
||||
1 irgsp CDS 203511 203658 . + 0 ID=CDS:Os01t0103900-01;Parent=transcript:Os01t0103900-01;protein_id=Os01t0103900-01
|
||||
1 irgsp exon 203760 203938 . + . Parent=transcript:Os01t0103900-01;Name=Os01t0103900-01.exon6;constitutive=1;ensembl_end_phase=0;ensembl_phase=1;exon_id=Os01t0103900-01.exon6;rank=6
|
||||
1 irgsp CDS 203760 203938 . + 2 ID=CDS:Os01t0103900-01;Parent=transcript:Os01t0103900-01;protein_id=Os01t0103900-01
|
||||
1 irgsp exon 204203 204440 . + . Parent=transcript:Os01t0103900-01;Name=Os01t0103900-01.exon7;constitutive=1;ensembl_end_phase=1;ensembl_phase=0;exon_id=Os01t0103900-01.exon7;rank=7
|
||||
1 irgsp CDS 204203 204440 . + 0 ID=CDS:Os01t0103900-01;Parent=transcript:Os01t0103900-01;protein_id=Os01t0103900-01
|
||||
1 irgsp exon 204543 204635 . + . Parent=transcript:Os01t0103900-01;Name=Os01t0103900-01.exon8;constitutive=1;ensembl_end_phase=1;ensembl_phase=1;exon_id=Os01t0103900-01.exon8;rank=8
|
||||
1 irgsp CDS 204543 204635 . + 2 ID=CDS:Os01t0103900-01;Parent=transcript:Os01t0103900-01;protein_id=Os01t0103900-01
|
||||
1 irgsp exon 204730 204875 . + . Parent=transcript:Os01t0103900-01;Name=Os01t0103900-01.exon9;constitutive=1;ensembl_end_phase=0;ensembl_phase=1;exon_id=Os01t0103900-01.exon9;rank=9
|
||||
1 irgsp CDS 204730 204875 . + 2 ID=CDS:Os01t0103900-01;Parent=transcript:Os01t0103900-01;protein_id=Os01t0103900-01
|
||||
1 irgsp exon 205042 205149 . + . Parent=transcript:Os01t0103900-01;Name=Os01t0103900-01.exon10;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0103900-01.exon10;rank=10
|
||||
1 irgsp CDS 205042 205149 . + 0 ID=CDS:Os01t0103900-01;Parent=transcript:Os01t0103900-01;protein_id=Os01t0103900-01
|
||||
1 irgsp exon 205290 205378 . + . Parent=transcript:Os01t0103900-01;Name=Os01t0103900-01.exon11;constitutive=1;ensembl_end_phase=2;ensembl_phase=0;exon_id=Os01t0103900-01.exon11;rank=11
|
||||
1 irgsp CDS 205290 205378 . + 0 ID=CDS:Os01t0103900-01;Parent=transcript:Os01t0103900-01;protein_id=Os01t0103900-01
|
||||
1 irgsp CDS 205534 205543 . + 1 ID=CDS:Os01t0103900-01;Parent=transcript:Os01t0103900-01;protein_id=Os01t0103900-01
|
||||
1 irgsp exon 205534 206202 . + . Parent=transcript:Os01t0103900-01;Name=Os01t0103900-01.exon12;constitutive=1;ensembl_end_phase=-1;ensembl_phase=2;exon_id=Os01t0103900-01.exon12;rank=12
|
||||
1 irgsp three_prime_UTR 205544 206202 . + . Parent=transcript:Os01t0103900-01
|
||||
###
|
||||
1 irgsp gene 206131 209606 . - . ID=gene:Os01g0104000;biotype=protein_coding;description=C-type lectin domain containing protein. (Os01t0104000-01)%3BSimilar to predicted protein. (Os01t0104000-02);gene_id=Os01g0104000;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 206131 209581 . - . ID=transcript:Os01t0104000-02;Parent=gene:Os01g0104000;biotype=protein_coding;transcript_id=Os01t0104000-02
|
||||
1 irgsp three_prime_UTR 206131 206449 . - . Parent=transcript:Os01t0104000-02
|
||||
1 irgsp exon 206131 207029 . - . Parent=transcript:Os01t0104000-02;Name=Os01t0104000-02.exon4;constitutive=0;ensembl_end_phase=-1;ensembl_phase=2;exon_id=Os01t0104000-02.exon4;rank=4
|
||||
1 irgsp CDS 206450 207029 . - 1 ID=CDS:Os01t0104000-02;Parent=transcript:Os01t0104000-02;protein_id=Os01t0104000-02
|
||||
1 irgsp exon 207706 208273 . - . Parent=transcript:Os01t0104000-02;Name=Os01t0104000-02.exon3;constitutive=0;ensembl_end_phase=2;ensembl_phase=1;exon_id=Os01t0104000-02.exon3;rank=3
|
||||
1 irgsp CDS 207706 208273 . - 2 ID=CDS:Os01t0104000-02;Parent=transcript:Os01t0104000-02;protein_id=Os01t0104000-02
|
||||
1 irgsp exon 208408 208836 . - . Parent=transcript:Os01t0104000-02;Name=Os01t0104000-01.exon2;constitutive=1;ensembl_end_phase=1;ensembl_phase=1;exon_id=Os01t0104000-01.exon2;rank=2
|
||||
1 irgsp CDS 208408 208836 . - 2 ID=CDS:Os01t0104000-02;Parent=transcript:Os01t0104000-02;protein_id=Os01t0104000-02
|
||||
1 irgsp CDS 209438 209525 . - 0 ID=CDS:Os01t0104000-02;Parent=transcript:Os01t0104000-02;protein_id=Os01t0104000-02
|
||||
1 irgsp exon 209438 209581 . - . Parent=transcript:Os01t0104000-02;Name=Os01t0104000-02.exon1;constitutive=0;ensembl_end_phase=1;ensembl_phase=-1;exon_id=Os01t0104000-02.exon1;rank=1
|
||||
1 irgsp five_prime_UTR 209526 209581 . - . Parent=transcript:Os01t0104000-02
|
||||
1 irgsp mRNA 206134 209606 . - . ID=transcript:Os01t0104000-01;Parent=gene:Os01g0104000;biotype=protein_coding;transcript_id=Os01t0104000-01
|
||||
1 irgsp three_prime_UTR 206134 206449 . - . Parent=transcript:Os01t0104000-01
|
||||
1 irgsp exon 206134 207029 . - . Parent=transcript:Os01t0104000-01;Name=Os01t0104000-01.exon4;constitutive=0;ensembl_end_phase=-1;ensembl_phase=2;exon_id=Os01t0104000-01.exon4;rank=4
|
||||
1 irgsp CDS 206450 207029 . - 1 ID=CDS:Os01t0104000-01;Parent=transcript:Os01t0104000-01;protein_id=Os01t0104000-01
|
||||
1 irgsp exon 207706 208276 . - . Parent=transcript:Os01t0104000-01;Name=Os01t0104000-01.exon3;constitutive=0;ensembl_end_phase=2;ensembl_phase=1;exon_id=Os01t0104000-01.exon3;rank=3
|
||||
1 irgsp CDS 207706 208276 . - 2 ID=CDS:Os01t0104000-01;Parent=transcript:Os01t0104000-01;protein_id=Os01t0104000-01
|
||||
1 irgsp exon 208408 208836 . - . Parent=transcript:Os01t0104000-01;Name=Os01t0104000-01.exon2;constitutive=1;ensembl_end_phase=1;ensembl_phase=1;exon_id=Os01t0104000-01.exon2;rank=2
|
||||
1 irgsp CDS 208408 208836 . - 2 ID=CDS:Os01t0104000-01;Parent=transcript:Os01t0104000-01;protein_id=Os01t0104000-01
|
||||
1 irgsp CDS 209438 209525 . - 0 ID=CDS:Os01t0104000-01;Parent=transcript:Os01t0104000-01;protein_id=Os01t0104000-01
|
||||
1 irgsp exon 209438 209606 . - . Parent=transcript:Os01t0104000-01;Name=Os01t0104000-01.exon1;constitutive=0;ensembl_end_phase=1;ensembl_phase=-1;exon_id=Os01t0104000-01.exon1;rank=1
|
||||
1 irgsp five_prime_UTR 209526 209606 . - . Parent=transcript:Os01t0104000-01
|
||||
###
|
||||
1 irgsp gene 209771 214173 . + . ID=gene:Os01g0104100;Name=cold-inducible%2C cold-inducible zinc finger protein;biotype=protein_coding;description=Similar to protein binding / zinc ion binding. (Os01t0104100-01)%3BSimilar to protein binding / zinc ion binding. (Os01t0104100-02);gene_id=Os01g0104100;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 209771 214173 . + . ID=transcript:Os01t0104100-01;Parent=gene:Os01g0104100;biotype=protein_coding;transcript_id=Os01t0104100-01
|
||||
1 irgsp exon 209771 209896 . + . Parent=transcript:Os01t0104100-01;Name=Os01t0104100-01.exon1;constitutive=0;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0104100-01.exon1;rank=1
|
||||
1 irgsp CDS 209771 209896 . + 0 ID=CDS:Os01t0104100-01;Parent=transcript:Os01t0104100-01;protein_id=Os01t0104100-01
|
||||
1 irgsp exon 210244 210563 . + . Parent=transcript:Os01t0104100-01;Name=Os01t0104100-01.exon2;constitutive=1;ensembl_end_phase=2;ensembl_phase=0;exon_id=Os01t0104100-01.exon2;rank=2
|
||||
1 irgsp CDS 210244 210563 . + 0 ID=CDS:Os01t0104100-01;Parent=transcript:Os01t0104100-01;protein_id=Os01t0104100-01
|
||||
1 irgsp exon 210659 210890 . + . Parent=transcript:Os01t0104100-01;Name=Os01t0104100-01.exon3;constitutive=1;ensembl_end_phase=0;ensembl_phase=2;exon_id=Os01t0104100-01.exon3;rank=3
|
||||
1 irgsp CDS 210659 210890 . + 1 ID=CDS:Os01t0104100-01;Parent=transcript:Os01t0104100-01;protein_id=Os01t0104100-01
|
||||
1 irgsp exon 211015 211160 . + . Parent=transcript:Os01t0104100-01;Name=Os01t0104100-01.exon4;constitutive=1;ensembl_end_phase=2;ensembl_phase=0;exon_id=Os01t0104100-01.exon4;rank=4
|
||||
1 irgsp CDS 211015 211160 . + 0 ID=CDS:Os01t0104100-01;Parent=transcript:Os01t0104100-01;protein_id=Os01t0104100-01
|
||||
1 irgsp exon 212265 212352 . + . Parent=transcript:Os01t0104100-01;Name=Os01t0104100-01.exon5;constitutive=1;ensembl_end_phase=0;ensembl_phase=2;exon_id=Os01t0104100-01.exon5;rank=5
|
||||
1 irgsp CDS 212265 212352 . + 1 ID=CDS:Os01t0104100-01;Parent=transcript:Os01t0104100-01;protein_id=Os01t0104100-01
|
||||
1 irgsp exon 212433 212579 . + . Parent=transcript:Os01t0104100-01;Name=Os01t0104100-01.exon6;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0104100-01.exon6;rank=6
|
||||
1 irgsp CDS 212433 212579 . + 0 ID=CDS:Os01t0104100-01;Parent=transcript:Os01t0104100-01;protein_id=Os01t0104100-01
|
||||
1 irgsp exon 213490 213639 . + . Parent=transcript:Os01t0104100-01;Name=Os01t0104100-01.exon7;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0104100-01.exon7;rank=7
|
||||
1 irgsp CDS 213490 213639 . + 0 ID=CDS:Os01t0104100-01;Parent=transcript:Os01t0104100-01;protein_id=Os01t0104100-01
|
||||
1 irgsp CDS 213741 213788 . + 0 ID=CDS:Os01t0104100-01;Parent=transcript:Os01t0104100-01;protein_id=Os01t0104100-01
|
||||
1 irgsp exon 213741 214173 . + . Parent=transcript:Os01t0104100-01;Name=Os01t0104100-01.exon8;constitutive=0;ensembl_end_phase=-1;ensembl_phase=0;exon_id=Os01t0104100-01.exon8;rank=8
|
||||
1 irgsp three_prime_UTR 213789 214173 . + . Parent=transcript:Os01t0104100-01
|
||||
1 irgsp mRNA 209794 214147 . + . ID=transcript:Os01t0104100-02;Parent=gene:Os01g0104100;biotype=protein_coding;transcript_id=Os01t0104100-02
|
||||
1 irgsp five_prime_UTR 209794 209794 . + . Parent=transcript:Os01t0104100-02
|
||||
1 irgsp exon 209794 209896 . + . Parent=transcript:Os01t0104100-02;Name=Os01t0104100-02.exon1;constitutive=0;ensembl_end_phase=0;ensembl_phase=-1;exon_id=Os01t0104100-02.exon1;rank=1
|
||||
1 irgsp CDS 209795 209896 . + 0 ID=CDS:Os01t0104100-02;Parent=transcript:Os01t0104100-02;protein_id=Os01t0104100-02
|
||||
1 irgsp exon 210244 210563 . + . Parent=transcript:Os01t0104100-02;Name=Os01t0104100-01.exon2;constitutive=1;ensembl_end_phase=2;ensembl_phase=0;exon_id=Os01t0104100-01.exon2;rank=2
|
||||
1 irgsp CDS 210244 210563 . + 0 ID=CDS:Os01t0104100-02;Parent=transcript:Os01t0104100-02;protein_id=Os01t0104100-02
|
||||
1 irgsp exon 210659 210890 . + . Parent=transcript:Os01t0104100-02;Name=Os01t0104100-01.exon3;constitutive=1;ensembl_end_phase=0;ensembl_phase=2;exon_id=Os01t0104100-01.exon3;rank=3
|
||||
1 irgsp CDS 210659 210890 . + 1 ID=CDS:Os01t0104100-02;Parent=transcript:Os01t0104100-02;protein_id=Os01t0104100-02
|
||||
1 irgsp exon 211015 211160 . + . Parent=transcript:Os01t0104100-02;Name=Os01t0104100-01.exon4;constitutive=1;ensembl_end_phase=2;ensembl_phase=0;exon_id=Os01t0104100-01.exon4;rank=4
|
||||
1 irgsp CDS 211015 211160 . + 0 ID=CDS:Os01t0104100-02;Parent=transcript:Os01t0104100-02;protein_id=Os01t0104100-02
|
||||
1 irgsp exon 212265 212352 . + . Parent=transcript:Os01t0104100-02;Name=Os01t0104100-01.exon5;constitutive=1;ensembl_end_phase=0;ensembl_phase=2;exon_id=Os01t0104100-01.exon5;rank=5
|
||||
1 irgsp CDS 212265 212352 . + 1 ID=CDS:Os01t0104100-02;Parent=transcript:Os01t0104100-02;protein_id=Os01t0104100-02
|
||||
1 irgsp exon 212433 212579 . + . Parent=transcript:Os01t0104100-02;Name=Os01t0104100-01.exon6;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0104100-01.exon6;rank=6
|
||||
1 irgsp CDS 212433 212579 . + 0 ID=CDS:Os01t0104100-02;Parent=transcript:Os01t0104100-02;protein_id=Os01t0104100-02
|
||||
1 irgsp exon 213490 213639 . + . Parent=transcript:Os01t0104100-02;Name=Os01t0104100-01.exon7;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0104100-01.exon7;rank=7
|
||||
1 irgsp CDS 213490 213639 . + 0 ID=CDS:Os01t0104100-02;Parent=transcript:Os01t0104100-02;protein_id=Os01t0104100-02
|
||||
1 irgsp CDS 213741 213788 . + 0 ID=CDS:Os01t0104100-02;Parent=transcript:Os01t0104100-02;protein_id=Os01t0104100-02
|
||||
1 irgsp exon 213741 214147 . + . Parent=transcript:Os01t0104100-02;Name=Os01t0104100-02.exon8;constitutive=0;ensembl_end_phase=-1;ensembl_phase=0;exon_id=Os01t0104100-02.exon8;rank=8
|
||||
1 irgsp three_prime_UTR 213789 214147 . + . Parent=transcript:Os01t0104100-02
|
||||
###
|
||||
1 irgsp gene 216212 217345 . + . ID=gene:Os01g0104200;Name=NAC DOMAIN-CONTAINING PROTEIN 16;biotype=protein_coding;description=No apical meristem (NAM) protein domain containing protein. (Os01t0104200-00);gene_id=Os01g0104200;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 216212 217345 . + . ID=transcript:Os01t0104200-00;Parent=gene:Os01g0104200;biotype=protein_coding;transcript_id=Os01t0104200-00
|
||||
1 irgsp exon 216212 216769 . + . Parent=transcript:Os01t0104200-00;Name=Os01t0104200-00.exon1;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0104200-00.exon1;rank=1
|
||||
1 irgsp CDS 216212 216769 . + 0 ID=CDS:Os01t0104200-00;Parent=transcript:Os01t0104200-00;protein_id=Os01t0104200-00
|
||||
1 irgsp exon 216884 217345 . + . Parent=transcript:Os01t0104200-00;Name=Os01t0104200-00.exon2;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0104200-00.exon2;rank=2
|
||||
1 irgsp CDS 216884 217345 . + 0 ID=CDS:Os01t0104200-00;Parent=transcript:Os01t0104200-00;protein_id=Os01t0104200-00
|
||||
###
|
||||
1 irgsp gene 226897 229301 . + . ID=gene:Os01g0104400;biotype=protein_coding;description=Ricin B-related lectin domain containing protein. (Os01t0104400-01)%3BRicin B-related lectin domain containing protein. (Os01t0104400-02)%3BRicin B-related lectin domain containing protein. (Os01t0104400-03);gene_id=Os01g0104400;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 226897 229229 . + . ID=transcript:Os01t0104400-01;Parent=gene:Os01g0104400;biotype=protein_coding;transcript_id=Os01t0104400-01
|
||||
1 irgsp five_prime_UTR 226897 227181 . + . Parent=transcript:Os01t0104400-01
|
||||
1 irgsp exon 226897 227634 . + . Parent=transcript:Os01t0104400-01;Name=Os01t0104400-01.exon1;constitutive=0;ensembl_end_phase=0;ensembl_phase=-1;exon_id=Os01t0104400-01.exon1;rank=1
|
||||
1 irgsp CDS 227182 227634 . + 0 ID=CDS:Os01t0104400-01;Parent=transcript:Os01t0104400-01;protein_id=Os01t0104400-01
|
||||
1 irgsp exon 227742 227864 . + . Parent=transcript:Os01t0104400-01;Name=Os01t0104400-03.exon2;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0104400-03.exon2;rank=2
|
||||
1 irgsp CDS 227742 227864 . + 0 ID=CDS:Os01t0104400-01;Parent=transcript:Os01t0104400-01;protein_id=Os01t0104400-01
|
||||
1 irgsp exon 228557 228785 . + . Parent=transcript:Os01t0104400-01;Name=Os01t0104400-03.exon3;constitutive=1;ensembl_end_phase=1;ensembl_phase=0;exon_id=Os01t0104400-03.exon3;rank=3
|
||||
1 irgsp CDS 228557 228785 . + 0 ID=CDS:Os01t0104400-01;Parent=transcript:Os01t0104400-01;protein_id=Os01t0104400-01
|
||||
1 irgsp CDS 228930 228931 . + 2 ID=CDS:Os01t0104400-01;Parent=transcript:Os01t0104400-01;protein_id=Os01t0104400-01
|
||||
1 irgsp exon 228930 229229 . + . Parent=transcript:Os01t0104400-01;Name=Os01t0104400-01.exon4;constitutive=0;ensembl_end_phase=-1;ensembl_phase=1;exon_id=Os01t0104400-01.exon4;rank=4
|
||||
1 irgsp three_prime_UTR 228932 229229 . + . Parent=transcript:Os01t0104400-01
|
||||
1 irgsp mRNA 227139 229301 . + . ID=transcript:Os01t0104400-02;Parent=gene:Os01g0104400;biotype=protein_coding;transcript_id=Os01t0104400-02
|
||||
1 irgsp five_prime_UTR 227139 227181 . + . Parent=transcript:Os01t0104400-02
|
||||
1 irgsp exon 227139 227634 . + . Parent=transcript:Os01t0104400-02;Name=Os01t0104400-02.exon1;constitutive=0;ensembl_end_phase=0;ensembl_phase=-1;exon_id=Os01t0104400-02.exon1;rank=1
|
||||
1 irgsp CDS 227182 227634 . + 0 ID=CDS:Os01t0104400-02;Parent=transcript:Os01t0104400-02;protein_id=Os01t0104400-02
|
||||
1 irgsp exon 227742 227864 . + . Parent=transcript:Os01t0104400-02;Name=Os01t0104400-03.exon2;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0104400-03.exon2;rank=2
|
||||
1 irgsp CDS 227742 227864 . + 0 ID=CDS:Os01t0104400-02;Parent=transcript:Os01t0104400-02;protein_id=Os01t0104400-02
|
||||
1 irgsp exon 228557 228785 . + . Parent=transcript:Os01t0104400-02;Name=Os01t0104400-03.exon3;constitutive=1;ensembl_end_phase=1;ensembl_phase=0;exon_id=Os01t0104400-03.exon3;rank=3
|
||||
1 irgsp CDS 228557 228785 . + 0 ID=CDS:Os01t0104400-02;Parent=transcript:Os01t0104400-02;protein_id=Os01t0104400-02
|
||||
1 irgsp CDS 228930 228931 . + 2 ID=CDS:Os01t0104400-02;Parent=transcript:Os01t0104400-02;protein_id=Os01t0104400-02
|
||||
1 irgsp exon 228930 229301 . + . Parent=transcript:Os01t0104400-02;Name=Os01t0104400-02.exon4;constitutive=0;ensembl_end_phase=-1;ensembl_phase=1;exon_id=Os01t0104400-02.exon4;rank=4
|
||||
1 irgsp three_prime_UTR 228932 229301 . + . Parent=transcript:Os01t0104400-02
|
||||
1 irgsp mRNA 227179 229214 . + . ID=transcript:Os01t0104400-03;Parent=gene:Os01g0104400;biotype=protein_coding;transcript_id=Os01t0104400-03
|
||||
1 irgsp five_prime_UTR 227179 227181 . + . Parent=transcript:Os01t0104400-03
|
||||
1 irgsp exon 227179 227634 . + . Parent=transcript:Os01t0104400-03;Name=Os01t0104400-03.exon1;constitutive=0;ensembl_end_phase=0;ensembl_phase=-1;exon_id=Os01t0104400-03.exon1;rank=1
|
||||
1 irgsp CDS 227182 227634 . + 0 ID=CDS:Os01t0104400-03;Parent=transcript:Os01t0104400-03;protein_id=Os01t0104400-03
|
||||
1 irgsp exon 227742 227864 . + . Parent=transcript:Os01t0104400-03;Name=Os01t0104400-03.exon2;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0104400-03.exon2;rank=2
|
||||
1 irgsp CDS 227742 227864 . + 0 ID=CDS:Os01t0104400-03;Parent=transcript:Os01t0104400-03;protein_id=Os01t0104400-03
|
||||
1 irgsp exon 228557 228785 . + . Parent=transcript:Os01t0104400-03;Name=Os01t0104400-03.exon3;constitutive=1;ensembl_end_phase=1;ensembl_phase=0;exon_id=Os01t0104400-03.exon3;rank=3
|
||||
1 irgsp CDS 228557 228785 . + 0 ID=CDS:Os01t0104400-03;Parent=transcript:Os01t0104400-03;protein_id=Os01t0104400-03
|
||||
1 irgsp CDS 228930 228931 . + 2 ID=CDS:Os01t0104400-03;Parent=transcript:Os01t0104400-03;protein_id=Os01t0104400-03
|
||||
1 irgsp exon 228930 229214 . + . Parent=transcript:Os01t0104400-03;Name=Os01t0104400-03.exon4;constitutive=0;ensembl_end_phase=-1;ensembl_phase=1;exon_id=Os01t0104400-03.exon4;rank=4
|
||||
1 irgsp three_prime_UTR 228932 229214 . + . Parent=transcript:Os01t0104400-03
|
||||
###
|
||||
1 irgsp gene 241680 243440 . + . ID=gene:Os01g0104500;Name=NAC DOMAIN-CONTAINING PROTEIN 20;biotype=protein_coding;description=No apical meristem (NAM) protein domain containing protein. (Os01t0104500-01);gene_id=Os01g0104500;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 241680 243440 . + . ID=transcript:Os01t0104500-01;Parent=gene:Os01g0104500;biotype=protein_coding;transcript_id=Os01t0104500-01
|
||||
1 irgsp exon 241680 241702 . + . Parent=transcript:Os01t0104500-01;Name=Os01t0104500-01.exon1;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0104500-01.exon1;rank=1
|
||||
1 irgsp five_prime_UTR 241680 241702 . + . Parent=transcript:Os01t0104500-01
|
||||
1 irgsp five_prime_UTR 241866 241907 . + . Parent=transcript:Os01t0104500-01
|
||||
1 irgsp exon 241866 242091 . + . Parent=transcript:Os01t0104500-01;Name=Os01t0104500-01.exon2;constitutive=1;ensembl_end_phase=1;ensembl_phase=-1;exon_id=Os01t0104500-01.exon2;rank=2
|
||||
1 irgsp CDS 241908 242091 . + 0 ID=CDS:Os01t0104500-01;Parent=transcript:Os01t0104500-01;protein_id=Os01t0104500-01
|
||||
1 irgsp CDS 242199 242977 . + 2 ID=CDS:Os01t0104500-01;Parent=transcript:Os01t0104500-01;protein_id=Os01t0104500-01
|
||||
1 irgsp exon 242199 243440 . + . Parent=transcript:Os01t0104500-01;Name=Os01t0104500-01.exon3;constitutive=1;ensembl_end_phase=-1;ensembl_phase=1;exon_id=Os01t0104500-01.exon3;rank=3
|
||||
1 irgsp three_prime_UTR 242978 243440 . + . Parent=transcript:Os01t0104500-01
|
||||
###
|
||||
1 irgsp gene 248828 256872 . - . ID=gene:Os01g0104600;Name=DE-ETIOLATED1;biotype=protein_coding;description=Homolog of Arabidopsis DE-ETIOLATED1 (DET1)%2C Modulation of the ABA signaling pathway and ABA biosynthesis%2C Regulation of chlorophyll content (Os01t0104600-01)%3BSimilar to Light-mediated development protein DET1 (Deetiolated1 homolog) (tDET1) (High pigmentation protein 2) (Protein dark green). (Os01t0104600-02);gene_id=Os01g0104600;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 248828 256571 . - . ID=transcript:Os01t0104600-02;Parent=gene:Os01g0104600;biotype=protein_coding;transcript_id=Os01t0104600-02
|
||||
1 irgsp three_prime_UTR 248828 248970 . - . Parent=transcript:Os01t0104600-02
|
||||
1 irgsp exon 248828 249107 . - . Parent=transcript:Os01t0104600-02;Name=Os01t0104600-01.exon11;constitutive=1;ensembl_end_phase=-1;ensembl_phase=1;exon_id=Os01t0104600-01.exon11;rank=11
|
||||
1 irgsp CDS 248971 249107 . - 2 ID=CDS:Os01t0104600-02;Parent=transcript:Os01t0104600-02;protein_id=Os01t0104600-02
|
||||
1 irgsp exon 249369 249468 . - . Parent=transcript:Os01t0104600-02;Name=Os01t0104600-01.exon10;constitutive=1;ensembl_end_phase=1;ensembl_phase=0;exon_id=Os01t0104600-01.exon10;rank=10
|
||||
1 irgsp CDS 249369 249468 . - 0 ID=CDS:Os01t0104600-02;Parent=transcript:Os01t0104600-02;protein_id=Os01t0104600-02
|
||||
1 irgsp exon 249861 249956 . - . Parent=transcript:Os01t0104600-02;Name=Os01t0104600-01.exon9;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0104600-01.exon9;rank=9
|
||||
1 irgsp CDS 249861 249956 . - 0 ID=CDS:Os01t0104600-02;Parent=transcript:Os01t0104600-02;protein_id=Os01t0104600-02
|
||||
1 irgsp exon 250617 250781 . - . Parent=transcript:Os01t0104600-02;Name=Os01t0104600-01.exon8;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0104600-01.exon8;rank=8
|
||||
1 irgsp CDS 250617 250781 . - 0 ID=CDS:Os01t0104600-02;Parent=transcript:Os01t0104600-02;protein_id=Os01t0104600-02
|
||||
1 irgsp exon 250860 250940 . - . Parent=transcript:Os01t0104600-02;Name=Os01t0104600-01.exon7;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0104600-01.exon7;rank=7
|
||||
1 irgsp CDS 250860 250940 . - 0 ID=CDS:Os01t0104600-02;Parent=transcript:Os01t0104600-02;protein_id=Os01t0104600-02
|
||||
1 irgsp exon 251026 251082 . - . Parent=transcript:Os01t0104600-02;Name=Os01t0104600-01.exon6;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0104600-01.exon6;rank=6
|
||||
1 irgsp CDS 251026 251082 . - 0 ID=CDS:Os01t0104600-02;Parent=transcript:Os01t0104600-02;protein_id=Os01t0104600-02
|
||||
1 irgsp exon 251316 251384 . - . Parent=transcript:Os01t0104600-02;Name=Os01t0104600-01.exon5;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0104600-01.exon5;rank=5
|
||||
1 irgsp CDS 251316 251384 . - 0 ID=CDS:Os01t0104600-02;Parent=transcript:Os01t0104600-02;protein_id=Os01t0104600-02
|
||||
1 irgsp exon 251695 251790 . - . Parent=transcript:Os01t0104600-02;Name=Os01t0104600-01.exon4;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0104600-01.exon4;rank=4
|
||||
1 irgsp CDS 251695 251790 . - 0 ID=CDS:Os01t0104600-02;Parent=transcript:Os01t0104600-02;protein_id=Os01t0104600-02
|
||||
1 irgsp exon 255325 255553 . - . Parent=transcript:Os01t0104600-02;Name=Os01t0104600-01.exon3;constitutive=1;ensembl_end_phase=0;ensembl_phase=2;exon_id=Os01t0104600-01.exon3;rank=3
|
||||
1 irgsp CDS 255325 255553 . - 1 ID=CDS:Os01t0104600-02;Parent=transcript:Os01t0104600-02;protein_id=Os01t0104600-02
|
||||
1 irgsp exon 255674 256098 . - . Parent=transcript:Os01t0104600-02;Name=Os01t0104600-01.exon2;constitutive=1;ensembl_end_phase=2;ensembl_phase=0;exon_id=Os01t0104600-01.exon2;rank=2
|
||||
1 irgsp CDS 255674 256098 . - 0 ID=CDS:Os01t0104600-02;Parent=transcript:Os01t0104600-02;protein_id=Os01t0104600-02
|
||||
1 irgsp CDS 256361 256441 . - 0 ID=CDS:Os01t0104600-02;Parent=transcript:Os01t0104600-02;protein_id=Os01t0104600-02
|
||||
1 irgsp exon 256361 256571 . - . Parent=transcript:Os01t0104600-02;Name=Os01t0104600-02.exon1;constitutive=0;ensembl_end_phase=0;ensembl_phase=-1;exon_id=Os01t0104600-02.exon1;rank=1
|
||||
1 irgsp five_prime_UTR 256442 256571 . - . Parent=transcript:Os01t0104600-02
|
||||
1 irgsp mRNA 248828 256872 . - . ID=transcript:Os01t0104600-01;Parent=gene:Os01g0104600;biotype=protein_coding;transcript_id=Os01t0104600-01
|
||||
1 irgsp three_prime_UTR 248828 248970 . - . Parent=transcript:Os01t0104600-01
|
||||
1 irgsp exon 248828 249107 . - . Parent=transcript:Os01t0104600-01;Name=Os01t0104600-01.exon11;constitutive=1;ensembl_end_phase=-1;ensembl_phase=1;exon_id=Os01t0104600-01.exon11;rank=11
|
||||
1 irgsp CDS 248971 249107 . - 2 ID=CDS:Os01t0104600-01;Parent=transcript:Os01t0104600-01;protein_id=Os01t0104600-01
|
||||
1 irgsp exon 249369 249468 . - . Parent=transcript:Os01t0104600-01;Name=Os01t0104600-01.exon10;constitutive=1;ensembl_end_phase=1;ensembl_phase=0;exon_id=Os01t0104600-01.exon10;rank=10
|
||||
1 irgsp CDS 249369 249468 . - 0 ID=CDS:Os01t0104600-01;Parent=transcript:Os01t0104600-01;protein_id=Os01t0104600-01
|
||||
1 irgsp exon 249861 249956 . - . Parent=transcript:Os01t0104600-01;Name=Os01t0104600-01.exon9;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0104600-01.exon9;rank=9
|
||||
1 irgsp CDS 249861 249956 . - 0 ID=CDS:Os01t0104600-01;Parent=transcript:Os01t0104600-01;protein_id=Os01t0104600-01
|
||||
1 irgsp exon 250617 250781 . - . Parent=transcript:Os01t0104600-01;Name=Os01t0104600-01.exon8;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0104600-01.exon8;rank=8
|
||||
1 irgsp CDS 250617 250781 . - 0 ID=CDS:Os01t0104600-01;Parent=transcript:Os01t0104600-01;protein_id=Os01t0104600-01
|
||||
1 irgsp exon 250860 250940 . - . Parent=transcript:Os01t0104600-01;Name=Os01t0104600-01.exon7;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0104600-01.exon7;rank=7
|
||||
1 irgsp CDS 250860 250940 . - 0 ID=CDS:Os01t0104600-01;Parent=transcript:Os01t0104600-01;protein_id=Os01t0104600-01
|
||||
1 irgsp exon 251026 251082 . - . Parent=transcript:Os01t0104600-01;Name=Os01t0104600-01.exon6;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0104600-01.exon6;rank=6
|
||||
1 irgsp CDS 251026 251082 . - 0 ID=CDS:Os01t0104600-01;Parent=transcript:Os01t0104600-01;protein_id=Os01t0104600-01
|
||||
1 irgsp exon 251316 251384 . - . Parent=transcript:Os01t0104600-01;Name=Os01t0104600-01.exon5;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0104600-01.exon5;rank=5
|
||||
1 irgsp CDS 251316 251384 . - 0 ID=CDS:Os01t0104600-01;Parent=transcript:Os01t0104600-01;protein_id=Os01t0104600-01
|
||||
1 irgsp exon 251695 251790 . - . Parent=transcript:Os01t0104600-01;Name=Os01t0104600-01.exon4;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0104600-01.exon4;rank=4
|
||||
1 irgsp CDS 251695 251790 . - 0 ID=CDS:Os01t0104600-01;Parent=transcript:Os01t0104600-01;protein_id=Os01t0104600-01
|
||||
1 irgsp exon 255325 255553 . - . Parent=transcript:Os01t0104600-01;Name=Os01t0104600-01.exon3;constitutive=1;ensembl_end_phase=0;ensembl_phase=2;exon_id=Os01t0104600-01.exon3;rank=3
|
||||
1 irgsp CDS 255325 255553 . - 1 ID=CDS:Os01t0104600-01;Parent=transcript:Os01t0104600-01;protein_id=Os01t0104600-01
|
||||
1 irgsp exon 255674 256098 . - . Parent=transcript:Os01t0104600-01;Name=Os01t0104600-01.exon2;constitutive=1;ensembl_end_phase=2;ensembl_phase=0;exon_id=Os01t0104600-01.exon2;rank=2
|
||||
1 irgsp CDS 255674 256098 . - 0 ID=CDS:Os01t0104600-01;Parent=transcript:Os01t0104600-01;protein_id=Os01t0104600-01
|
||||
1 irgsp CDS 256361 256441 . - 0 ID=CDS:Os01t0104600-01;Parent=transcript:Os01t0104600-01;protein_id=Os01t0104600-01
|
||||
1 irgsp exon 256361 256872 . - . Parent=transcript:Os01t0104600-01;Name=Os01t0104600-01.exon1;constitutive=0;ensembl_end_phase=0;ensembl_phase=-1;exon_id=Os01t0104600-01.exon1;rank=1
|
||||
1 irgsp five_prime_UTR 256442 256872 . - . Parent=transcript:Os01t0104600-01
|
||||
###
|
||||
1 irgsp gene 261530 268145 . + . ID=gene:Os01g0104800;biotype=protein_coding;description=Sas10/Utp3 family protein. (Os01t0104800-01)%3BHypothetical conserved gene. (Os01t0104800-02);gene_id=Os01g0104800;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 261530 268145 . + . ID=transcript:Os01t0104800-01;Parent=gene:Os01g0104800;biotype=protein_coding;transcript_id=Os01t0104800-01
|
||||
1 irgsp five_prime_UTR 261530 261561 . + . Parent=transcript:Os01t0104800-01
|
||||
1 irgsp exon 261530 261661 . + . Parent=transcript:Os01t0104800-01;Name=Os01t0104800-01.exon1;constitutive=0;ensembl_end_phase=1;ensembl_phase=-1;exon_id=Os01t0104800-01.exon1;rank=1
|
||||
1 irgsp CDS 261562 261661 . + 0 ID=CDS:Os01t0104800-01;Parent=transcript:Os01t0104800-01;protein_id=Os01t0104800-01
|
||||
1 irgsp exon 261767 261805 . + . Parent=transcript:Os01t0104800-01;Name=Os01t0104800-01.exon2;constitutive=0;ensembl_end_phase=1;ensembl_phase=1;exon_id=Os01t0104800-01.exon2;rank=2
|
||||
1 irgsp CDS 261767 261805 . + 2 ID=CDS:Os01t0104800-01;Parent=transcript:Os01t0104800-01;protein_id=Os01t0104800-01
|
||||
1 irgsp exon 261895 261941 . + . Parent=transcript:Os01t0104800-01;Name=Os01t0104800-01.exon3;constitutive=0;ensembl_end_phase=0;ensembl_phase=1;exon_id=Os01t0104800-01.exon3;rank=3
|
||||
1 irgsp CDS 261895 261941 . + 2 ID=CDS:Os01t0104800-01;Parent=transcript:Os01t0104800-01;protein_id=Os01t0104800-01
|
||||
1 irgsp exon 262582 262681 . + . Parent=transcript:Os01t0104800-01;Name=Os01t0104800-01.exon4;constitutive=0;ensembl_end_phase=1;ensembl_phase=0;exon_id=Os01t0104800-01.exon4;rank=4
|
||||
1 irgsp CDS 262582 262681 . + 0 ID=CDS:Os01t0104800-01;Parent=transcript:Os01t0104800-01;protein_id=Os01t0104800-01
|
||||
1 irgsp exon 262925 263181 . + . Parent=transcript:Os01t0104800-01;Name=Os01t0104800-01.exon5;constitutive=0;ensembl_end_phase=0;ensembl_phase=1;exon_id=Os01t0104800-01.exon5;rank=5
|
||||
1 irgsp CDS 262925 263181 . + 2 ID=CDS:Os01t0104800-01;Parent=transcript:Os01t0104800-01;protein_id=Os01t0104800-01
|
||||
1 irgsp exon 263525 263640 . + . Parent=transcript:Os01t0104800-01;Name=Os01t0104800-01.exon6;constitutive=0;ensembl_end_phase=2;ensembl_phase=0;exon_id=Os01t0104800-01.exon6;rank=6
|
||||
1 irgsp CDS 263525 263640 . + 0 ID=CDS:Os01t0104800-01;Parent=transcript:Os01t0104800-01;protein_id=Os01t0104800-01
|
||||
1 irgsp exon 264014 264098 . + . Parent=transcript:Os01t0104800-01;Name=Os01t0104800-01.exon7;constitutive=1;ensembl_end_phase=0;ensembl_phase=2;exon_id=Os01t0104800-01.exon7;rank=7
|
||||
1 irgsp CDS 264014 264098 . + 1 ID=CDS:Os01t0104800-01;Parent=transcript:Os01t0104800-01;protein_id=Os01t0104800-01
|
||||
1 irgsp exon 265236 265415 . + . Parent=transcript:Os01t0104800-01;Name=Os01t0104800-01.exon8;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0104800-01.exon8;rank=8
|
||||
1 irgsp CDS 265236 265415 . + 0 ID=CDS:Os01t0104800-01;Parent=transcript:Os01t0104800-01;protein_id=Os01t0104800-01
|
||||
1 irgsp exon 265506 265649 . + . Parent=transcript:Os01t0104800-01;Name=Os01t0104800-01.exon9;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0104800-01.exon9;rank=9
|
||||
1 irgsp CDS 265506 265649 . + 0 ID=CDS:Os01t0104800-01;Parent=transcript:Os01t0104800-01;protein_id=Os01t0104800-01
|
||||
1 irgsp exon 265740 265817 . + . Parent=transcript:Os01t0104800-01;Name=Os01t0104800-01.exon10;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0104800-01.exon10;rank=10
|
||||
1 irgsp CDS 265740 265817 . + 0 ID=CDS:Os01t0104800-01;Parent=transcript:Os01t0104800-01;protein_id=Os01t0104800-01
|
||||
1 irgsp exon 265909 266045 . + . Parent=transcript:Os01t0104800-01;Name=Os01t0104800-01.exon11;constitutive=1;ensembl_end_phase=2;ensembl_phase=0;exon_id=Os01t0104800-01.exon11;rank=11
|
||||
1 irgsp CDS 265909 266045 . + 0 ID=CDS:Os01t0104800-01;Parent=transcript:Os01t0104800-01;protein_id=Os01t0104800-01
|
||||
1 irgsp exon 266138 266246 . + . Parent=transcript:Os01t0104800-01;Name=Os01t0104800-01.exon12;constitutive=1;ensembl_end_phase=0;ensembl_phase=2;exon_id=Os01t0104800-01.exon12;rank=12
|
||||
1 irgsp CDS 266138 266246 . + 1 ID=CDS:Os01t0104800-01;Parent=transcript:Os01t0104800-01;protein_id=Os01t0104800-01
|
||||
1 irgsp exon 267237 267514 . + . Parent=transcript:Os01t0104800-01;Name=Os01t0104800-01.exon13;constitutive=1;ensembl_end_phase=2;ensembl_phase=0;exon_id=Os01t0104800-01.exon13;rank=13
|
||||
1 irgsp CDS 267237 267514 . + 0 ID=CDS:Os01t0104800-01;Parent=transcript:Os01t0104800-01;protein_id=Os01t0104800-01
|
||||
1 irgsp exon 267591 267657 . + . Parent=transcript:Os01t0104800-01;Name=Os01t0104800-01.exon14;constitutive=1;ensembl_end_phase=0;ensembl_phase=2;exon_id=Os01t0104800-01.exon14;rank=14
|
||||
1 irgsp CDS 267591 267657 . + 1 ID=CDS:Os01t0104800-01;Parent=transcript:Os01t0104800-01;protein_id=Os01t0104800-01
|
||||
1 irgsp exon 267734 267802 . + . Parent=transcript:Os01t0104800-01;Name=Os01t0104800-01.exon15;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0104800-01.exon15;rank=15
|
||||
1 irgsp CDS 267734 267802 . + 0 ID=CDS:Os01t0104800-01;Parent=transcript:Os01t0104800-01;protein_id=Os01t0104800-01
|
||||
1 irgsp CDS 267880 268011 . + 0 ID=CDS:Os01t0104800-01;Parent=transcript:Os01t0104800-01;protein_id=Os01t0104800-01
|
||||
1 irgsp exon 267880 268145 . + . Parent=transcript:Os01t0104800-01;Name=Os01t0104800-01.exon16;constitutive=0;ensembl_end_phase=-1;ensembl_phase=0;exon_id=Os01t0104800-01.exon16;rank=16
|
||||
1 irgsp three_prime_UTR 268012 268145 . + . Parent=transcript:Os01t0104800-01
|
||||
1 irgsp mRNA 263523 268120 . + . ID=transcript:Os01t0104800-02;Parent=gene:Os01g0104800;biotype=protein_coding;transcript_id=Os01t0104800-02
|
||||
1 irgsp five_prime_UTR 263523 263524 . + . Parent=transcript:Os01t0104800-02
|
||||
1 irgsp exon 263523 263640 . + . Parent=transcript:Os01t0104800-02;Name=Os01t0104800-02.exon1;constitutive=0;ensembl_end_phase=2;ensembl_phase=-1;exon_id=Os01t0104800-02.exon1;rank=1
|
||||
1 irgsp CDS 263525 263640 . + 0 ID=CDS:Os01t0104800-02;Parent=transcript:Os01t0104800-02;protein_id=Os01t0104800-02
|
||||
1 irgsp exon 264014 264098 . + . Parent=transcript:Os01t0104800-02;Name=Os01t0104800-01.exon7;constitutive=1;ensembl_end_phase=0;ensembl_phase=2;exon_id=Os01t0104800-01.exon7;rank=2
|
||||
1 irgsp CDS 264014 264098 . + 1 ID=CDS:Os01t0104800-02;Parent=transcript:Os01t0104800-02;protein_id=Os01t0104800-02
|
||||
1 irgsp exon 265236 265415 . + . Parent=transcript:Os01t0104800-02;Name=Os01t0104800-01.exon8;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0104800-01.exon8;rank=3
|
||||
1 irgsp CDS 265236 265415 . + 0 ID=CDS:Os01t0104800-02;Parent=transcript:Os01t0104800-02;protein_id=Os01t0104800-02
|
||||
1 irgsp exon 265506 265649 . + . Parent=transcript:Os01t0104800-02;Name=Os01t0104800-01.exon9;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0104800-01.exon9;rank=4
|
||||
1 irgsp CDS 265506 265649 . + 0 ID=CDS:Os01t0104800-02;Parent=transcript:Os01t0104800-02;protein_id=Os01t0104800-02
|
||||
1 irgsp exon 265740 265817 . + . Parent=transcript:Os01t0104800-02;Name=Os01t0104800-01.exon10;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0104800-01.exon10;rank=5
|
||||
1 irgsp CDS 265740 265817 . + 0 ID=CDS:Os01t0104800-02;Parent=transcript:Os01t0104800-02;protein_id=Os01t0104800-02
|
||||
1 irgsp exon 265909 266045 . + . Parent=transcript:Os01t0104800-02;Name=Os01t0104800-01.exon11;constitutive=1;ensembl_end_phase=2;ensembl_phase=0;exon_id=Os01t0104800-01.exon11;rank=6
|
||||
1 irgsp CDS 265909 266045 . + 0 ID=CDS:Os01t0104800-02;Parent=transcript:Os01t0104800-02;protein_id=Os01t0104800-02
|
||||
1 irgsp exon 266138 266246 . + . Parent=transcript:Os01t0104800-02;Name=Os01t0104800-01.exon12;constitutive=1;ensembl_end_phase=0;ensembl_phase=2;exon_id=Os01t0104800-01.exon12;rank=7
|
||||
1 irgsp CDS 266138 266246 . + 1 ID=CDS:Os01t0104800-02;Parent=transcript:Os01t0104800-02;protein_id=Os01t0104800-02
|
||||
1 irgsp exon 267237 267514 . + . Parent=transcript:Os01t0104800-02;Name=Os01t0104800-01.exon13;constitutive=1;ensembl_end_phase=2;ensembl_phase=0;exon_id=Os01t0104800-01.exon13;rank=8
|
||||
1 irgsp CDS 267237 267514 . + 0 ID=CDS:Os01t0104800-02;Parent=transcript:Os01t0104800-02;protein_id=Os01t0104800-02
|
||||
1 irgsp exon 267591 267657 . + . Parent=transcript:Os01t0104800-02;Name=Os01t0104800-01.exon14;constitutive=1;ensembl_end_phase=0;ensembl_phase=2;exon_id=Os01t0104800-01.exon14;rank=9
|
||||
1 irgsp CDS 267591 267657 . + 1 ID=CDS:Os01t0104800-02;Parent=transcript:Os01t0104800-02;protein_id=Os01t0104800-02
|
||||
1 irgsp exon 267734 267802 . + . Parent=transcript:Os01t0104800-02;Name=Os01t0104800-01.exon15;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0104800-01.exon15;rank=10
|
||||
1 irgsp CDS 267734 267802 . + 0 ID=CDS:Os01t0104800-02;Parent=transcript:Os01t0104800-02;protein_id=Os01t0104800-02
|
||||
1 irgsp CDS 267880 268011 . + 0 ID=CDS:Os01t0104800-02;Parent=transcript:Os01t0104800-02;protein_id=Os01t0104800-02
|
||||
1 irgsp exon 267880 268120 . + . Parent=transcript:Os01t0104800-02;Name=Os01t0104800-02.exon11;constitutive=0;ensembl_end_phase=-1;ensembl_phase=0;exon_id=Os01t0104800-02.exon11;rank=11
|
||||
1 irgsp three_prime_UTR 268012 268120 . + . Parent=transcript:Os01t0104800-02
|
||||
###
|
||||
1 irgsp gene 270179 275084 . - . ID=gene:Os01g0104900;biotype=protein_coding;description=Transferase family protein. (Os01t0104900-01)%3BHypothetical conserved gene. (Os01t0104900-02);gene_id=Os01g0104900;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 270179 275084 . - . ID=transcript:Os01t0104900-01;Parent=gene:Os01g0104900;biotype=protein_coding;transcript_id=Os01t0104900-01
|
||||
1 irgsp three_prime_UTR 270179 270355 . - . Parent=transcript:Os01t0104900-01
|
||||
1 irgsp exon 270179 271333 . - . Parent=transcript:Os01t0104900-01;Name=Os01t0104900-01.exon2;constitutive=0;ensembl_end_phase=-1;ensembl_phase=0;exon_id=Os01t0104900-01.exon2;rank=2
|
||||
1 irgsp CDS 270356 271333 . - 0 ID=CDS:Os01t0104900-01;Parent=transcript:Os01t0104900-01;protein_id=Os01t0104900-01
|
||||
1 irgsp CDS 274529 274957 . - 0 ID=CDS:Os01t0104900-01;Parent=transcript:Os01t0104900-01;protein_id=Os01t0104900-01
|
||||
1 irgsp exon 274529 275084 . - . Parent=transcript:Os01t0104900-01;Name=Os01t0104900-01.exon1;constitutive=0;ensembl_end_phase=0;ensembl_phase=-1;exon_id=Os01t0104900-01.exon1;rank=1
|
||||
1 irgsp five_prime_UTR 274958 275084 . - . Parent=transcript:Os01t0104900-01
|
||||
1 irgsp mRNA 270250 271518 . - . ID=transcript:Os01t0104900-02;Parent=gene:Os01g0104900;biotype=protein_coding;transcript_id=Os01t0104900-02
|
||||
1 irgsp three_prime_UTR 270250 270355 . - . Parent=transcript:Os01t0104900-02
|
||||
1 irgsp exon 270250 271333 . - . Parent=transcript:Os01t0104900-02;Name=Os01t0104900-02.exon2;constitutive=0;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0104900-02.exon2;rank=2
|
||||
1 irgsp CDS 270356 271309 . - 0 ID=CDS:Os01t0104900-02;Parent=transcript:Os01t0104900-02;protein_id=Os01t0104900-02
|
||||
1 irgsp five_prime_UTR 271310 271333 . - . Parent=transcript:Os01t0104900-02
|
||||
1 irgsp exon 271457 271518 . - . Parent=transcript:Os01t0104900-02;Name=Os01t0104900-02.exon1;constitutive=0;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0104900-02.exon1;rank=1
|
||||
1 irgsp five_prime_UTR 271457 271518 . - . Parent=transcript:Os01t0104900-02
|
||||
###
|
||||
1 irgsp gene 284762 291892 . - . ID=gene:Os01g0105300;biotype=protein_coding;description=Similar to HAT family dimerisation domain containing protein%2C expressed. (Os01t0105300-01);gene_id=Os01g0105300;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 284762 291892 . - . ID=transcript:Os01t0105300-01;Parent=gene:Os01g0105300;biotype=protein_coding;transcript_id=Os01t0105300-01
|
||||
1 irgsp three_prime_UTR 284762 284930 . - . Parent=transcript:Os01t0105300-01
|
||||
1 irgsp exon 284762 287047 . - . Parent=transcript:Os01t0105300-01;Name=Os01t0105300-01.exon5;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0105300-01.exon5;rank=5
|
||||
1 irgsp CDS 284931 285020 . - 0 ID=CDS:Os01t0105300-01;Parent=transcript:Os01t0105300-01;protein_id=Os01t0105300-01
|
||||
1 irgsp five_prime_UTR 285021 287047 . - . Parent=transcript:Os01t0105300-01
|
||||
1 irgsp exon 291398 291436 . - . Parent=transcript:Os01t0105300-01;Name=Os01t0105300-01.exon4;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0105300-01.exon4;rank=4
|
||||
1 irgsp five_prime_UTR 291398 291436 . - . Parent=transcript:Os01t0105300-01
|
||||
1 irgsp exon 291520 291534 . - . Parent=transcript:Os01t0105300-01;Name=Os01t0105300-01.exon3;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0105300-01.exon3;rank=3
|
||||
1 irgsp five_prime_UTR 291520 291534 . - . Parent=transcript:Os01t0105300-01
|
||||
1 irgsp exon 291678 291738 . - . Parent=transcript:Os01t0105300-01;Name=Os01t0105300-01.exon2;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0105300-01.exon2;rank=2
|
||||
1 irgsp five_prime_UTR 291678 291738 . - . Parent=transcript:Os01t0105300-01
|
||||
1 irgsp exon 291838 291892 . - . Parent=transcript:Os01t0105300-01;Name=Os01t0105300-01.exon1;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0105300-01.exon1;rank=1
|
||||
1 irgsp five_prime_UTR 291838 291892 . - . Parent=transcript:Os01t0105300-01
|
||||
###
|
||||
1 irgsp gene 288372 292296 . + . ID=gene:Os01g0105400;biotype=protein_coding;description=Similar to Kinesin heavy chain. (Os01t0105400-01);gene_id=Os01g0105400;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 288372 292296 . + . ID=transcript:Os01t0105400-01;Parent=gene:Os01g0105400;biotype=protein_coding;transcript_id=Os01t0105400-01
|
||||
1 irgsp exon 288372 288846 . + . Parent=transcript:Os01t0105400-01;Name=Os01t0105400-01.exon1;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0105400-01.exon1;rank=1
|
||||
1 irgsp five_prime_UTR 288372 288846 . + . Parent=transcript:Os01t0105400-01
|
||||
1 irgsp exon 288950 289116 . + . Parent=transcript:Os01t0105400-01;Name=Os01t0105400-01.exon2;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0105400-01.exon2;rank=2
|
||||
1 irgsp five_prime_UTR 288950 289116 . + . Parent=transcript:Os01t0105400-01
|
||||
1 irgsp exon 289202 289572 . + . Parent=transcript:Os01t0105400-01;Name=Os01t0105400-01.exon3;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0105400-01.exon3;rank=3
|
||||
1 irgsp five_prime_UTR 289202 289572 . + . Parent=transcript:Os01t0105400-01
|
||||
1 irgsp exon 289661 289830 . + . Parent=transcript:Os01t0105400-01;Name=Os01t0105400-01.exon4;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0105400-01.exon4;rank=4
|
||||
1 irgsp five_prime_UTR 289661 289830 . + . Parent=transcript:Os01t0105400-01
|
||||
1 irgsp five_prime_UTR 290395 290432 . + . Parent=transcript:Os01t0105400-01
|
||||
1 irgsp exon 290395 290512 . + . Parent=transcript:Os01t0105400-01;Name=Os01t0105400-01.exon5;constitutive=1;ensembl_end_phase=2;ensembl_phase=-1;exon_id=Os01t0105400-01.exon5;rank=5
|
||||
1 irgsp CDS 290433 290512 . + 0 ID=CDS:Os01t0105400-01;Parent=transcript:Os01t0105400-01;protein_id=Os01t0105400-01
|
||||
1 irgsp CDS 291372 291558 . + 1 ID=CDS:Os01t0105400-01;Parent=transcript:Os01t0105400-01;protein_id=Os01t0105400-01
|
||||
1 irgsp exon 291372 291574 . + . Parent=transcript:Os01t0105400-01;Name=Os01t0105400-01.exon6;constitutive=1;ensembl_end_phase=-1;ensembl_phase=2;exon_id=Os01t0105400-01.exon6;rank=6
|
||||
1 irgsp three_prime_UTR 291559 291574 . + . Parent=transcript:Os01t0105400-01
|
||||
1 irgsp exon 291648 291779 . + . Parent=transcript:Os01t0105400-01;Name=Os01t0105400-01.exon7;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0105400-01.exon7;rank=7
|
||||
1 irgsp three_prime_UTR 291648 291779 . + . Parent=transcript:Os01t0105400-01
|
||||
1 irgsp exon 291859 291948 . + . Parent=transcript:Os01t0105400-01;Name=Os01t0105400-01.exon8;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0105400-01.exon8;rank=8
|
||||
1 irgsp three_prime_UTR 291859 291948 . + . Parent=transcript:Os01t0105400-01
|
||||
1 irgsp exon 292073 292296 . + . Parent=transcript:Os01t0105400-01;Name=Os01t0105400-01.exon9;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0105400-01.exon9;rank=9
|
||||
1 irgsp three_prime_UTR 292073 292296 . + . Parent=transcript:Os01t0105400-01
|
||||
###
|
||||
1 irgsp gene 303233 306736 . + . ID=gene:Os01g0105700;Name=basic helix-loop-helix protein 071;biotype=protein_coding;description=Basic helix-loop-helix dimerisation region bHLH domain containing protein. (Os01t0105700-01);gene_id=Os01g0105700;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 303233 306736 . + . ID=transcript:Os01t0105700-01;Parent=gene:Os01g0105700;biotype=protein_coding;transcript_id=Os01t0105700-01
|
||||
1 irgsp five_prime_UTR 303233 303328 . + . Parent=transcript:Os01t0105700-01
|
||||
1 irgsp exon 303233 303471 . + . Parent=transcript:Os01t0105700-01;Name=Os01t0105700-01.exon1;constitutive=1;ensembl_end_phase=2;ensembl_phase=-1;exon_id=Os01t0105700-01.exon1;rank=1
|
||||
1 irgsp CDS 303329 303471 . + 0 ID=CDS:Os01t0105700-01;Parent=transcript:Os01t0105700-01;protein_id=Os01t0105700-01
|
||||
1 irgsp exon 303981 304509 . + . Parent=transcript:Os01t0105700-01;Name=Os01t0105700-01.exon2;constitutive=1;ensembl_end_phase=0;ensembl_phase=2;exon_id=Os01t0105700-01.exon2;rank=2
|
||||
1 irgsp CDS 303981 304509 . + 1 ID=CDS:Os01t0105700-01;Parent=transcript:Os01t0105700-01;protein_id=Os01t0105700-01
|
||||
1 irgsp exon 305572 305718 . + . Parent=transcript:Os01t0105700-01;Name=Os01t0105700-01.exon3;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0105700-01.exon3;rank=3
|
||||
1 irgsp CDS 305572 305718 . + 0 ID=CDS:Os01t0105700-01;Parent=transcript:Os01t0105700-01;protein_id=Os01t0105700-01
|
||||
1 irgsp exon 305834 305899 . + . Parent=transcript:Os01t0105700-01;Name=Os01t0105700-01.exon4;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0105700-01.exon4;rank=4
|
||||
1 irgsp CDS 305834 305899 . + 0 ID=CDS:Os01t0105700-01;Parent=transcript:Os01t0105700-01;protein_id=Os01t0105700-01
|
||||
1 irgsp exon 305993 306058 . + . Parent=transcript:Os01t0105700-01;Name=Os01t0105700-01.exon5;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0105700-01.exon5;rank=5
|
||||
1 irgsp CDS 305993 306058 . + 0 ID=CDS:Os01t0105700-01;Parent=transcript:Os01t0105700-01;protein_id=Os01t0105700-01
|
||||
1 irgsp exon 306171 306245 . + . Parent=transcript:Os01t0105700-01;Name=Os01t0105700-01.exon6;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0105700-01.exon6;rank=6
|
||||
1 irgsp CDS 306171 306245 . + 0 ID=CDS:Os01t0105700-01;Parent=transcript:Os01t0105700-01;protein_id=Os01t0105700-01
|
||||
1 irgsp CDS 306353 306493 . + 0 ID=CDS:Os01t0105700-01;Parent=transcript:Os01t0105700-01;protein_id=Os01t0105700-01
|
||||
1 irgsp exon 306353 306736 . + . Parent=transcript:Os01t0105700-01;Name=Os01t0105700-01.exon7;constitutive=1;ensembl_end_phase=-1;ensembl_phase=0;exon_id=Os01t0105700-01.exon7;rank=7
|
||||
1 irgsp three_prime_UTR 306494 306736 . + . Parent=transcript:Os01t0105700-01
|
||||
###
|
||||
1 irgsp gene 306871 308842 . - . ID=gene:Os01g0105800;Name=IRON-SULFUR CLUSTER PROTEIN 9;biotype=protein_coding;description=Similar to Iron sulfur assembly protein 1. (Os01t0105800-01);gene_id=Os01g0105800;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 306871 308842 . - . ID=transcript:Os01t0105800-01;Parent=gene:Os01g0105800;biotype=protein_coding;transcript_id=Os01t0105800-01
|
||||
1 irgsp three_prime_UTR 306871 307123 . - . Parent=transcript:Os01t0105800-01
|
||||
1 irgsp exon 306871 307217 . - . Parent=transcript:Os01t0105800-01;Name=Os01t0105800-01.exon4;constitutive=1;ensembl_end_phase=-1;ensembl_phase=2;exon_id=Os01t0105800-01.exon4;rank=4
|
||||
1 irgsp CDS 307124 307217 . - 1 ID=CDS:Os01t0105800-01;Parent=transcript:Os01t0105800-01;protein_id=Os01t0105800-01
|
||||
1 irgsp exon 307296 307413 . - . Parent=transcript:Os01t0105800-01;Name=Os01t0105800-01.exon3;constitutive=1;ensembl_end_phase=2;ensembl_phase=1;exon_id=Os01t0105800-01.exon3;rank=3
|
||||
1 irgsp CDS 307296 307413 . - 2 ID=CDS:Os01t0105800-01;Parent=transcript:Os01t0105800-01;protein_id=Os01t0105800-01
|
||||
1 irgsp CDS 308397 308601 . - 0 ID=CDS:Os01t0105800-01;Parent=transcript:Os01t0105800-01;protein_id=Os01t0105800-01
|
||||
1 irgsp exon 308397 308626 . - . Parent=transcript:Os01t0105800-01;Name=Os01t0105800-01.exon2;constitutive=1;ensembl_end_phase=1;ensembl_phase=-1;exon_id=Os01t0105800-01.exon2;rank=2
|
||||
1 irgsp five_prime_UTR 308602 308626 . - . Parent=transcript:Os01t0105800-01
|
||||
1 irgsp exon 308703 308842 . - . Parent=transcript:Os01t0105800-01;Name=Os01t0105800-01.exon1;constitutive=1;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=Os01t0105800-01.exon1;rank=1
|
||||
1 irgsp five_prime_UTR 308703 308842 . - . Parent=transcript:Os01t0105800-01
|
||||
###
|
||||
1 irgsp gene 309520 313170 . - . ID=gene:Os01g0105900;biotype=protein_coding;description=Carbohydrate/purine kinase domain containing protein. (Os01t0105900-01);gene_id=Os01g0105900;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 309520 313170 . - . ID=transcript:Os01t0105900-01;Parent=gene:Os01g0105900;biotype=protein_coding;transcript_id=Os01t0105900-01
|
||||
1 irgsp three_prime_UTR 309520 309821 . - . Parent=transcript:Os01t0105900-01
|
||||
1 irgsp exon 309520 310070 . - . Parent=transcript:Os01t0105900-01;Name=Os01t0105900-01.exon8;constitutive=1;ensembl_end_phase=-1;ensembl_phase=0;exon_id=Os01t0105900-01.exon8;rank=8
|
||||
1 irgsp CDS 309822 310070 . - 0 ID=CDS:Os01t0105900-01;Parent=transcript:Os01t0105900-01;protein_id=Os01t0105900-01
|
||||
1 irgsp exon 310256 310367 . - . Parent=transcript:Os01t0105900-01;Name=Os01t0105900-01.exon7;constitutive=1;ensembl_end_phase=0;ensembl_phase=2;exon_id=Os01t0105900-01.exon7;rank=7
|
||||
1 irgsp CDS 310256 310367 . - 1 ID=CDS:Os01t0105900-01;Parent=transcript:Os01t0105900-01;protein_id=Os01t0105900-01
|
||||
1 irgsp exon 310455 310552 . - . Parent=transcript:Os01t0105900-01;Name=Os01t0105900-01.exon6;constitutive=1;ensembl_end_phase=2;ensembl_phase=0;exon_id=Os01t0105900-01.exon6;rank=6
|
||||
1 irgsp CDS 310455 310552 . - 0 ID=CDS:Os01t0105900-01;Parent=transcript:Os01t0105900-01;protein_id=Os01t0105900-01
|
||||
1 irgsp exon 310632 310739 . - . Parent=transcript:Os01t0105900-01;Name=Os01t0105900-01.exon5;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0105900-01.exon5;rank=5
|
||||
1 irgsp CDS 310632 310739 . - 0 ID=CDS:Os01t0105900-01;Parent=transcript:Os01t0105900-01;protein_id=Os01t0105900-01
|
||||
1 irgsp exon 310880 310918 . - . Parent=transcript:Os01t0105900-01;Name=Os01t0105900-01.exon4;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0105900-01.exon4;rank=4
|
||||
1 irgsp CDS 310880 310918 . - 0 ID=CDS:Os01t0105900-01;Parent=transcript:Os01t0105900-01;protein_id=Os01t0105900-01
|
||||
1 irgsp exon 311002 311073 . - . Parent=transcript:Os01t0105900-01;Name=Os01t0105900-01.exon3;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0105900-01.exon3;rank=3
|
||||
1 irgsp CDS 311002 311073 . - 0 ID=CDS:Os01t0105900-01;Parent=transcript:Os01t0105900-01;protein_id=Os01t0105900-01
|
||||
1 irgsp exon 311163 311426 . - . Parent=transcript:Os01t0105900-01;Name=Os01t0105900-01.exon2;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=Os01t0105900-01.exon2;rank=2
|
||||
1 irgsp CDS 311163 311426 . - 0 ID=CDS:Os01t0105900-01;Parent=transcript:Os01t0105900-01;protein_id=Os01t0105900-01
|
||||
1 irgsp CDS 312867 313064 . - 0 ID=CDS:Os01t0105900-01;Parent=transcript:Os01t0105900-01;protein_id=Os01t0105900-01
|
||||
1 irgsp exon 312867 313170 . - . Parent=transcript:Os01t0105900-01;Name=Os01t0105900-01.exon1;constitutive=1;ensembl_end_phase=0;ensembl_phase=-1;exon_id=Os01t0105900-01.exon1;rank=1
|
||||
1 irgsp five_prime_UTR 313065 313170 . - . Parent=transcript:Os01t0105900-01
|
||||
###
|
||||
1 irgsp gene 319754 322205 . + . ID=gene:Os01g0106200;biotype=protein_coding;description=Similar to RER1A protein (AtRER1A). (Os01t0106200-01);gene_id=Os01g0106200;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 319754 322205 . + . ID=transcript:Os01t0106200-01;Parent=gene:Os01g0106200;biotype=protein_coding;transcript_id=Os01t0106200-01
|
||||
1 irgsp five_prime_UTR 319754 319874 . + . Parent=transcript:Os01t0106200-01
|
||||
1 irgsp exon 319754 320236 . + . Parent=transcript:Os01t0106200-01;Name=Os01t0106200-01.exon1;constitutive=1;ensembl_end_phase=2;ensembl_phase=-1;exon_id=Os01t0106200-01.exon1;rank=1
|
||||
1 irgsp CDS 319875 320236 . + 0 ID=CDS:Os01t0106200-01;Parent=transcript:Os01t0106200-01;protein_id=Os01t0106200-01
|
||||
1 irgsp exon 321468 321648 . + . Parent=transcript:Os01t0106200-01;Name=Os01t0106200-01.exon2;constitutive=1;ensembl_end_phase=0;ensembl_phase=2;exon_id=Os01t0106200-01.exon2;rank=2
|
||||
1 irgsp CDS 321468 321648 . + 1 ID=CDS:Os01t0106200-01;Parent=transcript:Os01t0106200-01;protein_id=Os01t0106200-01
|
||||
1 irgsp CDS 321928 321975 . + 0 ID=CDS:Os01t0106200-01;Parent=transcript:Os01t0106200-01;protein_id=Os01t0106200-01
|
||||
1 irgsp exon 321928 322205 . + . Parent=transcript:Os01t0106200-01;Name=Os01t0106200-01.exon3;constitutive=1;ensembl_end_phase=-1;ensembl_phase=0;exon_id=Os01t0106200-01.exon3;rank=3
|
||||
1 irgsp three_prime_UTR 321976 322205 . + . Parent=transcript:Os01t0106200-01
|
||||
###
|
||||
1 irgsp gene 322591 323923 . - . ID=gene:Os01g0106300;biotype=protein_coding;description=Similar to Isoflavone reductase homolog IRL (EC 1.3.1.-). (Os01t0106300-01);gene_id=Os01g0106300;logic_name=irgspv1.0-20170804-genes
|
||||
1 irgsp mRNA 322591 323923 . - . ID=transcript:Os01t0106300-01;Parent=gene:Os01g0106300;biotype=protein_coding;transcript_id=Os01t0106300-01
|
||||
1 irgsp three_prime_UTR 322591 322809 . - . Parent=transcript:Os01t0106300-01
|
||||
1 irgsp exon 322591 322973 . - . Parent=transcript:Os01t0106300-01;Name=Os01t0106300-01.exon2;constitutive=1;ensembl_end_phase=-1;ensembl_phase=1;exon_id=Os01t0106300-01.exon2;rank=2
|
||||
@@ -0,0 +1,881 @@
|
||||
seq_id source_tag primary_tag start end score strand frame Alias biotype constitutive description ensembl_end_phase ensembl_phase exon_id gene_id ID logic_name Name Parent protein_id rank transcript_id
|
||||
1 irgsp repeat_region 2000 2100 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A fakeRepeat1 N/A N/A N/A N/A N/A N/A
|
||||
1 irgsp gene 2983 10815 . 1 . N/A protein_coding N/A RabGAP/TBC domain containing protein. (Os01t0100100-01) N/A N/A N/A Os01g0100100 gene:Os01g0100100 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 2983 10815 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0100100-01 N/A N/A gene:Os01g0100100 N/A N/A Os01t0100100-01
|
||||
1 irgsp exon 2983 3268 . 1 . N/A N/A 1 N/A -1 -1 Os01t0100100-01.exon1 N/A Os01t0100100-01.exon1 N/A Os01t0100100-01.exon1 transcript:Os01t0100100-01 N/A 1 N/A
|
||||
1 irgsp exon 3354 3616 . 1 . N/A N/A 1 N/A 0 -1 Os01t0100100-01.exon2 N/A Os01t0100100-01.exon2 N/A Os01t0100100-01.exon2 transcript:Os01t0100100-01 N/A 2 N/A
|
||||
1 irgsp exon 4357 4455 . 1 . N/A N/A 1 N/A 0 0 Os01t0100100-01.exon3 N/A Os01t0100100-01.exon3 N/A Os01t0100100-01.exon3 transcript:Os01t0100100-01 N/A 3 N/A
|
||||
1 irgsp exon 5457 5560 . 1 . N/A N/A 1 N/A 2 0 Os01t0100100-01.exon4 N/A Os01t0100100-01.exon4 N/A Os01t0100100-01.exon4 transcript:Os01t0100100-01 N/A 4 N/A
|
||||
1 irgsp exon 7136 7944 . 1 . N/A N/A 1 N/A 1 2 Os01t0100100-01.exon5 N/A Os01t0100100-01.exon5 N/A Os01t0100100-01.exon5 transcript:Os01t0100100-01 N/A 5 N/A
|
||||
1 irgsp exon 8028 8150 . 1 . N/A N/A 1 N/A 1 1 Os01t0100100-01.exon6 N/A Os01t0100100-01.exon6 N/A Os01t0100100-01.exon6 transcript:Os01t0100100-01 N/A 6 N/A
|
||||
1 irgsp exon 8232 8320 . 1 . N/A N/A 1 N/A 0 1 Os01t0100100-01.exon7 N/A Os01t0100100-01.exon7 N/A Os01t0100100-01.exon7 transcript:Os01t0100100-01 N/A 7 N/A
|
||||
1 irgsp exon 8408 8608 . 1 . N/A N/A 1 N/A 0 0 Os01t0100100-01.exon8 N/A Os01t0100100-01.exon8 N/A Os01t0100100-01.exon8 transcript:Os01t0100100-01 N/A 8 N/A
|
||||
1 irgsp exon 9210 9615 . 1 . N/A N/A 1 N/A 1 0 Os01t0100100-01.exon9 N/A Os01t0100100-01.exon9 N/A Os01t0100100-01.exon9 transcript:Os01t0100100-01 N/A 9 N/A
|
||||
1 irgsp exon 10102 10187 . 1 . N/A N/A 1 N/A 0 1 Os01t0100100-01.exon10 N/A Os01t0100100-01.exon10 N/A Os01t0100100-01.exon10 transcript:Os01t0100100-01 N/A 10 N/A
|
||||
1 irgsp exon 10274 10430 . 1 . N/A N/A 1 N/A -1 0 Os01t0100100-01.exon11 N/A Os01t0100100-01.exon11 N/A Os01t0100100-01.exon11 transcript:Os01t0100100-01 N/A 11 N/A
|
||||
1 irgsp exon 10504 10815 . 1 . N/A N/A 1 N/A -1 -1 Os01t0100100-01.exon12 N/A Os01t0100100-01.exon12 N/A Os01t0100100-01.exon12 transcript:Os01t0100100-01 N/A 12 N/A
|
||||
1 irgsp CDS 3449 3616 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100100-01 N/A N/A transcript:Os01t0100100-01 Os01t0100100-01 N/A N/A
|
||||
1 irgsp CDS 4357 4455 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100100-01 N/A N/A transcript:Os01t0100100-01 Os01t0100100-01 N/A N/A
|
||||
1 irgsp CDS 5457 5560 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100100-01 N/A N/A transcript:Os01t0100100-01 Os01t0100100-01 N/A N/A
|
||||
1 irgsp CDS 7136 7944 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100100-01 N/A N/A transcript:Os01t0100100-01 Os01t0100100-01 N/A N/A
|
||||
1 irgsp CDS 8028 8150 . 1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100100-01 N/A N/A transcript:Os01t0100100-01 Os01t0100100-01 N/A N/A
|
||||
1 irgsp CDS 8232 8320 . 1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100100-01 N/A N/A transcript:Os01t0100100-01 Os01t0100100-01 N/A N/A
|
||||
1 irgsp CDS 8408 8608 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100100-01 N/A N/A transcript:Os01t0100100-01 Os01t0100100-01 N/A N/A
|
||||
1 irgsp CDS 9210 9615 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100100-01 N/A N/A transcript:Os01t0100100-01 Os01t0100100-01 N/A N/A
|
||||
1 irgsp CDS 10102 10187 . 1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100100-01 N/A N/A transcript:Os01t0100100-01 Os01t0100100-01 N/A N/A
|
||||
1 irgsp CDS 10274 10297 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100100-01 N/A N/A transcript:Os01t0100100-01 Os01t0100100-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 2983 3268 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-1 N/A N/A transcript:Os01t0100100-01 N/A N/A N/A
|
||||
1 irgsp five_prime_UTR 3354 3448 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-2 N/A N/A transcript:Os01t0100100-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 10298 10430 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-1 N/A N/A transcript:Os01t0100100-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 10504 10815 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-2 N/A N/A transcript:Os01t0100100-01 N/A N/A N/A
|
||||
1 irgsp gene 11218 12435 . 1 . N/A protein_coding N/A Conserved hypothetical protein. (Os01t0100200-01) N/A N/A N/A Os01g0100200 gene:Os01g0100200 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 11218 12435 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0100200-01 N/A N/A gene:Os01g0100200 N/A N/A Os01t0100200-01
|
||||
1 irgsp exon 11218 12060 . 1 . N/A N/A 1 N/A 2 -1 Os01t0100200-01.exon1 N/A Os01t0100200-01.exon1 N/A Os01t0100200-01.exon1 transcript:Os01t0100200-01 N/A 1 N/A
|
||||
1 irgsp exon 12152 12435 . 1 . N/A N/A 1 N/A -1 2 Os01t0100200-01.exon2 N/A Os01t0100200-01.exon2 N/A Os01t0100200-01.exon2 transcript:Os01t0100200-01 N/A 2 N/A
|
||||
1 irgsp CDS 11798 12060 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100200-01 N/A N/A transcript:Os01t0100200-01 Os01t0100200-01 N/A N/A
|
||||
1 irgsp CDS 12152 12317 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100200-01 N/A N/A transcript:Os01t0100200-01 Os01t0100200-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 11218 11797 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-3 N/A N/A transcript:Os01t0100200-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 12318 12435 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-3 N/A N/A transcript:Os01t0100200-01 N/A N/A N/A
|
||||
1 irgsp gene 11372 12284 . -1 . N/A protein_coding N/A Cytochrome P450 domain containing protein. (Os01t0100300-00) N/A N/A N/A Os01g0100300 gene:Os01g0100300 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 11372 12284 . -1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0100300-00 N/A N/A gene:Os01g0100300 N/A N/A Os01t0100300-00
|
||||
1 irgsp exon 11372 12042 . -1 . N/A N/A 1 N/A 0 1 Os01t0100300-00.exon2 N/A Os01t0100300-00.exon2 N/A Os01t0100300-00.exon2 transcript:Os01t0100300-00 N/A 2 N/A
|
||||
1 irgsp exon 12146 12284 . -1 . N/A N/A 1 N/A 1 0 Os01t0100300-00.exon1 N/A Os01t0100300-00.exon1 N/A Os01t0100300-00.exon1 transcript:Os01t0100300-00 N/A 1 N/A
|
||||
1 irgsp CDS 11372 12042 . -1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100300-00 N/A N/A transcript:Os01t0100300-00 Os01t0100300-00 N/A N/A
|
||||
1 irgsp CDS 12146 12284 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100300-00 N/A N/A transcript:Os01t0100300-00 Os01t0100300-00 N/A N/A
|
||||
1 irgsp gene 12721 15685 . 1 . N/A protein_coding N/A Similar to Pectinesterase-like protein. (Os01t0100400-01) N/A N/A N/A Os01g0100400 gene:Os01g0100400 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 12721 15685 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0100400-01 N/A N/A gene:Os01g0100400 N/A N/A Os01t0100400-01
|
||||
1 irgsp exon 12721 13813 . 1 . N/A N/A 1 N/A 2 -1 Os01t0100400-01.exon1 N/A Os01t0100400-01.exon1 N/A Os01t0100400-01.exon1 transcript:Os01t0100400-01 N/A 1 N/A
|
||||
1 irgsp exon 13906 14271 . 1 . N/A N/A 1 N/A 2 2 Os01t0100400-01.exon2 N/A Os01t0100400-01.exon2 N/A Os01t0100400-01.exon2 transcript:Os01t0100400-01 N/A 2 N/A
|
||||
1 irgsp exon 14359 14437 . 1 . N/A N/A 1 N/A 0 2 Os01t0100400-01.exon3 N/A Os01t0100400-01.exon3 N/A Os01t0100400-01.exon3 transcript:Os01t0100400-01 N/A 3 N/A
|
||||
1 irgsp exon 14969 15171 . 1 . N/A N/A 1 N/A 2 0 Os01t0100400-01.exon4 N/A Os01t0100400-01.exon4 N/A Os01t0100400-01.exon4 transcript:Os01t0100400-01 N/A 4 N/A
|
||||
1 irgsp exon 15266 15685 . 1 . N/A N/A 1 N/A -1 2 Os01t0100400-01.exon5 N/A Os01t0100400-01.exon5 N/A Os01t0100400-01.exon5 transcript:Os01t0100400-01 N/A 5 N/A
|
||||
1 irgsp CDS 12774 13813 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100400-01 N/A N/A transcript:Os01t0100400-01 Os01t0100400-01 N/A N/A
|
||||
1 irgsp CDS 13906 14271 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100400-01 N/A N/A transcript:Os01t0100400-01 Os01t0100400-01 N/A N/A
|
||||
1 irgsp CDS 14359 14437 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100400-01 N/A N/A transcript:Os01t0100400-01 Os01t0100400-01 N/A N/A
|
||||
1 irgsp CDS 14969 15171 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100400-01 N/A N/A transcript:Os01t0100400-01 Os01t0100400-01 N/A N/A
|
||||
1 irgsp CDS 15266 15359 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100400-01 N/A N/A transcript:Os01t0100400-01 Os01t0100400-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 12721 12773 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-4 N/A N/A transcript:Os01t0100400-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 15360 15685 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-4 N/A N/A transcript:Os01t0100400-01 N/A N/A N/A
|
||||
1 irgsp gene 12808 13978 . -1 . N/A protein_coding N/A Hypothetical protein. (Os01t0100466-00) N/A N/A N/A Os01g0100466 gene:Os01g0100466 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 12808 13978 . -1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0100466-00 N/A N/A gene:Os01g0100466 N/A N/A Os01t0100466-00
|
||||
1 irgsp exon 12808 13782 . -1 . N/A N/A 1 N/A -1 -1 Os01t0100466-00.exon2 N/A Os01t0100466-00.exon2 N/A Os01t0100466-00.exon2 transcript:Os01t0100466-00 N/A 2 N/A
|
||||
1 irgsp exon 13880 13978 . -1 . N/A N/A 1 N/A -1 -1 Os01t0100466-00.exon1 N/A Os01t0100466-00.exon1 N/A Os01t0100466-00.exon1 transcript:Os01t0100466-00 N/A 1 N/A
|
||||
1 irgsp CDS 12869 13102 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100466-00 N/A N/A transcript:Os01t0100466-00 Os01t0100466-00 N/A N/A
|
||||
1 irgsp five_prime_UTR 13103 13782 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-5 N/A N/A transcript:Os01t0100466-00 N/A N/A N/A
|
||||
1 irgsp five_prime_UTR 13880 13978 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-6 N/A N/A transcript:Os01t0100466-00 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 12808 12868 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-5 N/A N/A transcript:Os01t0100466-00 N/A N/A N/A
|
||||
1 irgsp gene 16399 20144 . 1 . N/A protein_coding N/A Immunoglobulin-like domain containing protein. (Os01t0100500-01) N/A N/A N/A Os01g0100500 gene:Os01g0100500 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 16399 20144 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0100500-01 N/A N/A gene:Os01g0100500 N/A N/A Os01t0100500-01
|
||||
1 irgsp exon 16399 16976 . 1 . N/A N/A 1 N/A 0 -1 Os01t0100500-01.exon1 N/A Os01t0100500-01.exon1 N/A Os01t0100500-01.exon1 transcript:Os01t0100500-01 N/A 1 N/A
|
||||
1 irgsp exon 17383 17474 . 1 . N/A N/A 1 N/A 2 0 Os01t0100500-01.exon2 N/A Os01t0100500-01.exon2 N/A Os01t0100500-01.exon2 transcript:Os01t0100500-01 N/A 2 N/A
|
||||
1 irgsp exon 17558 18258 . 1 . N/A N/A 1 N/A 1 2 Os01t0100500-01.exon3 N/A Os01t0100500-01.exon3 N/A Os01t0100500-01.exon3 transcript:Os01t0100500-01 N/A 3 N/A
|
||||
1 irgsp exon 18501 18571 . 1 . N/A N/A 1 N/A 0 1 Os01t0100500-01.exon4 N/A Os01t0100500-01.exon4 N/A Os01t0100500-01.exon4 transcript:Os01t0100500-01 N/A 4 N/A
|
||||
1 irgsp exon 18968 19057 . 1 . N/A N/A 1 N/A 0 0 Os01t0100500-01.exon5 N/A Os01t0100500-01.exon5 N/A Os01t0100500-01.exon5 transcript:Os01t0100500-01 N/A 5 N/A
|
||||
1 irgsp exon 19142 19321 . 1 . N/A N/A 1 N/A 0 0 Os01t0100500-01.exon6 N/A Os01t0100500-01.exon6 N/A Os01t0100500-01.exon6 transcript:Os01t0100500-01 N/A 6 N/A
|
||||
1 irgsp exon 19531 19629 . 1 . N/A N/A 1 N/A -1 0 Os01t0100500-01.exon7 N/A Os01t0100500-01.exon7 N/A Os01t0100500-01.exon7 transcript:Os01t0100500-01 N/A 7 N/A
|
||||
1 irgsp exon 19734 20144 . 1 . N/A N/A 1 N/A -1 -1 Os01t0100500-01.exon8 N/A Os01t0100500-01.exon8 N/A Os01t0100500-01.exon8 transcript:Os01t0100500-01 N/A 8 N/A
|
||||
1 irgsp CDS 16599 16976 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100500-01 N/A N/A transcript:Os01t0100500-01 Os01t0100500-01 N/A N/A
|
||||
1 irgsp CDS 17383 17474 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100500-01 N/A N/A transcript:Os01t0100500-01 Os01t0100500-01 N/A N/A
|
||||
1 irgsp CDS 17558 18258 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100500-01 N/A N/A transcript:Os01t0100500-01 Os01t0100500-01 N/A N/A
|
||||
1 irgsp CDS 18501 18571 . 1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100500-01 N/A N/A transcript:Os01t0100500-01 Os01t0100500-01 N/A N/A
|
||||
1 irgsp CDS 18968 19057 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100500-01 N/A N/A transcript:Os01t0100500-01 Os01t0100500-01 N/A N/A
|
||||
1 irgsp CDS 19142 19321 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100500-01 N/A N/A transcript:Os01t0100500-01 Os01t0100500-01 N/A N/A
|
||||
1 irgsp CDS 19531 19593 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100500-01 N/A N/A transcript:Os01t0100500-01 Os01t0100500-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 16399 16598 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-7 N/A N/A transcript:Os01t0100500-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 19594 19629 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-6 N/A N/A transcript:Os01t0100500-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 19734 20144 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-7 N/A N/A transcript:Os01t0100500-01 N/A N/A N/A
|
||||
1 irgsp gene 22841 26892 . 1 . N/A protein_coding N/A Single-stranded nucleic acid binding R3H domain containing protein. (Os01t0100600-01) N/A N/A N/A Os01g0100600 gene:Os01g0100600 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 22841 26892 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0100600-01 N/A N/A gene:Os01g0100600 N/A N/A Os01t0100600-01
|
||||
1 irgsp exon 22841 23281 . 1 . N/A N/A 1 N/A 2 -1 Os01t0100600-01.exon1 N/A Os01t0100600-01.exon1 N/A Os01t0100600-01.exon1 transcript:Os01t0100600-01 N/A 1 N/A
|
||||
1 irgsp exon 23572 23847 . 1 . N/A N/A 1 N/A 2 2 Os01t0100600-01.exon2 N/A Os01t0100600-01.exon2 N/A Os01t0100600-01.exon2 transcript:Os01t0100600-01 N/A 2 N/A
|
||||
1 irgsp exon 23962 24033 . 1 . N/A N/A 1 N/A 2 2 Os01t0100600-01.exon3 N/A Os01t0100600-01.exon3 N/A Os01t0100600-01.exon3 transcript:Os01t0100600-01 N/A 3 N/A
|
||||
1 irgsp exon 24492 24577 . 1 . N/A N/A 1 N/A 1 2 Os01t0100600-01.exon4 N/A Os01t0100600-01.exon4 N/A Os01t0100600-01.exon4 transcript:Os01t0100600-01 N/A 4 N/A
|
||||
1 irgsp exon 25445 25519 . 1 . N/A N/A 1 N/A 1 1 Os01t0100600-01.exon5 N/A Os01t0100600-01.exon5 N/A Os01t0100600-01.exon5 transcript:Os01t0100600-01 N/A 5 N/A
|
||||
1 irgsp exon 25883 26892 . 1 . N/A N/A 1 N/A -1 1 Os01t0100600-01.exon6 N/A Os01t0100600-01.exon6 N/A Os01t0100600-01.exon6 transcript:Os01t0100600-01 N/A 6 N/A
|
||||
1 irgsp CDS 23232 23281 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100600-01 N/A N/A transcript:Os01t0100600-01 Os01t0100600-01 N/A N/A
|
||||
1 irgsp CDS 23572 23847 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100600-01 N/A N/A transcript:Os01t0100600-01 Os01t0100600-01 N/A N/A
|
||||
1 irgsp CDS 23962 24033 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100600-01 N/A N/A transcript:Os01t0100600-01 Os01t0100600-01 N/A N/A
|
||||
1 irgsp CDS 24492 24577 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100600-01 N/A N/A transcript:Os01t0100600-01 Os01t0100600-01 N/A N/A
|
||||
1 irgsp CDS 25445 25519 . 1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100600-01 N/A N/A transcript:Os01t0100600-01 Os01t0100600-01 N/A N/A
|
||||
1 irgsp CDS 25883 26391 . 1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100600-01 N/A N/A transcript:Os01t0100600-01 Os01t0100600-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 22841 23231 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-8 N/A N/A transcript:Os01t0100600-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 26392 26892 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-8 N/A N/A transcript:Os01t0100600-01 N/A N/A N/A
|
||||
1 irgsp gene 25861 26424 . -1 . N/A protein_coding N/A Hypothetical gene. (Os01t0100650-00) N/A N/A N/A Os01g0100650 gene:Os01g0100650 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 25861 26424 . -1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0100650-00 N/A N/A gene:Os01g0100650 N/A N/A Os01t0100650-00
|
||||
1 irgsp exon 25861 26424 . -1 . N/A N/A 1 N/A -1 -1 Os01t0100650-00.exon1 N/A Os01t0100650-00.exon1 N/A Os01t0100650-00.exon1 transcript:Os01t0100650-00 N/A 1 N/A
|
||||
1 irgsp CDS 26040 26423 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100650-00 N/A N/A transcript:Os01t0100650-00 Os01t0100650-00 N/A N/A
|
||||
1 irgsp five_prime_UTR 26424 26424 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-9 N/A N/A transcript:Os01t0100650-00 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 25861 26039 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-9 N/A N/A transcript:Os01t0100650-00 N/A N/A N/A
|
||||
1 irgsp gene 27143 28644 . 1 . N/A protein_coding N/A Similar to 40S ribosomal protein S5-1. (Os01t0100700-01) N/A N/A N/A Os01g0100700 gene:Os01g0100700 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 27143 28644 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0100700-01 N/A N/A gene:Os01g0100700 N/A N/A Os01t0100700-01
|
||||
1 irgsp exon 27143 27292 . 1 . N/A N/A 1 N/A 0 -1 Os01t0100700-01.exon1 N/A Os01t0100700-01.exon1 N/A Os01t0100700-01.exon1 transcript:Os01t0100700-01 N/A 1 N/A
|
||||
1 irgsp exon 27370 27641 . 1 . N/A N/A 1 N/A 2 0 Os01t0100700-01.exon2 N/A Os01t0100700-01.exon2 N/A Os01t0100700-01.exon2 transcript:Os01t0100700-01 N/A 2 N/A
|
||||
1 irgsp exon 28090 28293 . 1 . N/A N/A 1 N/A 2 2 Os01t0100700-01.exon3 N/A Os01t0100700-01.exon3 N/A Os01t0100700-01.exon3 transcript:Os01t0100700-01 N/A 3 N/A
|
||||
1 irgsp exon 28365 28644 . 1 . N/A N/A 1 N/A -1 2 Os01t0100700-01.exon4 N/A Os01t0100700-01.exon4 N/A Os01t0100700-01.exon4 transcript:Os01t0100700-01 N/A 4 N/A
|
||||
1 irgsp CDS 27221 27292 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100700-01 N/A N/A transcript:Os01t0100700-01 Os01t0100700-01 N/A N/A
|
||||
1 irgsp CDS 27370 27641 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100700-01 N/A N/A transcript:Os01t0100700-01 Os01t0100700-01 N/A N/A
|
||||
1 irgsp CDS 28090 28293 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100700-01 N/A N/A transcript:Os01t0100700-01 Os01t0100700-01 N/A N/A
|
||||
1 irgsp CDS 28365 28419 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100700-01 N/A N/A transcript:Os01t0100700-01 Os01t0100700-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 27143 27220 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-10 N/A N/A transcript:Os01t0100700-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 28420 28644 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-10 N/A N/A transcript:Os01t0100700-01 N/A N/A N/A
|
||||
1 irgsp gene 29818 34453 . 1 . N/A protein_coding N/A Protein of unknown function DUF1664 family protein. (Os01t0100800-01) N/A N/A N/A Os01g0100800 gene:Os01g0100800 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 29818 34453 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0100800-01 N/A N/A gene:Os01g0100800 N/A N/A Os01t0100800-01
|
||||
1 irgsp exon 29818 29976 . 1 . N/A N/A 1 N/A 1 -1 Os01t0100800-01.exon1 N/A Os01t0100800-01.exon1 N/A Os01t0100800-01.exon1 transcript:Os01t0100800-01 N/A 1 N/A
|
||||
1 irgsp exon 30146 30228 . 1 . N/A N/A 1 N/A 0 1 Os01t0100800-01.exon2 N/A Os01t0100800-01.exon2 N/A Os01t0100800-01.exon2 transcript:Os01t0100800-01 N/A 2 N/A
|
||||
1 irgsp exon 30735 30806 . 1 . N/A N/A 1 N/A 0 0 Os01t0100800-01.exon3 N/A Os01t0100800-01.exon3 N/A Os01t0100800-01.exon3 transcript:Os01t0100800-01 N/A 3 N/A
|
||||
1 irgsp exon 30885 30963 . 1 . N/A N/A 1 N/A 1 0 Os01t0100800-01.exon4 N/A Os01t0100800-01.exon4 N/A Os01t0100800-01.exon4 transcript:Os01t0100800-01 N/A 4 N/A
|
||||
1 irgsp exon 31258 31325 . 1 . N/A N/A 1 N/A 0 1 Os01t0100800-01.exon5 N/A Os01t0100800-01.exon5 N/A Os01t0100800-01.exon5 transcript:Os01t0100800-01 N/A 5 N/A
|
||||
1 irgsp exon 31505 31606 . 1 . N/A N/A 1 N/A 0 0 Os01t0100800-01.exon6 N/A Os01t0100800-01.exon6 N/A Os01t0100800-01.exon6 transcript:Os01t0100800-01 N/A 6 N/A
|
||||
1 irgsp exon 32377 32466 . 1 . N/A N/A 1 N/A 0 0 Os01t0100800-01.exon7 N/A Os01t0100800-01.exon7 N/A Os01t0100800-01.exon7 transcript:Os01t0100800-01 N/A 7 N/A
|
||||
1 irgsp exon 32542 32616 . 1 . N/A N/A 1 N/A 0 0 Os01t0100800-01.exon8 N/A Os01t0100800-01.exon8 N/A Os01t0100800-01.exon8 transcript:Os01t0100800-01 N/A 8 N/A
|
||||
1 irgsp exon 32712 32744 . 1 . N/A N/A 1 N/A 0 0 Os01t0100800-01.exon9 N/A Os01t0100800-01.exon9 N/A Os01t0100800-01.exon9 transcript:Os01t0100800-01 N/A 9 N/A
|
||||
1 irgsp exon 32828 32905 . 1 . N/A N/A 1 N/A 0 0 Os01t0100800-01.exon10 N/A Os01t0100800-01.exon10 N/A Os01t0100800-01.exon10 transcript:Os01t0100800-01 N/A 10 N/A
|
||||
1 irgsp exon 33274 33330 . 1 . N/A N/A 1 N/A 0 0 Os01t0100800-01.exon11 N/A Os01t0100800-01.exon11 N/A Os01t0100800-01.exon11 transcript:Os01t0100800-01 N/A 11 N/A
|
||||
1 irgsp exon 33400 33471 . 1 . N/A N/A 1 N/A 0 0 Os01t0100800-01.exon12 N/A Os01t0100800-01.exon12 N/A Os01t0100800-01.exon12 transcript:Os01t0100800-01 N/A 12 N/A
|
||||
1 irgsp exon 33543 33617 . 1 . N/A N/A 1 N/A 0 0 Os01t0100800-01.exon13 N/A Os01t0100800-01.exon13 N/A Os01t0100800-01.exon13 transcript:Os01t0100800-01 N/A 13 N/A
|
||||
1 irgsp exon 33975 34453 . 1 . N/A N/A 1 N/A -1 0 Os01t0100800-01.exon14 N/A Os01t0100800-01.exon14 N/A Os01t0100800-01.exon14 transcript:Os01t0100800-01 N/A 14 N/A
|
||||
1 irgsp CDS 29940 29976 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100800-01 N/A N/A transcript:Os01t0100800-01 Os01t0100800-01 N/A N/A
|
||||
1 irgsp CDS 30146 30228 . 1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100800-01 N/A N/A transcript:Os01t0100800-01 Os01t0100800-01 N/A N/A
|
||||
1 irgsp CDS 30735 30806 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100800-01 N/A N/A transcript:Os01t0100800-01 Os01t0100800-01 N/A N/A
|
||||
1 irgsp CDS 30885 30963 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100800-01 N/A N/A transcript:Os01t0100800-01 Os01t0100800-01 N/A N/A
|
||||
1 irgsp CDS 31258 31325 . 1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100800-01 N/A N/A transcript:Os01t0100800-01 Os01t0100800-01 N/A N/A
|
||||
1 irgsp CDS 31505 31606 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100800-01 N/A N/A transcript:Os01t0100800-01 Os01t0100800-01 N/A N/A
|
||||
1 irgsp CDS 32377 32466 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100800-01 N/A N/A transcript:Os01t0100800-01 Os01t0100800-01 N/A N/A
|
||||
1 irgsp CDS 32542 32616 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100800-01 N/A N/A transcript:Os01t0100800-01 Os01t0100800-01 N/A N/A
|
||||
1 irgsp CDS 32712 32744 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100800-01 N/A N/A transcript:Os01t0100800-01 Os01t0100800-01 N/A N/A
|
||||
1 irgsp CDS 32828 32905 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100800-01 N/A N/A transcript:Os01t0100800-01 Os01t0100800-01 N/A N/A
|
||||
1 irgsp CDS 33274 33330 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100800-01 N/A N/A transcript:Os01t0100800-01 Os01t0100800-01 N/A N/A
|
||||
1 irgsp CDS 33400 33471 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100800-01 N/A N/A transcript:Os01t0100800-01 Os01t0100800-01 N/A N/A
|
||||
1 irgsp CDS 33543 33617 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100800-01 N/A N/A transcript:Os01t0100800-01 Os01t0100800-01 N/A N/A
|
||||
1 irgsp CDS 33975 34124 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100800-01 N/A N/A transcript:Os01t0100800-01 Os01t0100800-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 29818 29939 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-11 N/A N/A transcript:Os01t0100800-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 34125 34453 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-11 N/A N/A transcript:Os01t0100800-01 N/A N/A N/A
|
||||
1 irgsp gene 35623 41136 . 1 . N/A protein_coding N/A Sphingosine-1-phosphate lyase, Disease resistance response (Os01t0100900-01) N/A N/A N/A Os01g0100900 gene:Os01g0100900 irgspv1.0-20170804-genes SPHINGOSINE-1-PHOSPHATE LYASE 1, Sphingosine-1-Phoshpate Lyase 1 N/A N/A N/A N/A
|
||||
1 irgsp mRNA 35623 41136 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0100900-01 N/A N/A gene:Os01g0100900 N/A N/A Os01t0100900-01
|
||||
1 irgsp exon 35623 35939 . 1 . N/A N/A 1 N/A 2 -1 Os01t0100900-01.exon1 N/A Os01t0100900-01.exon1 N/A Os01t0100900-01.exon1 transcript:Os01t0100900-01 N/A 1 N/A
|
||||
1 irgsp exon 36027 36072 . 1 . N/A N/A 1 N/A 0 2 Os01t0100900-01.exon2 N/A Os01t0100900-01.exon2 N/A Os01t0100900-01.exon2 transcript:Os01t0100900-01 N/A 2 N/A
|
||||
1 irgsp exon 36517 36668 . 1 . N/A N/A 1 N/A 2 0 Os01t0100900-01.exon3 N/A Os01t0100900-01.exon3 N/A Os01t0100900-01.exon3 transcript:Os01t0100900-01 N/A 3 N/A
|
||||
1 irgsp exon 36818 36877 . 1 . N/A N/A 1 N/A 2 2 Os01t0100900-01.exon4 N/A Os01t0100900-01.exon4 N/A Os01t0100900-01.exon4 transcript:Os01t0100900-01 N/A 4 N/A
|
||||
1 irgsp exon 37594 37818 . 1 . N/A N/A 1 N/A 2 2 Os01t0100900-01.exon5 N/A Os01t0100900-01.exon5 N/A Os01t0100900-01.exon5 transcript:Os01t0100900-01 N/A 5 N/A
|
||||
1 irgsp exon 37892 38033 . 1 . N/A N/A 1 N/A 0 2 Os01t0100900-01.exon6 N/A Os01t0100900-01.exon6 N/A Os01t0100900-01.exon6 transcript:Os01t0100900-01 N/A 6 N/A
|
||||
1 irgsp exon 38276 38326 . 1 . N/A N/A 1 N/A 0 0 Os01t0100900-01.exon7 N/A Os01t0100900-01.exon7 N/A Os01t0100900-01.exon7 transcript:Os01t0100900-01 N/A 7 N/A
|
||||
1 irgsp exon 38434 38525 . 1 . N/A N/A 1 N/A 2 0 Os01t0100900-01.exon8 N/A Os01t0100900-01.exon8 N/A Os01t0100900-01.exon8 transcript:Os01t0100900-01 N/A 8 N/A
|
||||
1 irgsp exon 39319 39445 . 1 . N/A N/A 1 N/A 0 2 Os01t0100900-01.exon9 N/A Os01t0100900-01.exon9 N/A Os01t0100900-01.exon9 transcript:Os01t0100900-01 N/A 9 N/A
|
||||
1 irgsp exon 39553 39568 . 1 . N/A N/A 1 N/A 1 0 Os01t0100900-01.exon10 N/A Os01t0100900-01.exon10 N/A Os01t0100900-01.exon10 transcript:Os01t0100900-01 N/A 10 N/A
|
||||
1 irgsp exon 39939 40046 . 1 . N/A N/A 1 N/A 1 1 Os01t0100900-01.exon11 N/A Os01t0100900-01.exon11 N/A Os01t0100900-01.exon11 transcript:Os01t0100900-01 N/A 11 N/A
|
||||
1 irgsp exon 40135 40189 . 1 . N/A N/A 1 N/A 2 1 Os01t0100900-01.exon12 N/A Os01t0100900-01.exon12 N/A Os01t0100900-01.exon12 transcript:Os01t0100900-01 N/A 12 N/A
|
||||
1 irgsp exon 40456 40602 . 1 . N/A N/A 1 N/A 2 2 Os01t0100900-01.exon13 N/A Os01t0100900-01.exon13 N/A Os01t0100900-01.exon13 transcript:Os01t0100900-01 N/A 13 N/A
|
||||
1 irgsp exon 40703 40781 . 1 . N/A N/A 1 N/A 0 2 Os01t0100900-01.exon14 N/A Os01t0100900-01.exon14 N/A Os01t0100900-01.exon14 transcript:Os01t0100900-01 N/A 14 N/A
|
||||
1 irgsp exon 40885 41136 . 1 . N/A N/A 1 N/A -1 0 Os01t0100900-01.exon15 N/A Os01t0100900-01.exon15 N/A Os01t0100900-01.exon15 transcript:Os01t0100900-01 N/A 15 N/A
|
||||
1 irgsp CDS 35743 35939 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100900-01 N/A N/A transcript:Os01t0100900-01 Os01t0100900-01 N/A N/A
|
||||
1 irgsp CDS 36027 36072 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100900-01 N/A N/A transcript:Os01t0100900-01 Os01t0100900-01 N/A N/A
|
||||
1 irgsp CDS 36517 36668 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100900-01 N/A N/A transcript:Os01t0100900-01 Os01t0100900-01 N/A N/A
|
||||
1 irgsp CDS 36818 36877 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100900-01 N/A N/A transcript:Os01t0100900-01 Os01t0100900-01 N/A N/A
|
||||
1 irgsp CDS 37594 37818 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100900-01 N/A N/A transcript:Os01t0100900-01 Os01t0100900-01 N/A N/A
|
||||
1 irgsp CDS 37892 38033 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100900-01 N/A N/A transcript:Os01t0100900-01 Os01t0100900-01 N/A N/A
|
||||
1 irgsp CDS 38276 38326 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100900-01 N/A N/A transcript:Os01t0100900-01 Os01t0100900-01 N/A N/A
|
||||
1 irgsp CDS 38434 38525 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100900-01 N/A N/A transcript:Os01t0100900-01 Os01t0100900-01 N/A N/A
|
||||
1 irgsp CDS 39319 39445 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100900-01 N/A N/A transcript:Os01t0100900-01 Os01t0100900-01 N/A N/A
|
||||
1 irgsp CDS 39553 39568 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100900-01 N/A N/A transcript:Os01t0100900-01 Os01t0100900-01 N/A N/A
|
||||
1 irgsp CDS 39939 40046 . 1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100900-01 N/A N/A transcript:Os01t0100900-01 Os01t0100900-01 N/A N/A
|
||||
1 irgsp CDS 40135 40189 . 1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100900-01 N/A N/A transcript:Os01t0100900-01 Os01t0100900-01 N/A N/A
|
||||
1 irgsp CDS 40456 40602 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100900-01 N/A N/A transcript:Os01t0100900-01 Os01t0100900-01 N/A N/A
|
||||
1 irgsp CDS 40703 40781 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100900-01 N/A N/A transcript:Os01t0100900-01 Os01t0100900-01 N/A N/A
|
||||
1 irgsp CDS 40885 41007 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0100900-01 N/A N/A transcript:Os01t0100900-01 Os01t0100900-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 35623 35742 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-12 N/A N/A transcript:Os01t0100900-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 41008 41136 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-12 N/A N/A transcript:Os01t0100900-01 N/A N/A N/A
|
||||
1 irgsp gene 58658 61090 . 1 . N/A protein_coding N/A Hypothetical conserved gene. (Os01t0101150-00) N/A N/A N/A Os01g0101150 gene:Os01g0101150 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 58658 61090 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0101150-00 N/A N/A gene:Os01g0101150 N/A N/A Os01t0101150-00
|
||||
1 irgsp exon 58658 61090 . 1 . N/A N/A 1 N/A 0 0 Os01t0101150-00.exon1 N/A Os01t0101150-00.exon1 N/A Os01t0101150-00.exon1 transcript:Os01t0101150-00 N/A 1 N/A
|
||||
1 irgsp CDS 58658 61090 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0101150-00 N/A N/A transcript:Os01t0101150-00 Os01t0101150-00 N/A N/A
|
||||
1 irgsp gene 62060 65537 . 1 . N/A protein_coding N/A 2,3-diketo-5-methylthio-1-phosphopentane phosphatase domain containing protein. (Os01t0101200-01);2,3-diketo-5-methylthio-1-phosphopentane phosphatase domain containing protein. (Os01t0101200-02) N/A N/A N/A Os01g0101200 gene:Os01g0101200 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 62060 63576 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0101200-01 N/A N/A gene:Os01g0101200 N/A N/A Os01t0101200-01
|
||||
1 irgsp exon 62060 62295 . 1 . N/A N/A 0 N/A 0 -1 Os01t0101200-01.exon1 N/A Os01t0101200-01.exon1 N/A Os01t0101200-01.exon1 transcript:Os01t0101200-01 N/A 1 N/A
|
||||
1 irgsp exon 62385 62905 . 1 . N/A N/A 1 N/A 2 0 Os01t0101200-02.exon2 N/A Os01t0101200-02.exon2 N/A Os01t0101200-02.exon2 transcript:Os01t0101200-01 N/A 2 N/A
|
||||
1 irgsp exon 62996 63114 . 1 . N/A N/A 1 N/A 1 2 Os01t0101200-02.exon3 N/A Os01t0101200-02.exon3 N/A Os01t0101200-02.exon3 transcript:Os01t0101200-01 N/A 3 N/A
|
||||
1 irgsp exon 63248 63576 . 1 . N/A N/A 0 N/A -1 1 Os01t0101200-01.exon4 N/A Os01t0101200-01.exon4 N/A Os01t0101200-01.exon4 transcript:Os01t0101200-01 N/A 4 N/A
|
||||
1 irgsp CDS 62104 62295 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0101200-01 N/A N/A transcript:Os01t0101200-01 Os01t0101200-01 N/A N/A
|
||||
1 irgsp CDS 62385 62905 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0101200-01 N/A N/A transcript:Os01t0101200-01 Os01t0101200-01 N/A N/A
|
||||
1 irgsp CDS 62996 63114 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0101200-01 N/A N/A transcript:Os01t0101200-01 Os01t0101200-01 N/A N/A
|
||||
1 irgsp CDS 63248 63345 . 1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0101200-01 N/A N/A transcript:Os01t0101200-01 Os01t0101200-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 62060 62103 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-13 N/A N/A transcript:Os01t0101200-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 63346 63576 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-13 N/A N/A transcript:Os01t0101200-01 N/A N/A N/A
|
||||
1 irgsp mRNA 62112 65537 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0101200-02 N/A N/A gene:Os01g0101200 N/A N/A Os01t0101200-02
|
||||
1 irgsp exon 62112 62295 . 1 . N/A N/A 0 N/A 0 -1 Os01t0101200-02.exon1 N/A Os01t0101200-02.exon1 N/A Os01t0101200-02.exon1 transcript:Os01t0101200-02 N/A 1 N/A
|
||||
1 irgsp exon 62385 62905 . 1 . N/A N/A 1 N/A 2 0 Os01t0101200-02.exon2 N/A agat-exon-1 N/A Os01t0101200-02.exon2 transcript:Os01t0101200-02 N/A 2 N/A
|
||||
1 irgsp exon 62996 63114 . 1 . N/A N/A 1 N/A 1 2 Os01t0101200-02.exon3 N/A agat-exon-2 N/A Os01t0101200-02.exon3 transcript:Os01t0101200-02 N/A 3 N/A
|
||||
1 irgsp exon 63248 65537 . 1 . N/A N/A 0 N/A -1 1 Os01t0101200-02.exon4 N/A Os01t0101200-02.exon4 N/A Os01t0101200-02.exon4 transcript:Os01t0101200-02 N/A 4 N/A
|
||||
1 irgsp CDS 62113 62295 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0101200-02 N/A N/A transcript:Os01t0101200-02 Os01t0101200-02 N/A N/A
|
||||
1 irgsp CDS 62385 62905 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0101200-02 N/A N/A transcript:Os01t0101200-02 Os01t0101200-02 N/A N/A
|
||||
1 irgsp CDS 62996 63114 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0101200-02 N/A N/A transcript:Os01t0101200-02 Os01t0101200-02 N/A N/A
|
||||
1 irgsp CDS 63248 63345 . 1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0101200-02 N/A N/A transcript:Os01t0101200-02 Os01t0101200-02 N/A N/A
|
||||
1 irgsp five_prime_UTR 62112 62112 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-14 N/A N/A transcript:Os01t0101200-02 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 63346 65537 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-14 N/A N/A transcript:Os01t0101200-02 N/A N/A N/A
|
||||
1 irgsp gene 63350 66302 . -1 . N/A protein_coding N/A Similar to MRNA, partial cds, clone: RAFL22-26-L17. (Fragment). (Os01t0101300-01) N/A N/A N/A Os01g0101300 gene:Os01g0101300 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 63350 66302 . -1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0101300-01 N/A N/A gene:Os01g0101300 N/A N/A Os01t0101300-01
|
||||
1 irgsp exon 63350 63783 . -1 . N/A N/A 1 N/A -1 0 Os01t0101300-01.exon7 N/A Os01t0101300-01.exon7 N/A Os01t0101300-01.exon7 transcript:Os01t0101300-01 N/A 7 N/A
|
||||
1 irgsp exon 63877 64020 . -1 . N/A N/A 1 N/A 0 0 Os01t0101300-01.exon6 N/A Os01t0101300-01.exon6 N/A Os01t0101300-01.exon6 transcript:Os01t0101300-01 N/A 6 N/A
|
||||
1 irgsp exon 64339 64431 . -1 . N/A N/A 1 N/A 0 0 Os01t0101300-01.exon5 N/A Os01t0101300-01.exon5 N/A Os01t0101300-01.exon5 transcript:Os01t0101300-01 N/A 5 N/A
|
||||
1 irgsp exon 64665 64779 . -1 . N/A N/A 1 N/A 0 2 Os01t0101300-01.exon4 N/A Os01t0101300-01.exon4 N/A Os01t0101300-01.exon4 transcript:Os01t0101300-01 N/A 4 N/A
|
||||
1 irgsp exon 64902 65152 . -1 . N/A N/A 1 N/A 2 0 Os01t0101300-01.exon3 N/A Os01t0101300-01.exon3 N/A Os01t0101300-01.exon3 transcript:Os01t0101300-01 N/A 3 N/A
|
||||
1 irgsp exon 65248 65431 . -1 . N/A N/A 1 N/A 0 2 Os01t0101300-01.exon2 N/A Os01t0101300-01.exon2 N/A Os01t0101300-01.exon2 transcript:Os01t0101300-01 N/A 2 N/A
|
||||
1 irgsp exon 65628 66302 . -1 . N/A N/A 1 N/A 2 -1 Os01t0101300-01.exon1 N/A Os01t0101300-01.exon1 N/A Os01t0101300-01.exon1 transcript:Os01t0101300-01 N/A 1 N/A
|
||||
1 irgsp CDS 63670 63783 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0101300-01 N/A N/A transcript:Os01t0101300-01 Os01t0101300-01 N/A N/A
|
||||
1 irgsp CDS 63877 64020 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0101300-01 N/A N/A transcript:Os01t0101300-01 Os01t0101300-01 N/A N/A
|
||||
1 irgsp CDS 64339 64431 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0101300-01 N/A N/A transcript:Os01t0101300-01 Os01t0101300-01 N/A N/A
|
||||
1 irgsp CDS 64665 64779 . -1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0101300-01 N/A N/A transcript:Os01t0101300-01 Os01t0101300-01 N/A N/A
|
||||
1 irgsp CDS 64902 65152 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0101300-01 N/A N/A transcript:Os01t0101300-01 Os01t0101300-01 N/A N/A
|
||||
1 irgsp CDS 65248 65431 . -1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0101300-01 N/A N/A transcript:Os01t0101300-01 Os01t0101300-01 N/A N/A
|
||||
1 irgsp CDS 65628 65950 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0101300-01 N/A N/A transcript:Os01t0101300-01 Os01t0101300-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 65951 66302 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-15 N/A N/A transcript:Os01t0101300-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 63350 63669 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-15 N/A N/A transcript:Os01t0101300-01 N/A N/A N/A
|
||||
1 irgsp gene 72816 78349 . 1 . N/A protein_coding N/A Immunoglobulin-like fold domain containing protein. (Os01t0101600-01);Immunoglobulin-like fold domain containing protein. (Os01t0101600-02);Hypothetical conserved gene. (Os01t0101600-03) N/A N/A N/A Os01g0101600 gene:Os01g0101600 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 72816 78349 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0101600-01 N/A N/A gene:Os01g0101600 N/A N/A Os01t0101600-01
|
||||
1 irgsp exon 72816 73935 . 1 . N/A N/A 0 N/A 1 -1 Os01t0101600-01.exon1 N/A Os01t0101600-01.exon1 N/A Os01t0101600-01.exon1 transcript:Os01t0101600-01 N/A 1 N/A
|
||||
1 irgsp exon 74468 74981 . 1 . N/A N/A 0 N/A 2 1 Os01t0101600-02.exon2 N/A Os01t0101600-02.exon2 N/A Os01t0101600-02.exon2 transcript:Os01t0101600-01 N/A 2 N/A
|
||||
1 irgsp exon 75619 77205 . 1 . N/A N/A 0 N/A -1 2 Os01t0101600-01.exon3 N/A Os01t0101600-01.exon3 N/A Os01t0101600-01.exon3 transcript:Os01t0101600-01 N/A 3 N/A
|
||||
1 irgsp exon 77333 78349 . 1 . N/A N/A 0 N/A -1 -1 Os01t0101600-01.exon4 N/A Os01t0101600-01.exon4 N/A Os01t0101600-01.exon4 transcript:Os01t0101600-01 N/A 4 N/A
|
||||
1 irgsp CDS 72903 73935 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0101600-01 N/A N/A transcript:Os01t0101600-01 Os01t0101600-01 N/A N/A
|
||||
1 irgsp CDS 74468 74981 . 1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0101600-01 N/A N/A transcript:Os01t0101600-01 Os01t0101600-01 N/A N/A
|
||||
1 irgsp CDS 75619 77008 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0101600-01 N/A N/A transcript:Os01t0101600-01 Os01t0101600-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 72816 72902 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-16 N/A N/A transcript:Os01t0101600-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 77009 77205 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-16 N/A N/A transcript:Os01t0101600-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 77333 78349 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-17 N/A N/A transcript:Os01t0101600-01 N/A N/A N/A
|
||||
1 irgsp mRNA 72823 77699 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0101600-02 N/A N/A gene:Os01g0101600 N/A N/A Os01t0101600-02
|
||||
1 irgsp exon 72823 73935 . 1 . N/A N/A 0 N/A 1 -1 Os01t0101600-02.exon1 N/A Os01t0101600-02.exon1 N/A Os01t0101600-02.exon1 transcript:Os01t0101600-02 N/A 1 N/A
|
||||
1 irgsp exon 74468 74981 . 1 . N/A N/A 0 N/A 2 1 Os01t0101600-02.exon2 N/A agat-exon-3 N/A Os01t0101600-02.exon2 transcript:Os01t0101600-02 N/A 2 N/A
|
||||
1 irgsp exon 75619 77699 . 1 . N/A N/A 0 N/A -1 2 Os01t0101600-02.exon3 N/A Os01t0101600-02.exon3 N/A Os01t0101600-02.exon3 transcript:Os01t0101600-02 N/A 3 N/A
|
||||
1 irgsp CDS 72903 73935 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0101600-02 N/A N/A transcript:Os01t0101600-02 Os01t0101600-02 N/A N/A
|
||||
1 irgsp CDS 74468 74981 . 1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0101600-02 N/A N/A transcript:Os01t0101600-02 Os01t0101600-02 N/A N/A
|
||||
1 irgsp CDS 75619 77008 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0101600-02 N/A N/A transcript:Os01t0101600-02 Os01t0101600-02 N/A N/A
|
||||
1 irgsp five_prime_UTR 72823 72902 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-17 N/A N/A transcript:Os01t0101600-02 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 77009 77699 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-18 N/A N/A transcript:Os01t0101600-02 N/A N/A N/A
|
||||
1 irgsp mRNA 75942 77699 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0101600-03 N/A N/A gene:Os01g0101600 N/A N/A Os01t0101600-03
|
||||
1 irgsp exon 75942 77699 . 1 . N/A N/A 0 N/A -1 -1 Os01t0101600-03.exon1 N/A Os01t0101600-03.exon1 N/A Os01t0101600-03.exon1 transcript:Os01t0101600-03 N/A 1 N/A
|
||||
1 irgsp CDS 75944 77008 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0101600-03 N/A N/A transcript:Os01t0101600-03 Os01t0101600-03 N/A N/A
|
||||
1 irgsp five_prime_UTR 75942 75943 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-18 N/A N/A transcript:Os01t0101600-03 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 77009 77699 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-19 N/A N/A transcript:Os01t0101600-03 N/A N/A N/A
|
||||
1 irgsp gene 82426 84095 . 1 . N/A protein_coding N/A Similar to chaperone protein dnaJ 20. (Os01t0101700-00) N/A N/A N/A Os01g0101700 gene:Os01g0101700 irgspv1.0-20170804-genes DnaJ domain protein C1, rice DJC26 homolog N/A N/A N/A N/A
|
||||
1 irgsp mRNA 82426 84095 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0101700-00 N/A N/A gene:Os01g0101700 N/A N/A Os01t0101700-00
|
||||
1 irgsp exon 82426 82932 . 1 . N/A N/A 1 N/A 0 -1 Os01t0101700-00.exon1 N/A Os01t0101700-00.exon1 N/A Os01t0101700-00.exon1 transcript:Os01t0101700-00 N/A 1 N/A
|
||||
1 irgsp exon 83724 84095 . 1 . N/A N/A 1 N/A -1 0 Os01t0101700-00.exon2 N/A Os01t0101700-00.exon2 N/A Os01t0101700-00.exon2 transcript:Os01t0101700-00 N/A 2 N/A
|
||||
1 irgsp CDS 82507 82932 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0101700-00 N/A N/A transcript:Os01t0101700-00 Os01t0101700-00 N/A N/A
|
||||
1 irgsp CDS 83724 83864 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0101700-00 N/A N/A transcript:Os01t0101700-00 Os01t0101700-00 N/A N/A
|
||||
1 irgsp five_prime_UTR 82426 82506 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-19 N/A N/A transcript:Os01t0101700-00 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 83865 84095 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-20 N/A N/A transcript:Os01t0101700-00 N/A N/A N/A
|
||||
1 irgsp gene 85337 88844 . 1 . N/A protein_coding N/A Conserved hypothetical protein. (Os01t0101800-01) N/A N/A N/A Os01g0101800 gene:Os01g0101800 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 85337 88844 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0101800-01 N/A N/A gene:Os01g0101800 N/A N/A Os01t0101800-01
|
||||
1 irgsp exon 85337 85600 . 1 . N/A N/A 1 N/A 0 -1 Os01t0101800-01.exon1 N/A Os01t0101800-01.exon1 N/A Os01t0101800-01.exon1 transcript:Os01t0101800-01 N/A 1 N/A
|
||||
1 irgsp exon 85737 85830 . 1 . N/A N/A 1 N/A 1 0 Os01t0101800-01.exon2 N/A Os01t0101800-01.exon2 N/A Os01t0101800-01.exon2 transcript:Os01t0101800-01 N/A 2 N/A
|
||||
1 irgsp exon 85935 86086 . 1 . N/A N/A 1 N/A 0 1 Os01t0101800-01.exon3 N/A Os01t0101800-01.exon3 N/A Os01t0101800-01.exon3 transcript:Os01t0101800-01 N/A 3 N/A
|
||||
1 irgsp exon 86212 86299 . 1 . N/A N/A 1 N/A 1 0 Os01t0101800-01.exon4 N/A Os01t0101800-01.exon4 N/A Os01t0101800-01.exon4 transcript:Os01t0101800-01 N/A 4 N/A
|
||||
1 irgsp exon 86399 87681 . 1 . N/A N/A 1 N/A 0 1 Os01t0101800-01.exon5 N/A Os01t0101800-01.exon5 N/A Os01t0101800-01.exon5 transcript:Os01t0101800-01 N/A 5 N/A
|
||||
1 irgsp exon 88291 88398 . 1 . N/A N/A 1 N/A 0 0 Os01t0101800-01.exon6 N/A Os01t0101800-01.exon6 N/A Os01t0101800-01.exon6 transcript:Os01t0101800-01 N/A 6 N/A
|
||||
1 irgsp exon 88500 88844 . 1 . N/A N/A 1 N/A -1 0 Os01t0101800-01.exon7 N/A Os01t0101800-01.exon7 N/A Os01t0101800-01.exon7 transcript:Os01t0101800-01 N/A 7 N/A
|
||||
1 irgsp CDS 85379 85600 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0101800-01 N/A N/A transcript:Os01t0101800-01 Os01t0101800-01 N/A N/A
|
||||
1 irgsp CDS 85737 85830 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0101800-01 N/A N/A transcript:Os01t0101800-01 Os01t0101800-01 N/A N/A
|
||||
1 irgsp CDS 85935 86086 . 1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0101800-01 N/A N/A transcript:Os01t0101800-01 Os01t0101800-01 N/A N/A
|
||||
1 irgsp CDS 86212 86299 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0101800-01 N/A N/A transcript:Os01t0101800-01 Os01t0101800-01 N/A N/A
|
||||
1 irgsp CDS 86399 87681 . 1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0101800-01 N/A N/A transcript:Os01t0101800-01 Os01t0101800-01 N/A N/A
|
||||
1 irgsp CDS 88291 88398 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0101800-01 N/A N/A transcript:Os01t0101800-01 Os01t0101800-01 N/A N/A
|
||||
1 irgsp CDS 88500 88583 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0101800-01 N/A N/A transcript:Os01t0101800-01 Os01t0101800-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 85337 85378 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-20 N/A N/A transcript:Os01t0101800-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 88584 88844 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-21 N/A N/A transcript:Os01t0101800-01 N/A N/A N/A
|
||||
1 irgsp gene 86211 88583 . -1 . N/A protein_coding N/A Hypothetical protein. (Os01t0101850-00) N/A N/A N/A Os01g0101850 gene:Os01g0101850 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 86211 88583 . -1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0101850-00 N/A N/A gene:Os01g0101850 N/A N/A Os01t0101850-00
|
||||
1 irgsp exon 86211 86277 . -1 . N/A N/A 1 N/A -1 -1 Os01t0101850-00.exon4 N/A Os01t0101850-00.exon4 N/A Os01t0101850-00.exon4 transcript:Os01t0101850-00 N/A 4 N/A
|
||||
1 irgsp exon 86384 87694 . -1 . N/A N/A 1 N/A -1 -1 Os01t0101850-00.exon3 N/A Os01t0101850-00.exon3 N/A Os01t0101850-00.exon3 transcript:Os01t0101850-00 N/A 3 N/A
|
||||
1 irgsp exon 88308 88396 . -1 . N/A N/A 1 N/A -1 -1 Os01t0101850-00.exon2 N/A Os01t0101850-00.exon2 N/A Os01t0101850-00.exon2 transcript:Os01t0101850-00 N/A 2 N/A
|
||||
1 irgsp exon 88496 88583 . -1 . N/A N/A 1 N/A -1 -1 Os01t0101850-00.exon1 N/A Os01t0101850-00.exon1 N/A Os01t0101850-00.exon1 transcript:Os01t0101850-00 N/A 1 N/A
|
||||
1 irgsp CDS 87327 87662 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0101850-00 N/A N/A transcript:Os01t0101850-00 Os01t0101850-00 N/A N/A
|
||||
1 irgsp five_prime_UTR 87663 87694 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-21 N/A N/A transcript:Os01t0101850-00 N/A N/A N/A
|
||||
1 irgsp five_prime_UTR 88308 88396 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-22 N/A N/A transcript:Os01t0101850-00 N/A N/A N/A
|
||||
1 irgsp five_prime_UTR 88496 88583 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-23 N/A N/A transcript:Os01t0101850-00 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 86211 86277 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-22 N/A N/A transcript:Os01t0101850-00 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 86384 87326 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-23 N/A N/A transcript:Os01t0101850-00 N/A N/A N/A
|
||||
1 irgsp gene 88883 89228 . -1 . N/A protein_coding N/A Similar to OSIGBa0075F02.3 protein. (Os01t0101900-00) N/A N/A N/A Os01g0101900 gene:Os01g0101900 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 88883 89228 . -1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0101900-00 N/A N/A gene:Os01g0101900 N/A N/A Os01t0101900-00
|
||||
1 irgsp exon 88883 89228 . -1 . N/A N/A 1 N/A -1 -1 Os01t0101900-00.exon1 N/A Os01t0101900-00.exon1 N/A Os01t0101900-00.exon1 transcript:Os01t0101900-00 N/A 1 N/A
|
||||
1 irgsp CDS 88986 89204 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0101900-00 N/A N/A transcript:Os01t0101900-00 Os01t0101900-00 N/A N/A
|
||||
1 irgsp five_prime_UTR 89205 89228 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-24 N/A N/A transcript:Os01t0101900-00 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 88883 88985 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-24 N/A N/A transcript:Os01t0101900-00 N/A N/A N/A
|
||||
1 irgsp gene 89763 91465 . -1 . N/A protein_coding N/A Phosphoesterase family protein. (Os01t0102000-01) N/A N/A N/A Os01g0102000 gene:Os01g0102000 irgspv1.0-20170804-genes NON-SPECIFIC PHOSPHOLIPASE C5 N/A N/A N/A N/A
|
||||
1 irgsp mRNA 89763 91465 . -1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0102000-01 N/A N/A gene:Os01g0102000 N/A N/A Os01t0102000-01
|
||||
1 irgsp exon 89763 91465 . -1 . N/A N/A 1 N/A -1 -1 Os01t0102000-01.exon1 N/A Os01t0102000-01.exon1 N/A Os01t0102000-01.exon1 transcript:Os01t0102000-01 N/A 1 N/A
|
||||
1 irgsp CDS 89825 91411 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0102000-01 N/A N/A transcript:Os01t0102000-01 Os01t0102000-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 91412 91465 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-25 N/A N/A transcript:Os01t0102000-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 89763 89824 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-25 N/A N/A transcript:Os01t0102000-01 N/A N/A N/A
|
||||
1 irgsp gene 134300 135439 . 1 . N/A protein_coding N/A Thylakoid lumen protein, Photosynthesis and chloroplast development (Os01t0102300-01) N/A N/A N/A Os01g0102300 gene:Os01g0102300 irgspv1.0-20170804-genes OsTLP27 N/A N/A N/A N/A
|
||||
1 irgsp mRNA 134300 135439 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0102300-01 N/A N/A gene:Os01g0102300 N/A N/A Os01t0102300-01
|
||||
1 irgsp exon 134300 134615 . 1 . N/A N/A 1 N/A 2 -1 Os01t0102300-01.exon1 N/A Os01t0102300-01.exon1 N/A Os01t0102300-01.exon1 transcript:Os01t0102300-01 N/A 1 N/A
|
||||
1 irgsp exon 134698 134824 . 1 . N/A N/A 1 N/A 0 2 Os01t0102300-01.exon2 N/A Os01t0102300-01.exon2 N/A Os01t0102300-01.exon2 transcript:Os01t0102300-01 N/A 2 N/A
|
||||
1 irgsp exon 134912 135439 . 1 . N/A N/A 1 N/A -1 0 Os01t0102300-01.exon3 N/A Os01t0102300-01.exon3 N/A Os01t0102300-01.exon3 transcript:Os01t0102300-01 N/A 3 N/A
|
||||
1 irgsp CDS 134311 134615 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0102300-01 N/A N/A transcript:Os01t0102300-01 Os01t0102300-01 N/A N/A
|
||||
1 irgsp CDS 134698 134824 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0102300-01 N/A N/A transcript:Os01t0102300-01 Os01t0102300-01 N/A N/A
|
||||
1 irgsp CDS 134912 135253 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0102300-01 N/A N/A transcript:Os01t0102300-01 Os01t0102300-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 134300 134310 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-26 N/A N/A transcript:Os01t0102300-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 135254 135439 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-26 N/A N/A transcript:Os01t0102300-01 N/A N/A N/A
|
||||
1 irgsp gene 139826 141555 . 1 . N/A protein_coding N/A Histone-fold domain containing protein. (Os01t0102400-01) N/A N/A N/A Os01g0102400 gene:Os01g0102400 irgspv1.0-20170804-genes HAP5H SUBUNIT OF CCAAT-BOX BINDING COMPLEX N/A N/A N/A N/A
|
||||
1 irgsp mRNA 139826 141555 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0102400-01 N/A N/A gene:Os01g0102400 N/A N/A Os01t0102400-01
|
||||
1 irgsp exon 139826 139906 . 1 . N/A N/A 1 N/A -1 -1 Os01t0102400-01.exon1 N/A Os01t0102400-01.exon1 N/A Os01t0102400-01.exon1 transcript:Os01t0102400-01 N/A 1 N/A
|
||||
1 irgsp exon 140120 141555 . 1 . N/A N/A 1 N/A -1 -1 Os01t0102400-01.exon2 N/A Os01t0102400-01.exon2 N/A Os01t0102400-01.exon2 transcript:Os01t0102400-01 N/A 2 N/A
|
||||
1 irgsp CDS 140150 141415 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0102400-01 N/A N/A transcript:Os01t0102400-01 Os01t0102400-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 139826 139906 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-27 N/A N/A transcript:Os01t0102400-01 N/A N/A N/A
|
||||
1 irgsp five_prime_UTR 140120 140149 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-28 N/A N/A transcript:Os01t0102400-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 141416 141555 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-27 N/A N/A transcript:Os01t0102400-01 N/A N/A N/A
|
||||
1 irgsp gene 141959 144554 . 1 . N/A protein_coding N/A Conserved hypothetical protein. (Os01t0102500-01) N/A N/A N/A Os01g0102500 gene:Os01g0102500 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 141959 144554 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0102500-01 N/A N/A gene:Os01g0102500 N/A N/A Os01t0102500-01
|
||||
1 irgsp exon 141959 142631 . 1 . N/A N/A 1 N/A 2 -1 Os01t0102500-01.exon1 N/A Os01t0102500-01.exon1 N/A Os01t0102500-01.exon1 transcript:Os01t0102500-01 N/A 1 N/A
|
||||
1 irgsp exon 143191 143431 . 1 . N/A N/A 1 N/A 0 2 Os01t0102500-01.exon2 N/A Os01t0102500-01.exon2 N/A Os01t0102500-01.exon2 transcript:Os01t0102500-01 N/A 2 N/A
|
||||
1 irgsp exon 143563 143680 . 1 . N/A N/A 1 N/A 1 0 Os01t0102500-01.exon3 N/A Os01t0102500-01.exon3 N/A Os01t0102500-01.exon3 transcript:Os01t0102500-01 N/A 3 N/A
|
||||
1 irgsp exon 143817 144554 . 1 . N/A N/A 1 N/A -1 1 Os01t0102500-01.exon4 N/A Os01t0102500-01.exon4 N/A Os01t0102500-01.exon4 transcript:Os01t0102500-01 N/A 4 N/A
|
||||
1 irgsp CDS 142084 142631 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0102500-01 N/A N/A transcript:Os01t0102500-01 Os01t0102500-01 N/A N/A
|
||||
1 irgsp CDS 143191 143431 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0102500-01 N/A N/A transcript:Os01t0102500-01 Os01t0102500-01 N/A N/A
|
||||
1 irgsp CDS 143563 143680 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0102500-01 N/A N/A transcript:Os01t0102500-01 Os01t0102500-01 N/A N/A
|
||||
1 irgsp CDS 143817 143908 . 1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0102500-01 N/A N/A transcript:Os01t0102500-01 Os01t0102500-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 141959 142083 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-29 N/A N/A transcript:Os01t0102500-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 143909 144554 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-28 N/A N/A transcript:Os01t0102500-01 N/A N/A N/A
|
||||
1 irgsp gene 145603 147847 . 1 . N/A protein_coding N/A Shikimate kinase domain containing protein. (Os01t0102600-01);Similar to shikimate kinase family protein. (Os01t0102600-02) N/A N/A N/A Os01g0102600 gene:Os01g0102600 irgspv1.0-20170804-genes Shikimate kinase 4 N/A N/A N/A N/A
|
||||
1 irgsp mRNA 145603 147847 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0102600-01 N/A N/A gene:Os01g0102600 N/A N/A Os01t0102600-01
|
||||
1 irgsp exon 145603 145786 . 1 . N/A N/A 0 N/A 1 -1 Os01t0102600-01.exon1 N/A Os01t0102600-01.exon1 N/A Os01t0102600-01.exon1 transcript:Os01t0102600-01 N/A 1 N/A
|
||||
1 irgsp exon 145905 145951 . 1 . N/A N/A 0 N/A 0 1 Os01t0102600-01.exon2 N/A Os01t0102600-01.exon2 N/A Os01t0102600-01.exon2 transcript:Os01t0102600-01 N/A 2 N/A
|
||||
1 irgsp exon 146028 146082 . 1 . N/A N/A 0 N/A 1 0 Os01t0102600-01.exon3 N/A Os01t0102600-01.exon3 N/A Os01t0102600-01.exon3 transcript:Os01t0102600-01 N/A 3 N/A
|
||||
1 irgsp exon 146179 146339 . 1 . N/A N/A 0 N/A 0 1 Os01t0102600-01.exon4 N/A Os01t0102600-01.exon4 N/A Os01t0102600-01.exon4 transcript:Os01t0102600-01 N/A 4 N/A
|
||||
1 irgsp exon 146450 146532 . 1 . N/A N/A 0 N/A 2 0 Os01t0102600-01.exon5 N/A Os01t0102600-01.exon5 N/A Os01t0102600-01.exon5 transcript:Os01t0102600-01 N/A 5 N/A
|
||||
1 irgsp exon 146611 146719 . 1 . N/A N/A 0 N/A 0 2 Os01t0102600-01.exon6 N/A Os01t0102600-01.exon6 N/A Os01t0102600-01.exon6 transcript:Os01t0102600-01 N/A 6 N/A
|
||||
1 irgsp exon 147106 147184 . 1 . N/A N/A 0 N/A 1 0 Os01t0102600-01.exon7 N/A Os01t0102600-01.exon7 N/A Os01t0102600-01.exon7 transcript:Os01t0102600-01 N/A 7 N/A
|
||||
1 irgsp exon 147311 147375 . 1 . N/A N/A 1 N/A 0 1 Os01t0102600-02.exon2 N/A Os01t0102600-02.exon2 N/A Os01t0102600-02.exon2 transcript:Os01t0102600-01 N/A 8 N/A
|
||||
1 irgsp exon 147507 147847 . 1 . N/A N/A 0 N/A -1 0 Os01t0102600-01.exon9 N/A Os01t0102600-01.exon9 N/A Os01t0102600-01.exon9 transcript:Os01t0102600-01 N/A 9 N/A
|
||||
1 irgsp CDS 145645 145786 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0102600-01 N/A N/A transcript:Os01t0102600-01 Os01t0102600-01 N/A N/A
|
||||
1 irgsp CDS 145905 145951 . 1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0102600-01 N/A N/A transcript:Os01t0102600-01 Os01t0102600-01 N/A N/A
|
||||
1 irgsp CDS 146028 146082 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0102600-01 N/A N/A transcript:Os01t0102600-01 Os01t0102600-01 N/A N/A
|
||||
1 irgsp CDS 146179 146339 . 1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0102600-01 N/A N/A transcript:Os01t0102600-01 Os01t0102600-01 N/A N/A
|
||||
1 irgsp CDS 146450 146532 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0102600-01 N/A N/A transcript:Os01t0102600-01 Os01t0102600-01 N/A N/A
|
||||
1 irgsp CDS 146611 146719 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0102600-01 N/A N/A transcript:Os01t0102600-01 Os01t0102600-01 N/A N/A
|
||||
1 irgsp CDS 147106 147184 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0102600-01 N/A N/A transcript:Os01t0102600-01 Os01t0102600-01 N/A N/A
|
||||
1 irgsp CDS 147311 147375 . 1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0102600-01 N/A N/A transcript:Os01t0102600-01 Os01t0102600-01 N/A N/A
|
||||
1 irgsp CDS 147507 147575 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0102600-01 N/A N/A transcript:Os01t0102600-01 Os01t0102600-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 145603 145644 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-30 N/A N/A transcript:Os01t0102600-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 147576 147847 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-29 N/A N/A transcript:Os01t0102600-01 N/A N/A N/A
|
||||
1 irgsp mRNA 147104 147805 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0102600-02 N/A N/A gene:Os01g0102600 N/A N/A Os01t0102600-02
|
||||
1 irgsp exon 147104 147184 . 1 . N/A N/A 0 N/A 1 -1 Os01t0102600-02.exon1 N/A Os01t0102600-02.exon1 N/A Os01t0102600-02.exon1 transcript:Os01t0102600-02 N/A 1 N/A
|
||||
1 irgsp exon 147311 147375 . 1 . N/A N/A 1 N/A 0 1 Os01t0102600-02.exon2 N/A agat-exon-4 N/A Os01t0102600-02.exon2 transcript:Os01t0102600-02 N/A 2 N/A
|
||||
1 irgsp exon 147507 147805 . 1 . N/A N/A 0 N/A -1 0 Os01t0102600-02.exon3 N/A Os01t0102600-02.exon3 N/A Os01t0102600-02.exon3 transcript:Os01t0102600-02 N/A 3 N/A
|
||||
1 irgsp CDS 147106 147184 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0102600-02 N/A N/A transcript:Os01t0102600-02 Os01t0102600-02 N/A N/A
|
||||
1 irgsp CDS 147311 147375 . 1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0102600-02 N/A N/A transcript:Os01t0102600-02 Os01t0102600-02 N/A N/A
|
||||
1 irgsp CDS 147507 147575 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0102600-02 N/A N/A transcript:Os01t0102600-02 Os01t0102600-02 N/A N/A
|
||||
1 irgsp five_prime_UTR 147104 147105 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-31 N/A N/A transcript:Os01t0102600-02 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 147576 147805 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-30 N/A N/A transcript:Os01t0102600-02 N/A N/A N/A
|
||||
1 irgsp gene 148085 150568 . 1 . N/A protein_coding N/A Translocon-associated beta family protein. (Os01t0102700-01) N/A N/A N/A Os01g0102700 gene:Os01g0102700 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 148085 150568 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0102700-01 N/A N/A gene:Os01g0102700 N/A N/A Os01t0102700-01
|
||||
1 irgsp exon 148085 148313 . 1 . N/A N/A 1 N/A 2 -1 Os01t0102700-01.exon1 N/A Os01t0102700-01.exon1 N/A Os01t0102700-01.exon1 transcript:Os01t0102700-01 N/A 1 N/A
|
||||
1 irgsp exon 149450 149548 . 1 . N/A N/A 1 N/A 2 2 Os01t0102700-01.exon2 N/A Os01t0102700-01.exon2 N/A Os01t0102700-01.exon2 transcript:Os01t0102700-01 N/A 2 N/A
|
||||
1 irgsp exon 149634 149742 . 1 . N/A N/A 1 N/A 0 2 Os01t0102700-01.exon3 N/A Os01t0102700-01.exon3 N/A Os01t0102700-01.exon3 transcript:Os01t0102700-01 N/A 3 N/A
|
||||
1 irgsp exon 149856 149931 . 1 . N/A N/A 1 N/A 1 0 Os01t0102700-01.exon4 N/A Os01t0102700-01.exon4 N/A Os01t0102700-01.exon4 transcript:Os01t0102700-01 N/A 4 N/A
|
||||
1 irgsp exon 150152 150568 . 1 . N/A N/A 1 N/A -1 1 Os01t0102700-01.exon5 N/A Os01t0102700-01.exon5 N/A Os01t0102700-01.exon5 transcript:Os01t0102700-01 N/A 5 N/A
|
||||
1 irgsp CDS 148147 148313 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0102700-01 N/A N/A transcript:Os01t0102700-01 Os01t0102700-01 N/A N/A
|
||||
1 irgsp CDS 149450 149548 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0102700-01 N/A N/A transcript:Os01t0102700-01 Os01t0102700-01 N/A N/A
|
||||
1 irgsp CDS 149634 149742 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0102700-01 N/A N/A transcript:Os01t0102700-01 Os01t0102700-01 N/A N/A
|
||||
1 irgsp CDS 149856 149931 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0102700-01 N/A N/A transcript:Os01t0102700-01 Os01t0102700-01 N/A N/A
|
||||
1 irgsp CDS 150152 150318 . 1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0102700-01 N/A N/A transcript:Os01t0102700-01 Os01t0102700-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 148085 148146 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-32 N/A N/A transcript:Os01t0102700-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 150319 150568 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-31 N/A N/A transcript:Os01t0102700-01 N/A N/A N/A
|
||||
1 irgsp gene 152853 156449 . 1 . N/A protein_coding N/A Similar to chromatin remodeling complex subunit. (Os01t0102800-01) N/A N/A N/A Os01g0102800 gene:Os01g0102800 irgspv1.0-20170804-genes Cockayne syndrome WD-repeat protein N/A N/A N/A N/A
|
||||
1 irgsp mRNA 152853 156449 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0102800-01 N/A N/A gene:Os01g0102800 N/A N/A Os01t0102800-01
|
||||
1 irgsp exon 152853 153025 . 1 . N/A N/A 1 N/A 1 -1 Os01t0102800-01.exon1 N/A Os01t0102800-01.exon1 N/A Os01t0102800-01.exon1 transcript:Os01t0102800-01 N/A 1 N/A
|
||||
1 irgsp exon 153178 154646 . 1 . N/A N/A 1 N/A 0 1 Os01t0102800-01.exon2 N/A Os01t0102800-01.exon2 N/A Os01t0102800-01.exon2 transcript:Os01t0102800-01 N/A 2 N/A
|
||||
1 irgsp exon 155010 155450 . 1 . N/A N/A 1 N/A 0 0 Os01t0102800-01.exon3 N/A Os01t0102800-01.exon3 N/A Os01t0102800-01.exon3 transcript:Os01t0102800-01 N/A 3 N/A
|
||||
1 irgsp exon 155543 156449 . 1 . N/A N/A 1 N/A -1 0 Os01t0102800-01.exon4 N/A Os01t0102800-01.exon4 N/A Os01t0102800-01.exon4 transcript:Os01t0102800-01 N/A 4 N/A
|
||||
1 irgsp CDS 152854 153025 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0102800-01 N/A N/A transcript:Os01t0102800-01 Os01t0102800-01 N/A N/A
|
||||
1 irgsp CDS 153178 154646 . 1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0102800-01 N/A N/A transcript:Os01t0102800-01 Os01t0102800-01 N/A N/A
|
||||
1 irgsp CDS 155010 155450 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0102800-01 N/A N/A transcript:Os01t0102800-01 Os01t0102800-01 N/A N/A
|
||||
1 irgsp CDS 155543 156214 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0102800-01 N/A N/A transcript:Os01t0102800-01 Os01t0102800-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 152853 152853 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-33 N/A N/A transcript:Os01t0102800-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 156215 156449 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-32 N/A N/A transcript:Os01t0102800-01 N/A N/A N/A
|
||||
1 irgsp gene 164577 168921 . 1 . N/A protein_coding N/A Similar to nitrilase 2. (Os01t0102850-00) N/A N/A N/A Os01g0102850 gene:Os01g0102850 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 164577 168921 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0102850-00 N/A N/A gene:Os01g0102850 N/A N/A Os01t0102850-00
|
||||
1 irgsp exon 164577 164905 . 1 . N/A N/A 1 N/A -1 -1 Os01t0102850-00.exon1 N/A Os01t0102850-00.exon1 N/A Os01t0102850-00.exon1 transcript:Os01t0102850-00 N/A 1 N/A
|
||||
1 irgsp exon 168499 168921 . 1 . N/A N/A 1 N/A 0 -1 Os01t0102850-00.exon2 N/A Os01t0102850-00.exon2 N/A Os01t0102850-00.exon2 transcript:Os01t0102850-00 N/A 2 N/A
|
||||
1 irgsp CDS 168805 168921 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0102850-00 N/A N/A transcript:Os01t0102850-00 Os01t0102850-00 N/A N/A
|
||||
1 irgsp five_prime_UTR 164577 164905 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-34 N/A N/A transcript:Os01t0102850-00 N/A N/A N/A
|
||||
1 irgsp five_prime_UTR 168499 168804 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-35 N/A N/A transcript:Os01t0102850-00 N/A N/A N/A
|
||||
1 irgsp gene 169390 170316 . -1 . N/A protein_coding N/A Light-regulated protein, Regulation of light-dependent attachment of LEAF-TYPE FERREDOXIN-NADP+ OXIDOREDUCTASE (LFNR) to the thylakoid membrane (Os01t0102900-01) N/A N/A N/A Os01g0102900 gene:Os01g0102900 irgspv1.0-20170804-genes LIGHT-REGULATED GENE 1 N/A N/A N/A N/A
|
||||
1 irgsp mRNA 169390 170316 . -1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0102900-01 N/A N/A gene:Os01g0102900 N/A N/A Os01t0102900-01
|
||||
1 irgsp exon 169390 169656 . -1 . N/A N/A 1 N/A -1 2 Os01t0102900-01.exon3 N/A Os01t0102900-01.exon3 N/A Os01t0102900-01.exon3 transcript:Os01t0102900-01 N/A 3 N/A
|
||||
1 irgsp exon 169751 169909 . -1 . N/A N/A 1 N/A 2 2 Os01t0102900-01.exon2 N/A Os01t0102900-01.exon2 N/A Os01t0102900-01.exon2 transcript:Os01t0102900-01 N/A 2 N/A
|
||||
1 irgsp exon 170091 170316 . -1 . N/A N/A 1 N/A 2 -1 Os01t0102900-01.exon1 N/A Os01t0102900-01.exon1 N/A Os01t0102900-01.exon1 transcript:Os01t0102900-01 N/A 1 N/A
|
||||
1 irgsp CDS 169599 169656 . -1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0102900-01 N/A N/A transcript:Os01t0102900-01 Os01t0102900-01 N/A N/A
|
||||
1 irgsp CDS 169751 169909 . -1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0102900-01 N/A N/A transcript:Os01t0102900-01 Os01t0102900-01 N/A N/A
|
||||
1 irgsp CDS 170091 170260 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0102900-01 N/A N/A transcript:Os01t0102900-01 Os01t0102900-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 170261 170316 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-36 N/A N/A transcript:Os01t0102900-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 169390 169598 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-33 N/A N/A transcript:Os01t0102900-01 N/A N/A N/A
|
||||
1 irgsp gene 170798 173144 . -1 . N/A protein_coding N/A Snf7 family protein. (Os01t0103000-01) N/A N/A N/A Os01g0103000 gene:Os01g0103000 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 170798 173144 . -1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0103000-01 N/A N/A gene:Os01g0103000 N/A N/A Os01t0103000-01
|
||||
1 irgsp exon 170798 171095 . -1 . N/A N/A 1 N/A -1 0 Os01t0103000-01.exon7 N/A Os01t0103000-01.exon7 N/A Os01t0103000-01.exon7 transcript:Os01t0103000-01 N/A 7 N/A
|
||||
1 irgsp exon 171406 171554 . -1 . N/A N/A 1 N/A 0 1 Os01t0103000-01.exon6 N/A Os01t0103000-01.exon6 N/A Os01t0103000-01.exon6 transcript:Os01t0103000-01 N/A 6 N/A
|
||||
1 irgsp exon 171764 171875 . -1 . N/A N/A 1 N/A 1 0 Os01t0103000-01.exon5 N/A Os01t0103000-01.exon5 N/A Os01t0103000-01.exon5 transcript:Os01t0103000-01 N/A 5 N/A
|
||||
1 irgsp exon 172398 172469 . -1 . N/A N/A 1 N/A 0 0 Os01t0103000-01.exon4 N/A Os01t0103000-01.exon4 N/A Os01t0103000-01.exon4 transcript:Os01t0103000-01 N/A 4 N/A
|
||||
1 irgsp exon 172578 172671 . -1 . N/A N/A 1 N/A 0 2 Os01t0103000-01.exon3 N/A Os01t0103000-01.exon3 N/A Os01t0103000-01.exon3 transcript:Os01t0103000-01 N/A 3 N/A
|
||||
1 irgsp exon 172770 172921 . -1 . N/A N/A 1 N/A 2 0 Os01t0103000-01.exon2 N/A Os01t0103000-01.exon2 N/A Os01t0103000-01.exon2 transcript:Os01t0103000-01 N/A 2 N/A
|
||||
1 irgsp exon 173004 173144 . -1 . N/A N/A 1 N/A 0 -1 Os01t0103000-01.exon1 N/A Os01t0103000-01.exon1 N/A Os01t0103000-01.exon1 transcript:Os01t0103000-01 N/A 1 N/A
|
||||
1 irgsp CDS 171045 171095 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103000-01 N/A N/A transcript:Os01t0103000-01 Os01t0103000-01 N/A N/A
|
||||
1 irgsp CDS 171406 171554 . -1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103000-01 N/A N/A transcript:Os01t0103000-01 Os01t0103000-01 N/A N/A
|
||||
1 irgsp CDS 171764 171875 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103000-01 N/A N/A transcript:Os01t0103000-01 Os01t0103000-01 N/A N/A
|
||||
1 irgsp CDS 172398 172469 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103000-01 N/A N/A transcript:Os01t0103000-01 Os01t0103000-01 N/A N/A
|
||||
1 irgsp CDS 172578 172671 . -1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103000-01 N/A N/A transcript:Os01t0103000-01 Os01t0103000-01 N/A N/A
|
||||
1 irgsp CDS 172770 172921 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103000-01 N/A N/A transcript:Os01t0103000-01 Os01t0103000-01 N/A N/A
|
||||
1 irgsp CDS 173004 173072 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103000-01 N/A N/A transcript:Os01t0103000-01 Os01t0103000-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 173073 173144 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-37 N/A N/A transcript:Os01t0103000-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 170798 171044 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-34 N/A N/A transcript:Os01t0103000-01 N/A N/A N/A
|
||||
1 irgsp gene 178607 180575 . 1 . N/A protein_coding N/A TGF-beta receptor, type I/II extracellular region family protein. (Os01t0103100-01);Similar to predicted protein. (Os01t0103100-02) N/A N/A N/A Os01g0103100 gene:Os01g0103100 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 178607 180548 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0103100-01 N/A N/A gene:Os01g0103100 N/A N/A Os01t0103100-01
|
||||
1 irgsp exon 178607 180548 . 1 . N/A N/A 0 N/A -1 -1 Os01t0103100-01.exon1 N/A Os01t0103100-01.exon1 N/A Os01t0103100-01.exon1 transcript:Os01t0103100-01 N/A 1 N/A
|
||||
1 irgsp CDS 178642 180462 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103100-01 N/A N/A transcript:Os01t0103100-01 Os01t0103100-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 178607 178641 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-38 N/A N/A transcript:Os01t0103100-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 180463 180548 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-35 N/A N/A transcript:Os01t0103100-01 N/A N/A N/A
|
||||
1 irgsp mRNA 178652 180575 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0103100-02 N/A N/A gene:Os01g0103100 N/A N/A Os01t0103100-02
|
||||
1 irgsp exon 178652 180575 . 1 . N/A N/A 0 N/A -1 -1 Os01t0103100-02.exon1 N/A Os01t0103100-02.exon1 N/A Os01t0103100-02.exon1 transcript:Os01t0103100-02 N/A 1 N/A
|
||||
1 irgsp CDS 178678 180462 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103100-02 N/A N/A transcript:Os01t0103100-02 Os01t0103100-02 N/A N/A
|
||||
1 irgsp five_prime_UTR 178652 178677 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-39 N/A N/A transcript:Os01t0103100-02 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 180463 180575 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-36 N/A N/A transcript:Os01t0103100-02 N/A N/A N/A
|
||||
1 irgsp gene 178815 180433 . -1 . N/A protein_coding N/A Hypothetical protein. (Os01t0103075-00) N/A N/A N/A Os01g0103075 gene:Os01g0103075 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 178815 180433 . -1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0103075-00 N/A N/A gene:Os01g0103075 N/A N/A Os01t0103075-00
|
||||
1 irgsp exon 178815 180433 . -1 . N/A N/A 1 N/A -1 -1 Os01t0103075-00.exon1 N/A Os01t0103075-00.exon1 N/A Os01t0103075-00.exon1 transcript:Os01t0103075-00 N/A 1 N/A
|
||||
1 irgsp CDS 179512 180054 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103075-00 N/A N/A transcript:Os01t0103075-00 Os01t0103075-00 N/A N/A
|
||||
1 irgsp five_prime_UTR 180055 180433 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-40 N/A N/A transcript:Os01t0103075-00 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 178815 179511 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-37 N/A N/A transcript:Os01t0103075-00 N/A N/A N/A
|
||||
1 Ensembl_Plants ncRNA_gene 182074 182154 . 1 . N/A tRNA N/A tRNA-Leu for anticodon AAG N/A N/A N/A ENSRNA049442722 gene:ENSRNA049442722 trnascan_gene tRNA-Leu N/A N/A N/A N/A
|
||||
1 Ensembl_Plants tRNA 182074 182154 . 1 . N/A tRNA N/A N/A N/A N/A N/A N/A transcript:ENSRNA049442722-T1 N/A N/A gene:ENSRNA049442722 N/A N/A ENSRNA049442722-T1
|
||||
1 Ensembl_Plants exon 182074 182154 . 1 . N/A N/A 1 N/A -1 -1 ENSRNA049442722-E1 N/A ENSRNA049442722-E1 N/A ENSRNA049442722-E1 transcript:ENSRNA049442722-T1 N/A 1 N/A
|
||||
1 irgsp gene 185189 185828 . -1 . N/A protein_coding N/A Hypothetical gene. (Os01t0103400-01) N/A N/A N/A Os01g0103400 gene:Os01g0103400 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 185189 185828 . -1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0103400-01 N/A N/A gene:Os01g0103400 N/A N/A Os01t0103400-01
|
||||
1 irgsp exon 185189 185828 . -1 . N/A N/A 1 N/A -1 -1 Os01t0103400-01.exon1 N/A Os01t0103400-01.exon1 N/A Os01t0103400-01.exon1 transcript:Os01t0103400-01 N/A 1 N/A
|
||||
1 irgsp CDS 185435 185827 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103400-01 N/A N/A transcript:Os01t0103400-01 Os01t0103400-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 185828 185828 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-41 N/A N/A transcript:Os01t0103400-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 185189 185434 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-38 N/A N/A transcript:Os01t0103400-01 N/A N/A N/A
|
||||
1 irgsp repeat_region 186000 186100 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A fakeRepeat2 N/A N/A N/A N/A N/A N/A
|
||||
1 irgsp gene 186250 190904 . -1 . N/A protein_coding N/A Similar to sterol-8,7-isomerase. (Os01t0103600-01);Emopamil-binding family protein. (Os01t0103600-02) N/A N/A N/A Os01g0103600 gene:Os01g0103600 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 186250 190262 . -1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0103600-02 N/A N/A gene:Os01g0103600 N/A N/A Os01t0103600-02
|
||||
1 irgsp exon 186250 186771 . -1 . N/A N/A 0 N/A -1 2 Os01t0103600-02.exon4 N/A Os01t0103600-02.exon4 N/A Os01t0103600-02.exon4 transcript:Os01t0103600-02 N/A 4 N/A
|
||||
1 irgsp exon 189607 189715 . -1 . N/A N/A 0 N/A 2 1 Os01t0103600-02.exon3 N/A Os01t0103600-02.exon3 N/A Os01t0103600-02.exon3 transcript:Os01t0103600-02 N/A 3 N/A
|
||||
1 irgsp exon 189841 189990 . -1 . N/A N/A 1 N/A 1 1 Os01t0103600-02.exon2 N/A Os01t0103600-02.exon2 N/A Os01t0103600-02.exon2 transcript:Os01t0103600-02 N/A 2 N/A
|
||||
1 irgsp exon 190087 190262 . -1 . N/A N/A 0 N/A 1 -1 Os01t0103600-02.exon1 N/A Os01t0103600-02.exon1 N/A Os01t0103600-02.exon1 transcript:Os01t0103600-02 N/A 1 N/A
|
||||
1 irgsp CDS 186516 186771 . -1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103600-02 N/A N/A transcript:Os01t0103600-02 Os01t0103600-02 N/A N/A
|
||||
1 irgsp CDS 189607 189715 . -1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103600-02 N/A N/A transcript:Os01t0103600-02 Os01t0103600-02 N/A N/A
|
||||
1 irgsp CDS 189841 189990 . -1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103600-02 N/A N/A transcript:Os01t0103600-02 Os01t0103600-02 N/A N/A
|
||||
1 irgsp CDS 190087 190231 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103600-02 N/A N/A transcript:Os01t0103600-02 Os01t0103600-02 N/A N/A
|
||||
1 irgsp five_prime_UTR 190232 190262 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-42 N/A N/A transcript:Os01t0103600-02 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 186250 186515 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-39 N/A N/A transcript:Os01t0103600-02 N/A N/A N/A
|
||||
1 irgsp mRNA 187345 190904 . -1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0103600-01 N/A N/A gene:Os01g0103600 N/A N/A Os01t0103600-01
|
||||
1 irgsp exon 187345 189715 . -1 . N/A N/A 0 N/A -1 1 Os01t0103600-01.exon3 N/A Os01t0103600-01.exon3 N/A Os01t0103600-01.exon3 transcript:Os01t0103600-01 N/A 3 N/A
|
||||
1 irgsp exon 189841 189990 . -1 . N/A N/A 1 N/A 1 1 Os01t0103600-02.exon2 N/A agat-exon-5 N/A Os01t0103600-02.exon2 transcript:Os01t0103600-01 N/A 2 N/A
|
||||
1 irgsp exon 190087 190904 . -1 . N/A N/A 0 N/A 1 -1 Os01t0103600-01.exon1 N/A Os01t0103600-01.exon1 N/A Os01t0103600-01.exon1 transcript:Os01t0103600-01 N/A 1 N/A
|
||||
1 irgsp CDS 189396 189715 . -1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103600-01 N/A N/A transcript:Os01t0103600-01 Os01t0103600-01 N/A N/A
|
||||
1 irgsp CDS 189841 189990 . -1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103600-01 N/A N/A transcript:Os01t0103600-01 Os01t0103600-01 N/A N/A
|
||||
1 irgsp CDS 190087 190231 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103600-01 N/A N/A transcript:Os01t0103600-01 Os01t0103600-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 190232 190904 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-43 N/A N/A transcript:Os01t0103600-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 187345 189395 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-40 N/A N/A transcript:Os01t0103600-01 N/A N/A N/A
|
||||
1 irgsp gene 187545 188586 . 1 . N/A protein_coding N/A Hypothetical gene. (Os01t0103650-00) N/A N/A N/A Os01g0103650 gene:Os01g0103650 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 187545 188586 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0103650-00 N/A N/A gene:Os01g0103650 N/A N/A Os01t0103650-00
|
||||
1 irgsp exon 187545 188020 . 1 . N/A N/A 1 N/A -1 -1 Os01t0103650-00.exon1 N/A Os01t0103650-00.exon1 N/A Os01t0103650-00.exon1 transcript:Os01t0103650-00 N/A 1 N/A
|
||||
1 irgsp exon 188060 188385 . 1 . N/A N/A 1 N/A -1 -1 Os01t0103650-00.exon2 N/A Os01t0103650-00.exon2 N/A Os01t0103650-00.exon2 transcript:Os01t0103650-00 N/A 2 N/A
|
||||
1 irgsp exon 188455 188586 . 1 . N/A N/A 1 N/A -1 -1 Os01t0103650-00.exon3 N/A Os01t0103650-00.exon3 N/A Os01t0103650-00.exon3 transcript:Os01t0103650-00 N/A 3 N/A
|
||||
1 irgsp CDS 187547 187768 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103650-00 N/A N/A transcript:Os01t0103650-00 Os01t0103650-00 N/A N/A
|
||||
1 irgsp five_prime_UTR 187545 187546 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-44 N/A N/A transcript:Os01t0103650-00 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 187769 188020 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-41 N/A N/A transcript:Os01t0103650-00 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 188060 188385 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-42 N/A N/A transcript:Os01t0103650-00 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 188455 188586 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-43 N/A N/A transcript:Os01t0103650-00 N/A N/A N/A
|
||||
1 irgsp gene 191037 196287 . 1 . N/A protein_coding N/A Conserved hypothetical protein. (Os01t0103700-01) N/A N/A N/A Os01g0103700 gene:Os01g0103700 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 191037 196287 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0103700-01 N/A N/A gene:Os01g0103700 N/A N/A Os01t0103700-01
|
||||
1 irgsp exon 191037 191161 . 1 . N/A N/A 1 N/A -1 -1 Os01t0103700-01.exon1 N/A Os01t0103700-01.exon1 N/A Os01t0103700-01.exon1 transcript:Os01t0103700-01 N/A 1 N/A
|
||||
1 irgsp exon 191625 191705 . 1 . N/A N/A 1 N/A 0 -1 Os01t0103700-01.exon2 N/A Os01t0103700-01.exon2 N/A Os01t0103700-01.exon2 transcript:Os01t0103700-01 N/A 2 N/A
|
||||
1 irgsp exon 192399 192506 . 1 . N/A N/A 1 N/A 0 0 Os01t0103700-01.exon3 N/A Os01t0103700-01.exon3 N/A Os01t0103700-01.exon3 transcript:Os01t0103700-01 N/A 3 N/A
|
||||
1 irgsp exon 192958 193161 . 1 . N/A N/A 1 N/A 0 0 Os01t0103700-01.exon4 N/A Os01t0103700-01.exon4 N/A Os01t0103700-01.exon4 transcript:Os01t0103700-01 N/A 4 N/A
|
||||
1 irgsp exon 193248 193356 . 1 . N/A N/A 1 N/A 1 0 Os01t0103700-01.exon5 N/A Os01t0103700-01.exon5 N/A Os01t0103700-01.exon5 transcript:Os01t0103700-01 N/A 5 N/A
|
||||
1 irgsp exon 193434 196287 . 1 . N/A N/A 1 N/A -1 1 Os01t0103700-01.exon6 N/A Os01t0103700-01.exon6 N/A Os01t0103700-01.exon6 transcript:Os01t0103700-01 N/A 6 N/A
|
||||
1 irgsp CDS 191694 191705 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103700-01 N/A N/A transcript:Os01t0103700-01 Os01t0103700-01 N/A N/A
|
||||
1 irgsp CDS 192399 192506 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103700-01 N/A N/A transcript:Os01t0103700-01 Os01t0103700-01 N/A N/A
|
||||
1 irgsp CDS 192958 193161 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103700-01 N/A N/A transcript:Os01t0103700-01 Os01t0103700-01 N/A N/A
|
||||
1 irgsp CDS 193248 193356 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103700-01 N/A N/A transcript:Os01t0103700-01 Os01t0103700-01 N/A N/A
|
||||
1 irgsp CDS 193434 193507 . 1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103700-01 N/A N/A transcript:Os01t0103700-01 Os01t0103700-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 191037 191161 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-45 N/A N/A transcript:Os01t0103700-01 N/A N/A N/A
|
||||
1 irgsp five_prime_UTR 191625 191693 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-46 N/A N/A transcript:Os01t0103700-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 193508 196287 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-44 N/A N/A transcript:Os01t0103700-01 N/A N/A N/A
|
||||
1 irgsp gene 197647 200803 . 1 . N/A protein_coding N/A Conserved hypothetical protein. (Os01t0103800-01) N/A N/A N/A Os01g0103800 gene:Os01g0103800 irgspv1.0-20170804-genes OsDW1-01g N/A N/A N/A N/A
|
||||
1 irgsp mRNA 197647 200803 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0103800-01 N/A N/A gene:Os01g0103800 N/A N/A Os01t0103800-01
|
||||
1 irgsp exon 197647 197838 . 1 . N/A N/A 1 N/A -1 -1 Os01t0103800-01.exon1 N/A Os01t0103800-01.exon1 N/A Os01t0103800-01.exon1 transcript:Os01t0103800-01 N/A 1 N/A
|
||||
1 irgsp exon 198034 198225 . 1 . N/A N/A 1 N/A 0 -1 Os01t0103800-01.exon2 N/A Os01t0103800-01.exon2 N/A Os01t0103800-01.exon2 transcript:Os01t0103800-01 N/A 2 N/A
|
||||
1 irgsp exon 198830 200036 . 1 . N/A N/A 1 N/A 1 0 Os01t0103800-01.exon3 N/A Os01t0103800-01.exon3 N/A Os01t0103800-01.exon3 transcript:Os01t0103800-01 N/A 3 N/A
|
||||
1 irgsp exon 200253 200803 . 1 . N/A N/A 1 N/A -1 1 Os01t0103800-01.exon4 N/A Os01t0103800-01.exon4 N/A Os01t0103800-01.exon4 transcript:Os01t0103800-01 N/A 4 N/A
|
||||
1 irgsp CDS 198130 198225 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103800-01 N/A N/A transcript:Os01t0103800-01 Os01t0103800-01 N/A N/A
|
||||
1 irgsp CDS 198830 200036 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103800-01 N/A N/A transcript:Os01t0103800-01 Os01t0103800-01 N/A N/A
|
||||
1 irgsp CDS 200253 200479 . 1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103800-01 N/A N/A transcript:Os01t0103800-01 Os01t0103800-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 197647 197838 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-47 N/A N/A transcript:Os01t0103800-01 N/A N/A N/A
|
||||
1 irgsp five_prime_UTR 198034 198129 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-48 N/A N/A transcript:Os01t0103800-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 200480 200803 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-45 N/A N/A transcript:Os01t0103800-01 N/A N/A N/A
|
||||
1 irgsp gene 201944 206202 . 1 . N/A protein_coding N/A Polynucleotidyl transferase, Ribonuclease H fold domain containing protein. (Os01t0103900-01) N/A N/A N/A Os01g0103900 gene:Os01g0103900 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 201944 206202 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0103900-01 N/A N/A gene:Os01g0103900 N/A N/A Os01t0103900-01
|
||||
1 irgsp exon 201944 202110 . 1 . N/A N/A 1 N/A 0 -1 Os01t0103900-01.exon1 N/A Os01t0103900-01.exon1 N/A Os01t0103900-01.exon1 transcript:Os01t0103900-01 N/A 1 N/A
|
||||
1 irgsp exon 202252 202359 . 1 . N/A N/A 1 N/A 0 0 Os01t0103900-01.exon2 N/A Os01t0103900-01.exon2 N/A Os01t0103900-01.exon2 transcript:Os01t0103900-01 N/A 2 N/A
|
||||
1 irgsp exon 203007 203127 . 1 . N/A N/A 1 N/A 1 0 Os01t0103900-01.exon3 N/A Os01t0103900-01.exon3 N/A Os01t0103900-01.exon3 transcript:Os01t0103900-01 N/A 3 N/A
|
||||
1 irgsp exon 203302 203429 . 1 . N/A N/A 1 N/A 0 1 Os01t0103900-01.exon4 N/A Os01t0103900-01.exon4 N/A Os01t0103900-01.exon4 transcript:Os01t0103900-01 N/A 4 N/A
|
||||
1 irgsp exon 203511 203658 . 1 . N/A N/A 1 N/A 1 0 Os01t0103900-01.exon5 N/A Os01t0103900-01.exon5 N/A Os01t0103900-01.exon5 transcript:Os01t0103900-01 N/A 5 N/A
|
||||
1 irgsp exon 203760 203938 . 1 . N/A N/A 1 N/A 0 1 Os01t0103900-01.exon6 N/A Os01t0103900-01.exon6 N/A Os01t0103900-01.exon6 transcript:Os01t0103900-01 N/A 6 N/A
|
||||
1 irgsp exon 204203 204440 . 1 . N/A N/A 1 N/A 1 0 Os01t0103900-01.exon7 N/A Os01t0103900-01.exon7 N/A Os01t0103900-01.exon7 transcript:Os01t0103900-01 N/A 7 N/A
|
||||
1 irgsp exon 204543 204635 . 1 . N/A N/A 1 N/A 1 1 Os01t0103900-01.exon8 N/A Os01t0103900-01.exon8 N/A Os01t0103900-01.exon8 transcript:Os01t0103900-01 N/A 8 N/A
|
||||
1 irgsp exon 204730 204875 . 1 . N/A N/A 1 N/A 0 1 Os01t0103900-01.exon9 N/A Os01t0103900-01.exon9 N/A Os01t0103900-01.exon9 transcript:Os01t0103900-01 N/A 9 N/A
|
||||
1 irgsp exon 205042 205149 . 1 . N/A N/A 1 N/A 0 0 Os01t0103900-01.exon10 N/A Os01t0103900-01.exon10 N/A Os01t0103900-01.exon10 transcript:Os01t0103900-01 N/A 10 N/A
|
||||
1 irgsp exon 205290 205378 . 1 . N/A N/A 1 N/A 2 0 Os01t0103900-01.exon11 N/A Os01t0103900-01.exon11 N/A Os01t0103900-01.exon11 transcript:Os01t0103900-01 N/A 11 N/A
|
||||
1 irgsp exon 205534 206202 . 1 . N/A N/A 1 N/A -1 2 Os01t0103900-01.exon12 N/A Os01t0103900-01.exon12 N/A Os01t0103900-01.exon12 transcript:Os01t0103900-01 N/A 12 N/A
|
||||
1 irgsp CDS 202042 202110 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103900-01 N/A N/A transcript:Os01t0103900-01 Os01t0103900-01 N/A N/A
|
||||
1 irgsp CDS 202252 202359 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103900-01 N/A N/A transcript:Os01t0103900-01 Os01t0103900-01 N/A N/A
|
||||
1 irgsp CDS 203007 203127 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103900-01 N/A N/A transcript:Os01t0103900-01 Os01t0103900-01 N/A N/A
|
||||
1 irgsp CDS 203302 203429 . 1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103900-01 N/A N/A transcript:Os01t0103900-01 Os01t0103900-01 N/A N/A
|
||||
1 irgsp CDS 203511 203658 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103900-01 N/A N/A transcript:Os01t0103900-01 Os01t0103900-01 N/A N/A
|
||||
1 irgsp CDS 203760 203938 . 1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103900-01 N/A N/A transcript:Os01t0103900-01 Os01t0103900-01 N/A N/A
|
||||
1 irgsp CDS 204203 204440 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103900-01 N/A N/A transcript:Os01t0103900-01 Os01t0103900-01 N/A N/A
|
||||
1 irgsp CDS 204543 204635 . 1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103900-01 N/A N/A transcript:Os01t0103900-01 Os01t0103900-01 N/A N/A
|
||||
1 irgsp CDS 204730 204875 . 1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103900-01 N/A N/A transcript:Os01t0103900-01 Os01t0103900-01 N/A N/A
|
||||
1 irgsp CDS 205042 205149 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103900-01 N/A N/A transcript:Os01t0103900-01 Os01t0103900-01 N/A N/A
|
||||
1 irgsp CDS 205290 205378 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103900-01 N/A N/A transcript:Os01t0103900-01 Os01t0103900-01 N/A N/A
|
||||
1 irgsp CDS 205534 205543 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0103900-01 N/A N/A transcript:Os01t0103900-01 Os01t0103900-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 201944 202041 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-49 N/A N/A transcript:Os01t0103900-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 205544 206202 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-46 N/A N/A transcript:Os01t0103900-01 N/A N/A N/A
|
||||
1 irgsp gene 206131 209606 . -1 . N/A protein_coding N/A C-type lectin domain containing protein. (Os01t0104000-01);Similar to predicted protein. (Os01t0104000-02) N/A N/A N/A Os01g0104000 gene:Os01g0104000 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 206131 209581 . -1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0104000-02 N/A N/A gene:Os01g0104000 N/A N/A Os01t0104000-02
|
||||
1 irgsp exon 206131 207029 . -1 . N/A N/A 0 N/A -1 2 Os01t0104000-02.exon4 N/A Os01t0104000-02.exon4 N/A Os01t0104000-02.exon4 transcript:Os01t0104000-02 N/A 4 N/A
|
||||
1 irgsp exon 207706 208273 . -1 . N/A N/A 0 N/A 2 1 Os01t0104000-02.exon3 N/A Os01t0104000-02.exon3 N/A Os01t0104000-02.exon3 transcript:Os01t0104000-02 N/A 3 N/A
|
||||
1 irgsp exon 208408 208836 . -1 . N/A N/A 1 N/A 1 1 Os01t0104000-01.exon2 N/A Os01t0104000-01.exon2 N/A Os01t0104000-01.exon2 transcript:Os01t0104000-02 N/A 2 N/A
|
||||
1 irgsp exon 209438 209581 . -1 . N/A N/A 0 N/A 1 -1 Os01t0104000-02.exon1 N/A Os01t0104000-02.exon1 N/A Os01t0104000-02.exon1 transcript:Os01t0104000-02 N/A 1 N/A
|
||||
1 irgsp CDS 206450 207029 . -1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104000-02 N/A N/A transcript:Os01t0104000-02 Os01t0104000-02 N/A N/A
|
||||
1 irgsp CDS 207706 208273 . -1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104000-02 N/A N/A transcript:Os01t0104000-02 Os01t0104000-02 N/A N/A
|
||||
1 irgsp CDS 208408 208836 . -1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104000-02 N/A N/A transcript:Os01t0104000-02 Os01t0104000-02 N/A N/A
|
||||
1 irgsp CDS 209438 209525 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104000-02 N/A N/A transcript:Os01t0104000-02 Os01t0104000-02 N/A N/A
|
||||
1 irgsp five_prime_UTR 209526 209581 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-50 N/A N/A transcript:Os01t0104000-02 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 206131 206449 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-47 N/A N/A transcript:Os01t0104000-02 N/A N/A N/A
|
||||
1 irgsp mRNA 206134 209606 . -1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0104000-01 N/A N/A gene:Os01g0104000 N/A N/A Os01t0104000-01
|
||||
1 irgsp exon 206134 207029 . -1 . N/A N/A 0 N/A -1 2 Os01t0104000-01.exon4 N/A Os01t0104000-01.exon4 N/A Os01t0104000-01.exon4 transcript:Os01t0104000-01 N/A 4 N/A
|
||||
1 irgsp exon 207706 208276 . -1 . N/A N/A 0 N/A 2 1 Os01t0104000-01.exon3 N/A Os01t0104000-01.exon3 N/A Os01t0104000-01.exon3 transcript:Os01t0104000-01 N/A 3 N/A
|
||||
1 irgsp exon 208408 208836 . -1 . N/A N/A 1 N/A 1 1 Os01t0104000-01.exon2 N/A agat-exon-6 N/A Os01t0104000-01.exon2 transcript:Os01t0104000-01 N/A 2 N/A
|
||||
1 irgsp exon 209438 209606 . -1 . N/A N/A 0 N/A 1 -1 Os01t0104000-01.exon1 N/A Os01t0104000-01.exon1 N/A Os01t0104000-01.exon1 transcript:Os01t0104000-01 N/A 1 N/A
|
||||
1 irgsp CDS 206450 207029 . -1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104000-01 N/A N/A transcript:Os01t0104000-01 Os01t0104000-01 N/A N/A
|
||||
1 irgsp CDS 207706 208276 . -1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104000-01 N/A N/A transcript:Os01t0104000-01 Os01t0104000-01 N/A N/A
|
||||
1 irgsp CDS 208408 208836 . -1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104000-01 N/A N/A transcript:Os01t0104000-01 Os01t0104000-01 N/A N/A
|
||||
1 irgsp CDS 209438 209525 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104000-01 N/A N/A transcript:Os01t0104000-01 Os01t0104000-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 209526 209606 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-51 N/A N/A transcript:Os01t0104000-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 206134 206449 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-48 N/A N/A transcript:Os01t0104000-01 N/A N/A N/A
|
||||
1 irgsp gene 209771 214173 . 1 . N/A protein_coding N/A Similar to protein binding / zinc ion binding. (Os01t0104100-01);Similar to protein binding / zinc ion binding. (Os01t0104100-02) N/A N/A N/A Os01g0104100 gene:Os01g0104100 irgspv1.0-20170804-genes cold-inducible, cold-inducible zinc finger protein N/A N/A N/A N/A
|
||||
1 irgsp mRNA 209771 214173 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0104100-01 N/A N/A gene:Os01g0104100 N/A N/A Os01t0104100-01
|
||||
1 irgsp exon 209771 209896 . 1 . N/A N/A 0 N/A 0 0 Os01t0104100-01.exon1 N/A Os01t0104100-01.exon1 N/A Os01t0104100-01.exon1 transcript:Os01t0104100-01 N/A 1 N/A
|
||||
1 irgsp exon 210244 210563 . 1 . N/A N/A 1 N/A 2 0 Os01t0104100-01.exon2 N/A Os01t0104100-01.exon2 N/A Os01t0104100-01.exon2 transcript:Os01t0104100-01 N/A 2 N/A
|
||||
1 irgsp exon 210659 210890 . 1 . N/A N/A 1 N/A 0 2 Os01t0104100-01.exon3 N/A Os01t0104100-01.exon3 N/A Os01t0104100-01.exon3 transcript:Os01t0104100-01 N/A 3 N/A
|
||||
1 irgsp exon 211015 211160 . 1 . N/A N/A 1 N/A 2 0 Os01t0104100-01.exon4 N/A Os01t0104100-01.exon4 N/A Os01t0104100-01.exon4 transcript:Os01t0104100-01 N/A 4 N/A
|
||||
1 irgsp exon 212265 212352 . 1 . N/A N/A 1 N/A 0 2 Os01t0104100-01.exon5 N/A Os01t0104100-01.exon5 N/A Os01t0104100-01.exon5 transcript:Os01t0104100-01 N/A 5 N/A
|
||||
1 irgsp exon 212433 212579 . 1 . N/A N/A 1 N/A 0 0 Os01t0104100-01.exon6 N/A Os01t0104100-01.exon6 N/A Os01t0104100-01.exon6 transcript:Os01t0104100-01 N/A 6 N/A
|
||||
1 irgsp exon 213490 213639 . 1 . N/A N/A 1 N/A 0 0 Os01t0104100-01.exon7 N/A Os01t0104100-01.exon7 N/A Os01t0104100-01.exon7 transcript:Os01t0104100-01 N/A 7 N/A
|
||||
1 irgsp exon 213741 214173 . 1 . N/A N/A 0 N/A -1 0 Os01t0104100-01.exon8 N/A Os01t0104100-01.exon8 N/A Os01t0104100-01.exon8 transcript:Os01t0104100-01 N/A 8 N/A
|
||||
1 irgsp CDS 209771 209896 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104100-01 N/A N/A transcript:Os01t0104100-01 Os01t0104100-01 N/A N/A
|
||||
1 irgsp CDS 210244 210563 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104100-01 N/A N/A transcript:Os01t0104100-01 Os01t0104100-01 N/A N/A
|
||||
1 irgsp CDS 210659 210890 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104100-01 N/A N/A transcript:Os01t0104100-01 Os01t0104100-01 N/A N/A
|
||||
1 irgsp CDS 211015 211160 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104100-01 N/A N/A transcript:Os01t0104100-01 Os01t0104100-01 N/A N/A
|
||||
1 irgsp CDS 212265 212352 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104100-01 N/A N/A transcript:Os01t0104100-01 Os01t0104100-01 N/A N/A
|
||||
1 irgsp CDS 212433 212579 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104100-01 N/A N/A transcript:Os01t0104100-01 Os01t0104100-01 N/A N/A
|
||||
1 irgsp CDS 213490 213639 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104100-01 N/A N/A transcript:Os01t0104100-01 Os01t0104100-01 N/A N/A
|
||||
1 irgsp CDS 213741 213788 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104100-01 N/A N/A transcript:Os01t0104100-01 Os01t0104100-01 N/A N/A
|
||||
1 irgsp three_prime_UTR 213789 214173 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-49 N/A N/A transcript:Os01t0104100-01 N/A N/A N/A
|
||||
1 irgsp mRNA 209794 214147 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0104100-02 N/A N/A gene:Os01g0104100 N/A N/A Os01t0104100-02
|
||||
1 irgsp exon 209794 209896 . 1 . N/A N/A 0 N/A 0 -1 Os01t0104100-02.exon1 N/A Os01t0104100-02.exon1 N/A Os01t0104100-02.exon1 transcript:Os01t0104100-02 N/A 1 N/A
|
||||
1 irgsp exon 210244 210563 . 1 . N/A N/A 1 N/A 2 0 Os01t0104100-01.exon2 N/A agat-exon-7 N/A Os01t0104100-01.exon2 transcript:Os01t0104100-02 N/A 2 N/A
|
||||
1 irgsp exon 210659 210890 . 1 . N/A N/A 1 N/A 0 2 Os01t0104100-01.exon3 N/A agat-exon-8 N/A Os01t0104100-01.exon3 transcript:Os01t0104100-02 N/A 3 N/A
|
||||
1 irgsp exon 211015 211160 . 1 . N/A N/A 1 N/A 2 0 Os01t0104100-01.exon4 N/A agat-exon-9 N/A Os01t0104100-01.exon4 transcript:Os01t0104100-02 N/A 4 N/A
|
||||
1 irgsp exon 212265 212352 . 1 . N/A N/A 1 N/A 0 2 Os01t0104100-01.exon5 N/A agat-exon-10 N/A Os01t0104100-01.exon5 transcript:Os01t0104100-02 N/A 5 N/A
|
||||
1 irgsp exon 212433 212579 . 1 . N/A N/A 1 N/A 0 0 Os01t0104100-01.exon6 N/A agat-exon-11 N/A Os01t0104100-01.exon6 transcript:Os01t0104100-02 N/A 6 N/A
|
||||
1 irgsp exon 213490 213639 . 1 . N/A N/A 1 N/A 0 0 Os01t0104100-01.exon7 N/A agat-exon-12 N/A Os01t0104100-01.exon7 transcript:Os01t0104100-02 N/A 7 N/A
|
||||
1 irgsp exon 213741 214147 . 1 . N/A N/A 0 N/A -1 0 Os01t0104100-02.exon8 N/A Os01t0104100-02.exon8 N/A Os01t0104100-02.exon8 transcript:Os01t0104100-02 N/A 8 N/A
|
||||
1 irgsp CDS 209795 209896 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104100-02 N/A N/A transcript:Os01t0104100-02 Os01t0104100-02 N/A N/A
|
||||
1 irgsp CDS 210244 210563 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104100-02 N/A N/A transcript:Os01t0104100-02 Os01t0104100-02 N/A N/A
|
||||
1 irgsp CDS 210659 210890 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104100-02 N/A N/A transcript:Os01t0104100-02 Os01t0104100-02 N/A N/A
|
||||
1 irgsp CDS 211015 211160 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104100-02 N/A N/A transcript:Os01t0104100-02 Os01t0104100-02 N/A N/A
|
||||
1 irgsp CDS 212265 212352 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104100-02 N/A N/A transcript:Os01t0104100-02 Os01t0104100-02 N/A N/A
|
||||
1 irgsp CDS 212433 212579 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104100-02 N/A N/A transcript:Os01t0104100-02 Os01t0104100-02 N/A N/A
|
||||
1 irgsp CDS 213490 213639 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104100-02 N/A N/A transcript:Os01t0104100-02 Os01t0104100-02 N/A N/A
|
||||
1 irgsp CDS 213741 213788 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104100-02 N/A N/A transcript:Os01t0104100-02 Os01t0104100-02 N/A N/A
|
||||
1 irgsp five_prime_UTR 209794 209794 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-52 N/A N/A transcript:Os01t0104100-02 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 213789 214147 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-50 N/A N/A transcript:Os01t0104100-02 N/A N/A N/A
|
||||
1 irgsp gene 216212 217345 . 1 . N/A protein_coding N/A No apical meristem (NAM) protein domain containing protein. (Os01t0104200-00) N/A N/A N/A Os01g0104200 gene:Os01g0104200 irgspv1.0-20170804-genes NAC DOMAIN-CONTAINING PROTEIN 16 N/A N/A N/A N/A
|
||||
1 irgsp mRNA 216212 217345 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0104200-00 N/A N/A gene:Os01g0104200 N/A N/A Os01t0104200-00
|
||||
1 irgsp exon 216212 216769 . 1 . N/A N/A 1 N/A 0 0 Os01t0104200-00.exon1 N/A Os01t0104200-00.exon1 N/A Os01t0104200-00.exon1 transcript:Os01t0104200-00 N/A 1 N/A
|
||||
1 irgsp exon 216884 217345 . 1 . N/A N/A 1 N/A 0 0 Os01t0104200-00.exon2 N/A Os01t0104200-00.exon2 N/A Os01t0104200-00.exon2 transcript:Os01t0104200-00 N/A 2 N/A
|
||||
1 irgsp CDS 216212 216769 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104200-00 N/A N/A transcript:Os01t0104200-00 Os01t0104200-00 N/A N/A
|
||||
1 irgsp CDS 216884 217345 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104200-00 N/A N/A transcript:Os01t0104200-00 Os01t0104200-00 N/A N/A
|
||||
1 irgsp gene 226897 229301 . 1 . N/A protein_coding N/A Ricin B-related lectin domain containing protein. (Os01t0104400-01);Ricin B-related lectin domain containing protein. (Os01t0104400-02);Ricin B-related lectin domain containing protein. (Os01t0104400-03) N/A N/A N/A Os01g0104400 gene:Os01g0104400 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 226897 229229 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0104400-01 N/A N/A gene:Os01g0104400 N/A N/A Os01t0104400-01
|
||||
1 irgsp exon 226897 227634 . 1 . N/A N/A 0 N/A 0 -1 Os01t0104400-01.exon1 N/A Os01t0104400-01.exon1 N/A Os01t0104400-01.exon1 transcript:Os01t0104400-01 N/A 1 N/A
|
||||
1 irgsp exon 227742 227864 . 1 . N/A N/A 1 N/A 0 0 Os01t0104400-03.exon2 N/A Os01t0104400-03.exon2 N/A Os01t0104400-03.exon2 transcript:Os01t0104400-01 N/A 2 N/A
|
||||
1 irgsp exon 228557 228785 . 1 . N/A N/A 1 N/A 1 0 Os01t0104400-03.exon3 N/A Os01t0104400-03.exon3 N/A Os01t0104400-03.exon3 transcript:Os01t0104400-01 N/A 3 N/A
|
||||
1 irgsp exon 228930 229229 . 1 . N/A N/A 0 N/A -1 1 Os01t0104400-01.exon4 N/A Os01t0104400-01.exon4 N/A Os01t0104400-01.exon4 transcript:Os01t0104400-01 N/A 4 N/A
|
||||
1 irgsp CDS 227182 227634 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104400-01 N/A N/A transcript:Os01t0104400-01 Os01t0104400-01 N/A N/A
|
||||
1 irgsp CDS 227742 227864 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104400-01 N/A N/A transcript:Os01t0104400-01 Os01t0104400-01 N/A N/A
|
||||
1 irgsp CDS 228557 228785 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104400-01 N/A N/A transcript:Os01t0104400-01 Os01t0104400-01 N/A N/A
|
||||
1 irgsp CDS 228930 228931 . 1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104400-01 N/A N/A transcript:Os01t0104400-01 Os01t0104400-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 226897 227181 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-53 N/A N/A transcript:Os01t0104400-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 228932 229229 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-51 N/A N/A transcript:Os01t0104400-01 N/A N/A N/A
|
||||
1 irgsp mRNA 227139 229301 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0104400-02 N/A N/A gene:Os01g0104400 N/A N/A Os01t0104400-02
|
||||
1 irgsp exon 227139 227634 . 1 . N/A N/A 0 N/A 0 -1 Os01t0104400-02.exon1 N/A Os01t0104400-02.exon1 N/A Os01t0104400-02.exon1 transcript:Os01t0104400-02 N/A 1 N/A
|
||||
1 irgsp exon 227742 227864 . 1 . N/A N/A 1 N/A 0 0 Os01t0104400-03.exon2 N/A agat-exon-13 N/A Os01t0104400-03.exon2 transcript:Os01t0104400-02 N/A 2 N/A
|
||||
1 irgsp exon 228557 228785 . 1 . N/A N/A 1 N/A 1 0 Os01t0104400-03.exon3 N/A agat-exon-14 N/A Os01t0104400-03.exon3 transcript:Os01t0104400-02 N/A 3 N/A
|
||||
1 irgsp exon 228930 229301 . 1 . N/A N/A 0 N/A -1 1 Os01t0104400-02.exon4 N/A Os01t0104400-02.exon4 N/A Os01t0104400-02.exon4 transcript:Os01t0104400-02 N/A 4 N/A
|
||||
1 irgsp CDS 227182 227634 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104400-02 N/A N/A transcript:Os01t0104400-02 Os01t0104400-02 N/A N/A
|
||||
1 irgsp CDS 227742 227864 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104400-02 N/A N/A transcript:Os01t0104400-02 Os01t0104400-02 N/A N/A
|
||||
1 irgsp CDS 228557 228785 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104400-02 N/A N/A transcript:Os01t0104400-02 Os01t0104400-02 N/A N/A
|
||||
1 irgsp CDS 228930 228931 . 1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104400-02 N/A N/A transcript:Os01t0104400-02 Os01t0104400-02 N/A N/A
|
||||
1 irgsp five_prime_UTR 227139 227181 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-54 N/A N/A transcript:Os01t0104400-02 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 228932 229301 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-52 N/A N/A transcript:Os01t0104400-02 N/A N/A N/A
|
||||
1 irgsp mRNA 227179 229214 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0104400-03 N/A N/A gene:Os01g0104400 N/A N/A Os01t0104400-03
|
||||
1 irgsp exon 227179 227634 . 1 . N/A N/A 0 N/A 0 -1 Os01t0104400-03.exon1 N/A Os01t0104400-03.exon1 N/A Os01t0104400-03.exon1 transcript:Os01t0104400-03 N/A 1 N/A
|
||||
1 irgsp exon 227742 227864 . 1 . N/A N/A 1 N/A 0 0 Os01t0104400-03.exon2 N/A agat-exon-15 N/A Os01t0104400-03.exon2 transcript:Os01t0104400-03 N/A 2 N/A
|
||||
1 irgsp exon 228557 228785 . 1 . N/A N/A 1 N/A 1 0 Os01t0104400-03.exon3 N/A agat-exon-16 N/A Os01t0104400-03.exon3 transcript:Os01t0104400-03 N/A 3 N/A
|
||||
1 irgsp exon 228930 229214 . 1 . N/A N/A 0 N/A -1 1 Os01t0104400-03.exon4 N/A Os01t0104400-03.exon4 N/A Os01t0104400-03.exon4 transcript:Os01t0104400-03 N/A 4 N/A
|
||||
1 irgsp CDS 227182 227634 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104400-03 N/A N/A transcript:Os01t0104400-03 Os01t0104400-03 N/A N/A
|
||||
1 irgsp CDS 227742 227864 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104400-03 N/A N/A transcript:Os01t0104400-03 Os01t0104400-03 N/A N/A
|
||||
1 irgsp CDS 228557 228785 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104400-03 N/A N/A transcript:Os01t0104400-03 Os01t0104400-03 N/A N/A
|
||||
1 irgsp CDS 228930 228931 . 1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104400-03 N/A N/A transcript:Os01t0104400-03 Os01t0104400-03 N/A N/A
|
||||
1 irgsp five_prime_UTR 227179 227181 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-55 N/A N/A transcript:Os01t0104400-03 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 228932 229214 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-53 N/A N/A transcript:Os01t0104400-03 N/A N/A N/A
|
||||
1 irgsp gene 241680 243440 . 1 . N/A protein_coding N/A No apical meristem (NAM) protein domain containing protein. (Os01t0104500-01) N/A N/A N/A Os01g0104500 gene:Os01g0104500 irgspv1.0-20170804-genes NAC DOMAIN-CONTAINING PROTEIN 20 N/A N/A N/A N/A
|
||||
1 irgsp mRNA 241680 243440 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0104500-01 N/A N/A gene:Os01g0104500 N/A N/A Os01t0104500-01
|
||||
1 irgsp exon 241680 241702 . 1 . N/A N/A 1 N/A -1 -1 Os01t0104500-01.exon1 N/A Os01t0104500-01.exon1 N/A Os01t0104500-01.exon1 transcript:Os01t0104500-01 N/A 1 N/A
|
||||
1 irgsp exon 241866 242091 . 1 . N/A N/A 1 N/A 1 -1 Os01t0104500-01.exon2 N/A Os01t0104500-01.exon2 N/A Os01t0104500-01.exon2 transcript:Os01t0104500-01 N/A 2 N/A
|
||||
1 irgsp exon 242199 243440 . 1 . N/A N/A 1 N/A -1 1 Os01t0104500-01.exon3 N/A Os01t0104500-01.exon3 N/A Os01t0104500-01.exon3 transcript:Os01t0104500-01 N/A 3 N/A
|
||||
1 irgsp CDS 241908 242091 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104500-01 N/A N/A transcript:Os01t0104500-01 Os01t0104500-01 N/A N/A
|
||||
1 irgsp CDS 242199 242977 . 1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104500-01 N/A N/A transcript:Os01t0104500-01 Os01t0104500-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 241680 241702 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-56 N/A N/A transcript:Os01t0104500-01 N/A N/A N/A
|
||||
1 irgsp five_prime_UTR 241866 241907 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-57 N/A N/A transcript:Os01t0104500-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 242978 243440 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-54 N/A N/A transcript:Os01t0104500-01 N/A N/A N/A
|
||||
1 irgsp gene 248828 256872 . -1 . N/A protein_coding N/A Homolog of Arabidopsis DE-ETIOLATED1 (DET1), Modulation of the ABA signaling pathway and ABA biosynthesis, Regulation of chlorophyll content (Os01t0104600-01);Similar to Light-mediated development protein DET1 (Deetiolated1 homolog) (tDET1) (High pigmentation protein 2) (Protein dark green). (Os01t0104600-02) N/A N/A N/A Os01g0104600 gene:Os01g0104600 irgspv1.0-20170804-genes DE-ETIOLATED1 N/A N/A N/A N/A
|
||||
1 irgsp mRNA 248828 256571 . -1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0104600-02 N/A N/A gene:Os01g0104600 N/A N/A Os01t0104600-02
|
||||
1 irgsp exon 248828 249107 . -1 . N/A N/A 1 N/A -1 1 Os01t0104600-01.exon11 N/A Os01t0104600-01.exon11 N/A Os01t0104600-01.exon11 transcript:Os01t0104600-02 N/A 11 N/A
|
||||
1 irgsp exon 249369 249468 . -1 . N/A N/A 1 N/A 1 0 Os01t0104600-01.exon10 N/A Os01t0104600-01.exon10 N/A Os01t0104600-01.exon10 transcript:Os01t0104600-02 N/A 10 N/A
|
||||
1 irgsp exon 249861 249956 . -1 . N/A N/A 1 N/A 0 0 Os01t0104600-01.exon9 N/A Os01t0104600-01.exon9 N/A Os01t0104600-01.exon9 transcript:Os01t0104600-02 N/A 9 N/A
|
||||
1 irgsp exon 250617 250781 . -1 . N/A N/A 1 N/A 0 0 Os01t0104600-01.exon8 N/A Os01t0104600-01.exon8 N/A Os01t0104600-01.exon8 transcript:Os01t0104600-02 N/A 8 N/A
|
||||
1 irgsp exon 250860 250940 . -1 . N/A N/A 1 N/A 0 0 Os01t0104600-01.exon7 N/A Os01t0104600-01.exon7 N/A Os01t0104600-01.exon7 transcript:Os01t0104600-02 N/A 7 N/A
|
||||
1 irgsp exon 251026 251082 . -1 . N/A N/A 1 N/A 0 0 Os01t0104600-01.exon6 N/A Os01t0104600-01.exon6 N/A Os01t0104600-01.exon6 transcript:Os01t0104600-02 N/A 6 N/A
|
||||
1 irgsp exon 251316 251384 . -1 . N/A N/A 1 N/A 0 0 Os01t0104600-01.exon5 N/A Os01t0104600-01.exon5 N/A Os01t0104600-01.exon5 transcript:Os01t0104600-02 N/A 5 N/A
|
||||
1 irgsp exon 251695 251790 . -1 . N/A N/A 1 N/A 0 0 Os01t0104600-01.exon4 N/A Os01t0104600-01.exon4 N/A Os01t0104600-01.exon4 transcript:Os01t0104600-02 N/A 4 N/A
|
||||
1 irgsp exon 255325 255553 . -1 . N/A N/A 1 N/A 0 2 Os01t0104600-01.exon3 N/A Os01t0104600-01.exon3 N/A Os01t0104600-01.exon3 transcript:Os01t0104600-02 N/A 3 N/A
|
||||
1 irgsp exon 255674 256098 . -1 . N/A N/A 1 N/A 2 0 Os01t0104600-01.exon2 N/A Os01t0104600-01.exon2 N/A Os01t0104600-01.exon2 transcript:Os01t0104600-02 N/A 2 N/A
|
||||
1 irgsp exon 256361 256571 . -1 . N/A N/A 0 N/A 0 -1 Os01t0104600-02.exon1 N/A Os01t0104600-02.exon1 N/A Os01t0104600-02.exon1 transcript:Os01t0104600-02 N/A 1 N/A
|
||||
1 irgsp CDS 248971 249107 . -1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104600-02 N/A N/A transcript:Os01t0104600-02 Os01t0104600-02 N/A N/A
|
||||
1 irgsp CDS 249369 249468 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104600-02 N/A N/A transcript:Os01t0104600-02 Os01t0104600-02 N/A N/A
|
||||
1 irgsp CDS 249861 249956 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104600-02 N/A N/A transcript:Os01t0104600-02 Os01t0104600-02 N/A N/A
|
||||
1 irgsp CDS 250617 250781 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104600-02 N/A N/A transcript:Os01t0104600-02 Os01t0104600-02 N/A N/A
|
||||
1 irgsp CDS 250860 250940 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104600-02 N/A N/A transcript:Os01t0104600-02 Os01t0104600-02 N/A N/A
|
||||
1 irgsp CDS 251026 251082 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104600-02 N/A N/A transcript:Os01t0104600-02 Os01t0104600-02 N/A N/A
|
||||
1 irgsp CDS 251316 251384 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104600-02 N/A N/A transcript:Os01t0104600-02 Os01t0104600-02 N/A N/A
|
||||
1 irgsp CDS 251695 251790 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104600-02 N/A N/A transcript:Os01t0104600-02 Os01t0104600-02 N/A N/A
|
||||
1 irgsp CDS 255325 255553 . -1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104600-02 N/A N/A transcript:Os01t0104600-02 Os01t0104600-02 N/A N/A
|
||||
1 irgsp CDS 255674 256098 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104600-02 N/A N/A transcript:Os01t0104600-02 Os01t0104600-02 N/A N/A
|
||||
1 irgsp CDS 256361 256441 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104600-02 N/A N/A transcript:Os01t0104600-02 Os01t0104600-02 N/A N/A
|
||||
1 irgsp five_prime_UTR 256442 256571 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-58 N/A N/A transcript:Os01t0104600-02 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 248828 248970 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-55 N/A N/A transcript:Os01t0104600-02 N/A N/A N/A
|
||||
1 irgsp mRNA 248828 256872 . -1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0104600-01 N/A N/A gene:Os01g0104600 N/A N/A Os01t0104600-01
|
||||
1 irgsp exon 248828 249107 . -1 . N/A N/A 1 N/A -1 1 Os01t0104600-01.exon11 N/A agat-exon-17 N/A Os01t0104600-01.exon11 transcript:Os01t0104600-01 N/A 11 N/A
|
||||
1 irgsp exon 249369 249468 . -1 . N/A N/A 1 N/A 1 0 Os01t0104600-01.exon10 N/A agat-exon-18 N/A Os01t0104600-01.exon10 transcript:Os01t0104600-01 N/A 10 N/A
|
||||
1 irgsp exon 249861 249956 . -1 . N/A N/A 1 N/A 0 0 Os01t0104600-01.exon9 N/A agat-exon-19 N/A Os01t0104600-01.exon9 transcript:Os01t0104600-01 N/A 9 N/A
|
||||
1 irgsp exon 250617 250781 . -1 . N/A N/A 1 N/A 0 0 Os01t0104600-01.exon8 N/A agat-exon-20 N/A Os01t0104600-01.exon8 transcript:Os01t0104600-01 N/A 8 N/A
|
||||
1 irgsp exon 250860 250940 . -1 . N/A N/A 1 N/A 0 0 Os01t0104600-01.exon7 N/A agat-exon-21 N/A Os01t0104600-01.exon7 transcript:Os01t0104600-01 N/A 7 N/A
|
||||
1 irgsp exon 251026 251082 . -1 . N/A N/A 1 N/A 0 0 Os01t0104600-01.exon6 N/A agat-exon-22 N/A Os01t0104600-01.exon6 transcript:Os01t0104600-01 N/A 6 N/A
|
||||
1 irgsp exon 251316 251384 . -1 . N/A N/A 1 N/A 0 0 Os01t0104600-01.exon5 N/A agat-exon-23 N/A Os01t0104600-01.exon5 transcript:Os01t0104600-01 N/A 5 N/A
|
||||
1 irgsp exon 251695 251790 . -1 . N/A N/A 1 N/A 0 0 Os01t0104600-01.exon4 N/A agat-exon-24 N/A Os01t0104600-01.exon4 transcript:Os01t0104600-01 N/A 4 N/A
|
||||
1 irgsp exon 255325 255553 . -1 . N/A N/A 1 N/A 0 2 Os01t0104600-01.exon3 N/A agat-exon-25 N/A Os01t0104600-01.exon3 transcript:Os01t0104600-01 N/A 3 N/A
|
||||
1 irgsp exon 255674 256098 . -1 . N/A N/A 1 N/A 2 0 Os01t0104600-01.exon2 N/A agat-exon-26 N/A Os01t0104600-01.exon2 transcript:Os01t0104600-01 N/A 2 N/A
|
||||
1 irgsp exon 256361 256872 . -1 . N/A N/A 0 N/A 0 -1 Os01t0104600-01.exon1 N/A Os01t0104600-01.exon1 N/A Os01t0104600-01.exon1 transcript:Os01t0104600-01 N/A 1 N/A
|
||||
1 irgsp CDS 248971 249107 . -1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104600-01 N/A N/A transcript:Os01t0104600-01 Os01t0104600-01 N/A N/A
|
||||
1 irgsp CDS 249369 249468 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104600-01 N/A N/A transcript:Os01t0104600-01 Os01t0104600-01 N/A N/A
|
||||
1 irgsp CDS 249861 249956 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104600-01 N/A N/A transcript:Os01t0104600-01 Os01t0104600-01 N/A N/A
|
||||
1 irgsp CDS 250617 250781 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104600-01 N/A N/A transcript:Os01t0104600-01 Os01t0104600-01 N/A N/A
|
||||
1 irgsp CDS 250860 250940 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104600-01 N/A N/A transcript:Os01t0104600-01 Os01t0104600-01 N/A N/A
|
||||
1 irgsp CDS 251026 251082 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104600-01 N/A N/A transcript:Os01t0104600-01 Os01t0104600-01 N/A N/A
|
||||
1 irgsp CDS 251316 251384 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104600-01 N/A N/A transcript:Os01t0104600-01 Os01t0104600-01 N/A N/A
|
||||
1 irgsp CDS 251695 251790 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104600-01 N/A N/A transcript:Os01t0104600-01 Os01t0104600-01 N/A N/A
|
||||
1 irgsp CDS 255325 255553 . -1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104600-01 N/A N/A transcript:Os01t0104600-01 Os01t0104600-01 N/A N/A
|
||||
1 irgsp CDS 255674 256098 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104600-01 N/A N/A transcript:Os01t0104600-01 Os01t0104600-01 N/A N/A
|
||||
1 irgsp CDS 256361 256441 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104600-01 N/A N/A transcript:Os01t0104600-01 Os01t0104600-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 256442 256872 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-59 N/A N/A transcript:Os01t0104600-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 248828 248970 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-56 N/A N/A transcript:Os01t0104600-01 N/A N/A N/A
|
||||
1 irgsp gene 261530 268145 . 1 . N/A protein_coding N/A Sas10/Utp3 family protein. (Os01t0104800-01);Hypothetical conserved gene. (Os01t0104800-02) N/A N/A N/A Os01g0104800 gene:Os01g0104800 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 261530 268145 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0104800-01 N/A N/A gene:Os01g0104800 N/A N/A Os01t0104800-01
|
||||
1 irgsp exon 261530 261661 . 1 . N/A N/A 0 N/A 1 -1 Os01t0104800-01.exon1 N/A Os01t0104800-01.exon1 N/A Os01t0104800-01.exon1 transcript:Os01t0104800-01 N/A 1 N/A
|
||||
1 irgsp exon 261767 261805 . 1 . N/A N/A 0 N/A 1 1 Os01t0104800-01.exon2 N/A Os01t0104800-01.exon2 N/A Os01t0104800-01.exon2 transcript:Os01t0104800-01 N/A 2 N/A
|
||||
1 irgsp exon 261895 261941 . 1 . N/A N/A 0 N/A 0 1 Os01t0104800-01.exon3 N/A Os01t0104800-01.exon3 N/A Os01t0104800-01.exon3 transcript:Os01t0104800-01 N/A 3 N/A
|
||||
1 irgsp exon 262582 262681 . 1 . N/A N/A 0 N/A 1 0 Os01t0104800-01.exon4 N/A Os01t0104800-01.exon4 N/A Os01t0104800-01.exon4 transcript:Os01t0104800-01 N/A 4 N/A
|
||||
1 irgsp exon 262925 263181 . 1 . N/A N/A 0 N/A 0 1 Os01t0104800-01.exon5 N/A Os01t0104800-01.exon5 N/A Os01t0104800-01.exon5 transcript:Os01t0104800-01 N/A 5 N/A
|
||||
1 irgsp exon 263525 263640 . 1 . N/A N/A 0 N/A 2 0 Os01t0104800-01.exon6 N/A Os01t0104800-01.exon6 N/A Os01t0104800-01.exon6 transcript:Os01t0104800-01 N/A 6 N/A
|
||||
1 irgsp exon 264014 264098 . 1 . N/A N/A 1 N/A 0 2 Os01t0104800-01.exon7 N/A Os01t0104800-01.exon7 N/A Os01t0104800-01.exon7 transcript:Os01t0104800-01 N/A 7 N/A
|
||||
1 irgsp exon 265236 265415 . 1 . N/A N/A 1 N/A 0 0 Os01t0104800-01.exon8 N/A Os01t0104800-01.exon8 N/A Os01t0104800-01.exon8 transcript:Os01t0104800-01 N/A 8 N/A
|
||||
1 irgsp exon 265506 265649 . 1 . N/A N/A 1 N/A 0 0 Os01t0104800-01.exon9 N/A Os01t0104800-01.exon9 N/A Os01t0104800-01.exon9 transcript:Os01t0104800-01 N/A 9 N/A
|
||||
1 irgsp exon 265740 265817 . 1 . N/A N/A 1 N/A 0 0 Os01t0104800-01.exon10 N/A Os01t0104800-01.exon10 N/A Os01t0104800-01.exon10 transcript:Os01t0104800-01 N/A 10 N/A
|
||||
1 irgsp exon 265909 266045 . 1 . N/A N/A 1 N/A 2 0 Os01t0104800-01.exon11 N/A Os01t0104800-01.exon11 N/A Os01t0104800-01.exon11 transcript:Os01t0104800-01 N/A 11 N/A
|
||||
1 irgsp exon 266138 266246 . 1 . N/A N/A 1 N/A 0 2 Os01t0104800-01.exon12 N/A Os01t0104800-01.exon12 N/A Os01t0104800-01.exon12 transcript:Os01t0104800-01 N/A 12 N/A
|
||||
1 irgsp exon 267237 267514 . 1 . N/A N/A 1 N/A 2 0 Os01t0104800-01.exon13 N/A Os01t0104800-01.exon13 N/A Os01t0104800-01.exon13 transcript:Os01t0104800-01 N/A 13 N/A
|
||||
1 irgsp exon 267591 267657 . 1 . N/A N/A 1 N/A 0 2 Os01t0104800-01.exon14 N/A Os01t0104800-01.exon14 N/A Os01t0104800-01.exon14 transcript:Os01t0104800-01 N/A 14 N/A
|
||||
1 irgsp exon 267734 267802 . 1 . N/A N/A 1 N/A 0 0 Os01t0104800-01.exon15 N/A Os01t0104800-01.exon15 N/A Os01t0104800-01.exon15 transcript:Os01t0104800-01 N/A 15 N/A
|
||||
1 irgsp exon 267880 268145 . 1 . N/A N/A 0 N/A -1 0 Os01t0104800-01.exon16 N/A Os01t0104800-01.exon16 N/A Os01t0104800-01.exon16 transcript:Os01t0104800-01 N/A 16 N/A
|
||||
1 irgsp CDS 261562 261661 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104800-01 N/A N/A transcript:Os01t0104800-01 Os01t0104800-01 N/A N/A
|
||||
1 irgsp CDS 261767 261805 . 1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104800-01 N/A N/A transcript:Os01t0104800-01 Os01t0104800-01 N/A N/A
|
||||
1 irgsp CDS 261895 261941 . 1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104800-01 N/A N/A transcript:Os01t0104800-01 Os01t0104800-01 N/A N/A
|
||||
1 irgsp CDS 262582 262681 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104800-01 N/A N/A transcript:Os01t0104800-01 Os01t0104800-01 N/A N/A
|
||||
1 irgsp CDS 262925 263181 . 1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104800-01 N/A N/A transcript:Os01t0104800-01 Os01t0104800-01 N/A N/A
|
||||
1 irgsp CDS 263525 263640 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104800-01 N/A N/A transcript:Os01t0104800-01 Os01t0104800-01 N/A N/A
|
||||
1 irgsp CDS 264014 264098 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104800-01 N/A N/A transcript:Os01t0104800-01 Os01t0104800-01 N/A N/A
|
||||
1 irgsp CDS 265236 265415 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104800-01 N/A N/A transcript:Os01t0104800-01 Os01t0104800-01 N/A N/A
|
||||
1 irgsp CDS 265506 265649 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104800-01 N/A N/A transcript:Os01t0104800-01 Os01t0104800-01 N/A N/A
|
||||
1 irgsp CDS 265740 265817 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104800-01 N/A N/A transcript:Os01t0104800-01 Os01t0104800-01 N/A N/A
|
||||
1 irgsp CDS 265909 266045 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104800-01 N/A N/A transcript:Os01t0104800-01 Os01t0104800-01 N/A N/A
|
||||
1 irgsp CDS 266138 266246 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104800-01 N/A N/A transcript:Os01t0104800-01 Os01t0104800-01 N/A N/A
|
||||
1 irgsp CDS 267237 267514 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104800-01 N/A N/A transcript:Os01t0104800-01 Os01t0104800-01 N/A N/A
|
||||
1 irgsp CDS 267591 267657 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104800-01 N/A N/A transcript:Os01t0104800-01 Os01t0104800-01 N/A N/A
|
||||
1 irgsp CDS 267734 267802 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104800-01 N/A N/A transcript:Os01t0104800-01 Os01t0104800-01 N/A N/A
|
||||
1 irgsp CDS 267880 268011 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104800-01 N/A N/A transcript:Os01t0104800-01 Os01t0104800-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 261530 261561 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-60 N/A N/A transcript:Os01t0104800-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 268012 268145 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-57 N/A N/A transcript:Os01t0104800-01 N/A N/A N/A
|
||||
1 irgsp mRNA 263523 268120 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0104800-02 N/A N/A gene:Os01g0104800 N/A N/A Os01t0104800-02
|
||||
1 irgsp exon 263523 263640 . 1 . N/A N/A 0 N/A 2 -1 Os01t0104800-02.exon1 N/A Os01t0104800-02.exon1 N/A Os01t0104800-02.exon1 transcript:Os01t0104800-02 N/A 1 N/A
|
||||
1 irgsp exon 264014 264098 . 1 . N/A N/A 1 N/A 0 2 Os01t0104800-01.exon7 N/A agat-exon-27 N/A Os01t0104800-01.exon7 transcript:Os01t0104800-02 N/A 2 N/A
|
||||
1 irgsp exon 265236 265415 . 1 . N/A N/A 1 N/A 0 0 Os01t0104800-01.exon8 N/A agat-exon-28 N/A Os01t0104800-01.exon8 transcript:Os01t0104800-02 N/A 3 N/A
|
||||
1 irgsp exon 265506 265649 . 1 . N/A N/A 1 N/A 0 0 Os01t0104800-01.exon9 N/A agat-exon-29 N/A Os01t0104800-01.exon9 transcript:Os01t0104800-02 N/A 4 N/A
|
||||
1 irgsp exon 265740 265817 . 1 . N/A N/A 1 N/A 0 0 Os01t0104800-01.exon10 N/A agat-exon-30 N/A Os01t0104800-01.exon10 transcript:Os01t0104800-02 N/A 5 N/A
|
||||
1 irgsp exon 265909 266045 . 1 . N/A N/A 1 N/A 2 0 Os01t0104800-01.exon11 N/A agat-exon-31 N/A Os01t0104800-01.exon11 transcript:Os01t0104800-02 N/A 6 N/A
|
||||
1 irgsp exon 266138 266246 . 1 . N/A N/A 1 N/A 0 2 Os01t0104800-01.exon12 N/A agat-exon-32 N/A Os01t0104800-01.exon12 transcript:Os01t0104800-02 N/A 7 N/A
|
||||
1 irgsp exon 267237 267514 . 1 . N/A N/A 1 N/A 2 0 Os01t0104800-01.exon13 N/A agat-exon-33 N/A Os01t0104800-01.exon13 transcript:Os01t0104800-02 N/A 8 N/A
|
||||
1 irgsp exon 267591 267657 . 1 . N/A N/A 1 N/A 0 2 Os01t0104800-01.exon14 N/A agat-exon-34 N/A Os01t0104800-01.exon14 transcript:Os01t0104800-02 N/A 9 N/A
|
||||
1 irgsp exon 267734 267802 . 1 . N/A N/A 1 N/A 0 0 Os01t0104800-01.exon15 N/A agat-exon-35 N/A Os01t0104800-01.exon15 transcript:Os01t0104800-02 N/A 10 N/A
|
||||
1 irgsp exon 267880 268120 . 1 . N/A N/A 0 N/A -1 0 Os01t0104800-02.exon11 N/A Os01t0104800-02.exon11 N/A Os01t0104800-02.exon11 transcript:Os01t0104800-02 N/A 11 N/A
|
||||
1 irgsp CDS 263525 263640 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104800-02 N/A N/A transcript:Os01t0104800-02 Os01t0104800-02 N/A N/A
|
||||
1 irgsp CDS 264014 264098 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104800-02 N/A N/A transcript:Os01t0104800-02 Os01t0104800-02 N/A N/A
|
||||
1 irgsp CDS 265236 265415 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104800-02 N/A N/A transcript:Os01t0104800-02 Os01t0104800-02 N/A N/A
|
||||
1 irgsp CDS 265506 265649 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104800-02 N/A N/A transcript:Os01t0104800-02 Os01t0104800-02 N/A N/A
|
||||
1 irgsp CDS 265740 265817 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104800-02 N/A N/A transcript:Os01t0104800-02 Os01t0104800-02 N/A N/A
|
||||
1 irgsp CDS 265909 266045 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104800-02 N/A N/A transcript:Os01t0104800-02 Os01t0104800-02 N/A N/A
|
||||
1 irgsp CDS 266138 266246 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104800-02 N/A N/A transcript:Os01t0104800-02 Os01t0104800-02 N/A N/A
|
||||
1 irgsp CDS 267237 267514 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104800-02 N/A N/A transcript:Os01t0104800-02 Os01t0104800-02 N/A N/A
|
||||
1 irgsp CDS 267591 267657 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104800-02 N/A N/A transcript:Os01t0104800-02 Os01t0104800-02 N/A N/A
|
||||
1 irgsp CDS 267734 267802 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104800-02 N/A N/A transcript:Os01t0104800-02 Os01t0104800-02 N/A N/A
|
||||
1 irgsp CDS 267880 268011 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104800-02 N/A N/A transcript:Os01t0104800-02 Os01t0104800-02 N/A N/A
|
||||
1 irgsp five_prime_UTR 263523 263524 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-61 N/A N/A transcript:Os01t0104800-02 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 268012 268120 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-58 N/A N/A transcript:Os01t0104800-02 N/A N/A N/A
|
||||
1 irgsp gene 270179 275084 . -1 . N/A protein_coding N/A Transferase family protein. (Os01t0104900-01);Hypothetical conserved gene. (Os01t0104900-02) N/A N/A N/A Os01g0104900 gene:Os01g0104900 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 270179 275084 . -1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0104900-01 N/A N/A gene:Os01g0104900 N/A N/A Os01t0104900-01
|
||||
1 irgsp exon 270179 271333 . -1 . N/A N/A 0 N/A -1 0 Os01t0104900-01.exon2 N/A Os01t0104900-01.exon2 N/A Os01t0104900-01.exon2 transcript:Os01t0104900-01 N/A 2 N/A
|
||||
1 irgsp exon 274529 275084 . -1 . N/A N/A 0 N/A 0 -1 Os01t0104900-01.exon1 N/A Os01t0104900-01.exon1 N/A Os01t0104900-01.exon1 transcript:Os01t0104900-01 N/A 1 N/A
|
||||
1 irgsp CDS 270356 271333 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104900-01 N/A N/A transcript:Os01t0104900-01 Os01t0104900-01 N/A N/A
|
||||
1 irgsp CDS 274529 274957 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104900-01 N/A N/A transcript:Os01t0104900-01 Os01t0104900-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 274958 275084 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-62 N/A N/A transcript:Os01t0104900-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 270179 270355 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-59 N/A N/A transcript:Os01t0104900-01 N/A N/A N/A
|
||||
1 irgsp mRNA 270250 271518 . -1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0104900-02 N/A N/A gene:Os01g0104900 N/A N/A Os01t0104900-02
|
||||
1 irgsp exon 270250 271333 . -1 . N/A N/A 0 N/A -1 -1 Os01t0104900-02.exon2 N/A Os01t0104900-02.exon2 N/A Os01t0104900-02.exon2 transcript:Os01t0104900-02 N/A 2 N/A
|
||||
1 irgsp exon 271457 271518 . -1 . N/A N/A 0 N/A -1 -1 Os01t0104900-02.exon1 N/A Os01t0104900-02.exon1 N/A Os01t0104900-02.exon1 transcript:Os01t0104900-02 N/A 1 N/A
|
||||
1 irgsp CDS 270356 271309 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0104900-02 N/A N/A transcript:Os01t0104900-02 Os01t0104900-02 N/A N/A
|
||||
1 irgsp five_prime_UTR 271310 271333 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-63 N/A N/A transcript:Os01t0104900-02 N/A N/A N/A
|
||||
1 irgsp five_prime_UTR 271457 271518 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-64 N/A N/A transcript:Os01t0104900-02 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 270250 270355 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-60 N/A N/A transcript:Os01t0104900-02 N/A N/A N/A
|
||||
1 irgsp gene 284762 291892 . -1 . N/A protein_coding N/A Similar to HAT family dimerisation domain containing protein, expressed. (Os01t0105300-01) N/A N/A N/A Os01g0105300 gene:Os01g0105300 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 284762 291892 . -1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0105300-01 N/A N/A gene:Os01g0105300 N/A N/A Os01t0105300-01
|
||||
1 irgsp exon 284762 287047 . -1 . N/A N/A 1 N/A -1 -1 Os01t0105300-01.exon5 N/A Os01t0105300-01.exon5 N/A Os01t0105300-01.exon5 transcript:Os01t0105300-01 N/A 5 N/A
|
||||
1 irgsp exon 291398 291436 . -1 . N/A N/A 1 N/A -1 -1 Os01t0105300-01.exon4 N/A Os01t0105300-01.exon4 N/A Os01t0105300-01.exon4 transcript:Os01t0105300-01 N/A 4 N/A
|
||||
1 irgsp exon 291520 291534 . -1 . N/A N/A 1 N/A -1 -1 Os01t0105300-01.exon3 N/A Os01t0105300-01.exon3 N/A Os01t0105300-01.exon3 transcript:Os01t0105300-01 N/A 3 N/A
|
||||
1 irgsp exon 291678 291738 . -1 . N/A N/A 1 N/A -1 -1 Os01t0105300-01.exon2 N/A Os01t0105300-01.exon2 N/A Os01t0105300-01.exon2 transcript:Os01t0105300-01 N/A 2 N/A
|
||||
1 irgsp exon 291838 291892 . -1 . N/A N/A 1 N/A -1 -1 Os01t0105300-01.exon1 N/A Os01t0105300-01.exon1 N/A Os01t0105300-01.exon1 transcript:Os01t0105300-01 N/A 1 N/A
|
||||
1 irgsp CDS 284931 285020 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0105300-01 N/A N/A transcript:Os01t0105300-01 Os01t0105300-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 285021 287047 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-65 N/A N/A transcript:Os01t0105300-01 N/A N/A N/A
|
||||
1 irgsp five_prime_UTR 291398 291436 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-66 N/A N/A transcript:Os01t0105300-01 N/A N/A N/A
|
||||
1 irgsp five_prime_UTR 291520 291534 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-67 N/A N/A transcript:Os01t0105300-01 N/A N/A N/A
|
||||
1 irgsp five_prime_UTR 291678 291738 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-68 N/A N/A transcript:Os01t0105300-01 N/A N/A N/A
|
||||
1 irgsp five_prime_UTR 291838 291892 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-69 N/A N/A transcript:Os01t0105300-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 284762 284930 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-61 N/A N/A transcript:Os01t0105300-01 N/A N/A N/A
|
||||
1 irgsp gene 288372 292296 . 1 . N/A protein_coding N/A Similar to Kinesin heavy chain. (Os01t0105400-01) N/A N/A N/A Os01g0105400 gene:Os01g0105400 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 288372 292296 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0105400-01 N/A N/A gene:Os01g0105400 N/A N/A Os01t0105400-01
|
||||
1 irgsp exon 288372 288846 . 1 . N/A N/A 1 N/A -1 -1 Os01t0105400-01.exon1 N/A Os01t0105400-01.exon1 N/A Os01t0105400-01.exon1 transcript:Os01t0105400-01 N/A 1 N/A
|
||||
1 irgsp exon 288950 289116 . 1 . N/A N/A 1 N/A -1 -1 Os01t0105400-01.exon2 N/A Os01t0105400-01.exon2 N/A Os01t0105400-01.exon2 transcript:Os01t0105400-01 N/A 2 N/A
|
||||
1 irgsp exon 289202 289572 . 1 . N/A N/A 1 N/A -1 -1 Os01t0105400-01.exon3 N/A Os01t0105400-01.exon3 N/A Os01t0105400-01.exon3 transcript:Os01t0105400-01 N/A 3 N/A
|
||||
1 irgsp exon 289661 289830 . 1 . N/A N/A 1 N/A -1 -1 Os01t0105400-01.exon4 N/A Os01t0105400-01.exon4 N/A Os01t0105400-01.exon4 transcript:Os01t0105400-01 N/A 4 N/A
|
||||
1 irgsp exon 290395 290512 . 1 . N/A N/A 1 N/A 2 -1 Os01t0105400-01.exon5 N/A Os01t0105400-01.exon5 N/A Os01t0105400-01.exon5 transcript:Os01t0105400-01 N/A 5 N/A
|
||||
1 irgsp exon 291372 291574 . 1 . N/A N/A 1 N/A -1 2 Os01t0105400-01.exon6 N/A Os01t0105400-01.exon6 N/A Os01t0105400-01.exon6 transcript:Os01t0105400-01 N/A 6 N/A
|
||||
1 irgsp exon 291648 291779 . 1 . N/A N/A 1 N/A -1 -1 Os01t0105400-01.exon7 N/A Os01t0105400-01.exon7 N/A Os01t0105400-01.exon7 transcript:Os01t0105400-01 N/A 7 N/A
|
||||
1 irgsp exon 291859 291948 . 1 . N/A N/A 1 N/A -1 -1 Os01t0105400-01.exon8 N/A Os01t0105400-01.exon8 N/A Os01t0105400-01.exon8 transcript:Os01t0105400-01 N/A 8 N/A
|
||||
1 irgsp exon 292073 292296 . 1 . N/A N/A 1 N/A -1 -1 Os01t0105400-01.exon9 N/A Os01t0105400-01.exon9 N/A Os01t0105400-01.exon9 transcript:Os01t0105400-01 N/A 9 N/A
|
||||
1 irgsp CDS 290433 290512 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0105400-01 N/A N/A transcript:Os01t0105400-01 Os01t0105400-01 N/A N/A
|
||||
1 irgsp CDS 291372 291558 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0105400-01 N/A N/A transcript:Os01t0105400-01 Os01t0105400-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 288372 288846 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-70 N/A N/A transcript:Os01t0105400-01 N/A N/A N/A
|
||||
1 irgsp five_prime_UTR 288950 289116 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-71 N/A N/A transcript:Os01t0105400-01 N/A N/A N/A
|
||||
1 irgsp five_prime_UTR 289202 289572 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-72 N/A N/A transcript:Os01t0105400-01 N/A N/A N/A
|
||||
1 irgsp five_prime_UTR 289661 289830 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-73 N/A N/A transcript:Os01t0105400-01 N/A N/A N/A
|
||||
1 irgsp five_prime_UTR 290395 290432 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-74 N/A N/A transcript:Os01t0105400-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 291559 291574 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-62 N/A N/A transcript:Os01t0105400-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 291648 291779 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-63 N/A N/A transcript:Os01t0105400-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 291859 291948 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-64 N/A N/A transcript:Os01t0105400-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 292073 292296 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-65 N/A N/A transcript:Os01t0105400-01 N/A N/A N/A
|
||||
1 irgsp gene 303233 306736 . 1 . N/A protein_coding N/A Basic helix-loop-helix dimerisation region bHLH domain containing protein. (Os01t0105700-01) N/A N/A N/A Os01g0105700 gene:Os01g0105700 irgspv1.0-20170804-genes basic helix-loop-helix protein 071 N/A N/A N/A N/A
|
||||
1 irgsp mRNA 303233 306736 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0105700-01 N/A N/A gene:Os01g0105700 N/A N/A Os01t0105700-01
|
||||
1 irgsp exon 303233 303471 . 1 . N/A N/A 1 N/A 2 -1 Os01t0105700-01.exon1 N/A Os01t0105700-01.exon1 N/A Os01t0105700-01.exon1 transcript:Os01t0105700-01 N/A 1 N/A
|
||||
1 irgsp exon 303981 304509 . 1 . N/A N/A 1 N/A 0 2 Os01t0105700-01.exon2 N/A Os01t0105700-01.exon2 N/A Os01t0105700-01.exon2 transcript:Os01t0105700-01 N/A 2 N/A
|
||||
1 irgsp exon 305572 305718 . 1 . N/A N/A 1 N/A 0 0 Os01t0105700-01.exon3 N/A Os01t0105700-01.exon3 N/A Os01t0105700-01.exon3 transcript:Os01t0105700-01 N/A 3 N/A
|
||||
1 irgsp exon 305834 305899 . 1 . N/A N/A 1 N/A 0 0 Os01t0105700-01.exon4 N/A Os01t0105700-01.exon4 N/A Os01t0105700-01.exon4 transcript:Os01t0105700-01 N/A 4 N/A
|
||||
1 irgsp exon 305993 306058 . 1 . N/A N/A 1 N/A 0 0 Os01t0105700-01.exon5 N/A Os01t0105700-01.exon5 N/A Os01t0105700-01.exon5 transcript:Os01t0105700-01 N/A 5 N/A
|
||||
1 irgsp exon 306171 306245 . 1 . N/A N/A 1 N/A 0 0 Os01t0105700-01.exon6 N/A Os01t0105700-01.exon6 N/A Os01t0105700-01.exon6 transcript:Os01t0105700-01 N/A 6 N/A
|
||||
1 irgsp exon 306353 306736 . 1 . N/A N/A 1 N/A -1 0 Os01t0105700-01.exon7 N/A Os01t0105700-01.exon7 N/A Os01t0105700-01.exon7 transcript:Os01t0105700-01 N/A 7 N/A
|
||||
1 irgsp CDS 303329 303471 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0105700-01 N/A N/A transcript:Os01t0105700-01 Os01t0105700-01 N/A N/A
|
||||
1 irgsp CDS 303981 304509 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0105700-01 N/A N/A transcript:Os01t0105700-01 Os01t0105700-01 N/A N/A
|
||||
1 irgsp CDS 305572 305718 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0105700-01 N/A N/A transcript:Os01t0105700-01 Os01t0105700-01 N/A N/A
|
||||
1 irgsp CDS 305834 305899 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0105700-01 N/A N/A transcript:Os01t0105700-01 Os01t0105700-01 N/A N/A
|
||||
1 irgsp CDS 305993 306058 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0105700-01 N/A N/A transcript:Os01t0105700-01 Os01t0105700-01 N/A N/A
|
||||
1 irgsp CDS 306171 306245 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0105700-01 N/A N/A transcript:Os01t0105700-01 Os01t0105700-01 N/A N/A
|
||||
1 irgsp CDS 306353 306493 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0105700-01 N/A N/A transcript:Os01t0105700-01 Os01t0105700-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 303233 303328 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-75 N/A N/A transcript:Os01t0105700-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 306494 306736 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-66 N/A N/A transcript:Os01t0105700-01 N/A N/A N/A
|
||||
1 irgsp gene 306871 308842 . -1 . N/A protein_coding N/A Similar to Iron sulfur assembly protein 1. (Os01t0105800-01) N/A N/A N/A Os01g0105800 gene:Os01g0105800 irgspv1.0-20170804-genes IRON-SULFUR CLUSTER PROTEIN 9 N/A N/A N/A N/A
|
||||
1 irgsp mRNA 306871 308842 . -1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0105800-01 N/A N/A gene:Os01g0105800 N/A N/A Os01t0105800-01
|
||||
1 irgsp exon 306871 307217 . -1 . N/A N/A 1 N/A -1 2 Os01t0105800-01.exon4 N/A Os01t0105800-01.exon4 N/A Os01t0105800-01.exon4 transcript:Os01t0105800-01 N/A 4 N/A
|
||||
1 irgsp exon 307296 307413 . -1 . N/A N/A 1 N/A 2 1 Os01t0105800-01.exon3 N/A Os01t0105800-01.exon3 N/A Os01t0105800-01.exon3 transcript:Os01t0105800-01 N/A 3 N/A
|
||||
1 irgsp exon 308397 308626 . -1 . N/A N/A 1 N/A 1 -1 Os01t0105800-01.exon2 N/A Os01t0105800-01.exon2 N/A Os01t0105800-01.exon2 transcript:Os01t0105800-01 N/A 2 N/A
|
||||
1 irgsp exon 308703 308842 . -1 . N/A N/A 1 N/A -1 -1 Os01t0105800-01.exon1 N/A Os01t0105800-01.exon1 N/A Os01t0105800-01.exon1 transcript:Os01t0105800-01 N/A 1 N/A
|
||||
1 irgsp CDS 307124 307217 . -1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0105800-01 N/A N/A transcript:Os01t0105800-01 Os01t0105800-01 N/A N/A
|
||||
1 irgsp CDS 307296 307413 . -1 2 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0105800-01 N/A N/A transcript:Os01t0105800-01 Os01t0105800-01 N/A N/A
|
||||
1 irgsp CDS 308397 308601 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0105800-01 N/A N/A transcript:Os01t0105800-01 Os01t0105800-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 308602 308626 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-76 N/A N/A transcript:Os01t0105800-01 N/A N/A N/A
|
||||
1 irgsp five_prime_UTR 308703 308842 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-77 N/A N/A transcript:Os01t0105800-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 306871 307123 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-67 N/A N/A transcript:Os01t0105800-01 N/A N/A N/A
|
||||
1 irgsp gene 309520 313170 . -1 . N/A protein_coding N/A Carbohydrate/purine kinase domain containing protein. (Os01t0105900-01) N/A N/A N/A Os01g0105900 gene:Os01g0105900 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 309520 313170 . -1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0105900-01 N/A N/A gene:Os01g0105900 N/A N/A Os01t0105900-01
|
||||
1 irgsp exon 309520 310070 . -1 . N/A N/A 1 N/A -1 0 Os01t0105900-01.exon8 N/A Os01t0105900-01.exon8 N/A Os01t0105900-01.exon8 transcript:Os01t0105900-01 N/A 8 N/A
|
||||
1 irgsp exon 310256 310367 . -1 . N/A N/A 1 N/A 0 2 Os01t0105900-01.exon7 N/A Os01t0105900-01.exon7 N/A Os01t0105900-01.exon7 transcript:Os01t0105900-01 N/A 7 N/A
|
||||
1 irgsp exon 310455 310552 . -1 . N/A N/A 1 N/A 2 0 Os01t0105900-01.exon6 N/A Os01t0105900-01.exon6 N/A Os01t0105900-01.exon6 transcript:Os01t0105900-01 N/A 6 N/A
|
||||
1 irgsp exon 310632 310739 . -1 . N/A N/A 1 N/A 0 0 Os01t0105900-01.exon5 N/A Os01t0105900-01.exon5 N/A Os01t0105900-01.exon5 transcript:Os01t0105900-01 N/A 5 N/A
|
||||
1 irgsp exon 310880 310918 . -1 . N/A N/A 1 N/A 0 0 Os01t0105900-01.exon4 N/A Os01t0105900-01.exon4 N/A Os01t0105900-01.exon4 transcript:Os01t0105900-01 N/A 4 N/A
|
||||
1 irgsp exon 311002 311073 . -1 . N/A N/A 1 N/A 0 0 Os01t0105900-01.exon3 N/A Os01t0105900-01.exon3 N/A Os01t0105900-01.exon3 transcript:Os01t0105900-01 N/A 3 N/A
|
||||
1 irgsp exon 311163 311426 . -1 . N/A N/A 1 N/A 0 0 Os01t0105900-01.exon2 N/A Os01t0105900-01.exon2 N/A Os01t0105900-01.exon2 transcript:Os01t0105900-01 N/A 2 N/A
|
||||
1 irgsp exon 312867 313170 . -1 . N/A N/A 1 N/A 0 -1 Os01t0105900-01.exon1 N/A Os01t0105900-01.exon1 N/A Os01t0105900-01.exon1 transcript:Os01t0105900-01 N/A 1 N/A
|
||||
1 irgsp CDS 309822 310070 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0105900-01 N/A N/A transcript:Os01t0105900-01 Os01t0105900-01 N/A N/A
|
||||
1 irgsp CDS 310256 310367 . -1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0105900-01 N/A N/A transcript:Os01t0105900-01 Os01t0105900-01 N/A N/A
|
||||
1 irgsp CDS 310455 310552 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0105900-01 N/A N/A transcript:Os01t0105900-01 Os01t0105900-01 N/A N/A
|
||||
1 irgsp CDS 310632 310739 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0105900-01 N/A N/A transcript:Os01t0105900-01 Os01t0105900-01 N/A N/A
|
||||
1 irgsp CDS 310880 310918 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0105900-01 N/A N/A transcript:Os01t0105900-01 Os01t0105900-01 N/A N/A
|
||||
1 irgsp CDS 311002 311073 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0105900-01 N/A N/A transcript:Os01t0105900-01 Os01t0105900-01 N/A N/A
|
||||
1 irgsp CDS 311163 311426 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0105900-01 N/A N/A transcript:Os01t0105900-01 Os01t0105900-01 N/A N/A
|
||||
1 irgsp CDS 312867 313064 . -1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0105900-01 N/A N/A transcript:Os01t0105900-01 Os01t0105900-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 313065 313170 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-78 N/A N/A transcript:Os01t0105900-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 309520 309821 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-68 N/A N/A transcript:Os01t0105900-01 N/A N/A N/A
|
||||
1 irgsp gene 319754 322205 . 1 . N/A protein_coding N/A Similar to RER1A protein (AtRER1A). (Os01t0106200-01) N/A N/A N/A Os01g0106200 gene:Os01g0106200 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 319754 322205 . 1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0106200-01 N/A N/A gene:Os01g0106200 N/A N/A Os01t0106200-01
|
||||
1 irgsp exon 319754 320236 . 1 . N/A N/A 1 N/A 2 -1 Os01t0106200-01.exon1 N/A Os01t0106200-01.exon1 N/A Os01t0106200-01.exon1 transcript:Os01t0106200-01 N/A 1 N/A
|
||||
1 irgsp exon 321468 321648 . 1 . N/A N/A 1 N/A 0 2 Os01t0106200-01.exon2 N/A Os01t0106200-01.exon2 N/A Os01t0106200-01.exon2 transcript:Os01t0106200-01 N/A 2 N/A
|
||||
1 irgsp exon 321928 322205 . 1 . N/A N/A 1 N/A -1 0 Os01t0106200-01.exon3 N/A Os01t0106200-01.exon3 N/A Os01t0106200-01.exon3 transcript:Os01t0106200-01 N/A 3 N/A
|
||||
1 irgsp CDS 319875 320236 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0106200-01 N/A N/A transcript:Os01t0106200-01 Os01t0106200-01 N/A N/A
|
||||
1 irgsp CDS 321468 321648 . 1 1 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0106200-01 N/A N/A transcript:Os01t0106200-01 Os01t0106200-01 N/A N/A
|
||||
1 irgsp CDS 321928 321975 . 1 0 N/A N/A N/A N/A N/A N/A N/A N/A CDS:Os01t0106200-01 N/A N/A transcript:Os01t0106200-01 Os01t0106200-01 N/A N/A
|
||||
1 irgsp five_prime_UTR 319754 319874 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-five_prime_utr-79 N/A N/A transcript:Os01t0106200-01 N/A N/A N/A
|
||||
1 irgsp three_prime_UTR 321976 322205 . 1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-69 N/A N/A transcript:Os01t0106200-01 N/A N/A N/A
|
||||
1 irgsp gene 322591 323923 . -1 . N/A protein_coding N/A Similar to Isoflavone reductase homolog IRL (EC 1.3.1.-). (Os01t0106300-01) N/A N/A N/A Os01g0106300 gene:Os01g0106300 irgspv1.0-20170804-genes N/A N/A N/A N/A N/A
|
||||
1 irgsp mRNA 322591 323923 . -1 . N/A protein_coding N/A N/A N/A N/A N/A N/A transcript:Os01t0106300-01 N/A N/A gene:Os01g0106300 N/A N/A Os01t0106300-01
|
||||
1 irgsp exon 322591 323923 . -1 . N/A N/A 1 N/A -1 1 Os01t0106300-01.exon2 N/A Os01t0106300-01.exon2 N/A Os01t0106300-01.exon2 transcript:Os01t0106300-01 N/A 2 N/A
|
||||
1 irgsp three_prime_UTR 322591 322809 . -1 . N/A N/A N/A N/A N/A N/A N/A N/A agat-three_prime_utr-70 N/A N/A transcript:Os01t0106300-01 N/A N/A N/A
|
||||
|
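A quick way to sanity-check a converted table like the one above is to confirm that every row carries the same number of tab-separated fields. The following is only a minimal sketch: it assumes the file is tab-delimited, and the helper name `count_column_layouts` and the two sample rows are illustrative stand-ins, not part of the test data.

```bash
#!/bin/bash
# Hypothetical helper: print how many distinct per-row field counts a TSV has.
# A well-formed table should yield exactly 1.
count_column_layouts() {
  awk -F'\t' '{ print NF }' "$1" | sort -u | wc -l | tr -d ' '
}

# Small stand-in table (assumption: not the real expected output file).
sample="$(mktemp)"
printf '1\tirgsp\tgene\t288372\t292296\n1\tirgsp\tmRNA\t288372\t292296\n' > "$sample"

layouts="$(count_column_layouts "$sample")"
echo "distinct column counts: $layouts"
rm -f "$sample"
```

Pointing the helper at the generated `.tsv` instead of the stand-in file would flag rows whose attribute columns were shifted or dropped.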
10
src/agat/agat_convert_sp_gff2tsv/test_data/script.sh
Executable file
@@ -0,0 +1,10 @@
#!/bin/bash

# clone repo
if [ ! -d /tmp/agat_source ]; then
  git clone --depth 1 --single-branch --branch master https://github.com/NBISweden/AGAT /tmp/agat_source
fi

# copy test data
cp -r /tmp/agat_source/t/scripts_output/out/agat_convert_sp_gff2tsv_1.tsv src/agat/agat_convert_sp_gff2tsv/test_data
cp -r /tmp/agat_source/t/scripts_output/in/1.gff src/agat/agat_convert_sp_gff2tsv/test_data
|
||||
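The `if [ ! -d ... ]` guard in the script above makes the clone step safe to re-run: the repository is fetched only when the target directory is missing. The sketch below isolates that pattern so it can be tried without network access. The names `run_if_missing` and `fake_clone` are hypothetical, and `fake_clone` merely creates the directory as a stand-in for `git clone`.

```bash
#!/bin/bash
# Counter so we can see how often the (fake) clone actually runs.
calls=0

# Stand-in for `git clone`: creates the target directory and counts the call.
fake_clone() {
  mkdir -p "$1"
  calls=$((calls + 1))
}

# Run the given command only when the directory does not exist yet.
run_if_missing() {
  dir="$1"
  shift
  if [ ! -d "$dir" ]; then
    "$@"
  fi
}

workdir="$(mktemp -d)"
target="$workdir/agat_source"

# Second invocation is a no-op because the directory now exists.
run_if_missing "$target" fake_clone "$target"
run_if_missing "$target" fake_clone "$target"

echo "clone step ran $calls time(s)"
rm -rf "$workdir"
```

One consequence of this design is that a stale `/tmp/agat_source` is never refreshed; deleting the directory forces a fresh clone.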
388
src/arriba/config.vsh.yaml
Normal file
@@ -0,0 +1,388 @@
|
||||
name: arriba
|
||||
description: Detect gene fusions from RNA-Seq data
|
||||
keywords: [Gene fusion, RNA-Seq]
|
||||
links:
|
||||
homepage: https://arriba.readthedocs.io/en/latest/
|
||||
documentation: https://arriba.readthedocs.io/en/latest/
|
||||
repository: https://github.com/suhrig/arriba
|
||||
references:
|
||||
doi: 10.1101/gr.257246.119
|
||||
license: MIT
|
||||
requirements:
|
||||
cpus: 1
|
||||
commands: [ arriba ]
|
||||
authors:
|
||||
- __merge__: /src/_authors/robrecht_cannoodt.yaml
|
||||
roles: [ author, maintainer ]
|
||||
argument_groups:
|
||||
- name: Inputs
|
||||
arguments:
|
||||
- name: --bam
|
||||
alternatives: -x
|
||||
type: file
|
||||
description: |
|
||||
File in SAM/BAM/CRAM format with main alignments as generated by STAR
|
||||
(Aligned.out.sam). Arriba extracts candidate reads from this file.
|
||||
required: true
|
||||
example: Aligned.out.bam
|
||||
- name: --genome
|
||||
alternatives: -a
|
||||
type: file
|
||||
description: |
|
||||
FastA file with genome sequence (assembly). The file may be gzip-compressed. An
|
||||
index with the file extension .fai must exist only if CRAM files are processed.
|
||||
required: true
|
||||
example: assembly.fa
|
||||
- name: --gene_annotation
|
||||
alternatives: -g
|
||||
type: file
|
||||
description: |
|
||||
GTF file with gene annotation. The file may be gzip-compressed.
|
||||
required: true
|
||||
example: annotation.gtf
|
||||
- name: --known_fusions
|
||||
alternatives: -k
|
||||
type: file
|
||||
description: |
|
||||
File containing known/recurrent fusions. Some cancer entities are often
|
||||
characterized by fusions between the same pair of genes. In order to boost
|
||||
sensitivity, a list of known fusions can be supplied using this parameter. The list
|
||||
must contain two columns with the names of the fused genes, separated by tabs.
|
||||
required: false
|
||||
example: known_fusions.tsv
|
||||
- name: --blacklist
|
||||
alternatives: -b
|
||||
type: file
|
||||
description: |
|
||||
File containing blacklisted events (recurrent artifacts and transcripts
|
||||
observed in healthy tissue).
|
||||
required: false
|
||||
example: blacklist.tsv
|
||||
- name: --structural_variants
|
||||
alternatives: -d
|
||||
type: file
|
||||
description: |
|
||||
Tab-separated file with coordinates of structural variants found using
|
||||
whole-genome sequencing data. These coordinates serve to increase sensitivity
|
||||
towards weakly expressed fusions and to eliminate fusions with low evidence.
|
||||
required: false
|
||||
example: structural_variants_from_WGS.tsv
|
||||
- name: --tags
|
||||
alternatives: -t
|
||||
type: file
|
||||
description: |
|
||||
Tab-separated file containing fusions to annotate with tags in the 'tags' column.
|
||||
The first two columns specify the genes; the third column specifies the tag. The
|
||||
file may be gzip-compressed.
|
||||
required: false
|
||||
example: tags.tsv
|
||||
- name: --protein_domains
|
||||
alternatives: -p
|
||||
type: file
|
||||
description: |
|
||||
File in GFF3 format containing coordinates of the protein domains of genes. The
|
||||
protein domains retained in a fusion are listed in the column
|
||||
'retained_protein_domains'. The file may be gzip-compressed.
|
||||
required: false
|
||||
example: protein_domains.gff3
|
||||
- name: Outputs
|
||||
arguments:
|
||||
- name: --fusions
|
||||
alternatives: -o
|
||||
type: file
|
||||
direction: output
|
||||
description: |
|
||||
Output file with fusions that have passed all filters.
|
||||
required: true
|
||||
example: fusions.tsv
|
||||
- name: --fusions_discarded
|
||||
alternatives: -O
|
||||
type: file
|
||||
direction: output
|
||||
description: |
|
||||
Output file with fusions that were discarded due to filtering.
|
||||
required: false
|
||||
example: fusions.discarded.tsv
|
||||
- name: Arguments
|
||||
arguments:
|
||||
- name: --max_genomic_breakpoint_distance
|
||||
alternatives: -D
|
||||
type: long
|
||||
description: |
|
||||
When a file with genomic breakpoints obtained via
|
||||
whole-genome sequencing is supplied via the --structural_variants
|
||||
parameter, this parameter determines how far a
|
||||
genomic breakpoint may be away from a
|
||||
transcriptomic breakpoint to consider it as a
|
||||
related event. For events inside genes, the
|
||||
distance is added to the end of the gene; for
|
||||
intergenic events, the distance threshold is
|
||||
applied as is. Default: 100000.
|
||||
required: false
|
||||
- name: --strandedness
|
||||
alternatives: -s
|
||||
type: string
|
||||
description: |
|
||||
Whether a strand-specific protocol was used for library preparation,
|
||||
and if so, the type of strandedness (auto/yes/no/reverse). When
|
||||
unstranded data is processed, the strand can sometimes be inferred from
|
||||
splice-patterns. But in unclear situations, stranded data helps
|
||||
resolve ambiguities. Default: auto
|
||||
choices: ["auto", "yes", "no", "reverse"]
|
||||
required: false
|
||||
- name: --interesting_contigs
|
||||
alternatives: -i
|
||||
type: string
|
||||
description: |
|
||||
List of interesting contigs. Fusions between genes
|
||||
on other contigs are ignored. Contigs can be specified with or without the
|
||||
prefix "chr". Asterisks (*) are treated as wild-cards.
|
||||
Default: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X Y AC_* NC_*
|
||||
required: false
|
||||
multiple: true
|
||||
example: ["1", "2", "AC_*", "NC_*"]
|
||||
- name: --viral_contigs
|
||||
alternatives: -v
|
||||
type: string
|
||||
description: |
|
||||
List of viral contigs. Asterisks (*) are treated as
|
||||
wild-cards.
|
||||
Default: AC_* NC_*
|
||||
required: false
|
||||
multiple: true
|
||||
example: ["AC_*", "NC_*"]
|
||||
- name: --disable_filters
|
||||
alternatives: -f
|
||||
type: string
|
||||
description: |
|
||||
List of filters to disable. By default all filters are
|
||||
enabled.
|
||||
choices: [ homologs, low_entropy, isoforms,
|
||||
top_expressed_viral_contigs, viral_contigs, uninteresting_contigs,
|
||||
non_coding_neighbors, mismatches, duplicates, no_genomic_support,
|
||||
genomic_support, intronic, end_to_end, relative_support,
|
||||
low_coverage_viral_contigs, merge_adjacent, mismappers, multimappers,
|
||||
same_gene, long_gap, internal_tandem_duplication, small_insert_size,
|
||||
read_through, inconsistently_clipped, intragenic_exonic,
|
||||
marginal_read_through, spliced, hairpin, blacklist, min_support,
|
||||
select_best, in_vitro, short_anchor, known_fusions, no_coverage,
|
||||
homopolymer, many_spliced ]
|
||||
required: false
|
||||
multiple: true
|
||||
- name: --max_e_value
|
||||
alternatives: -E
|
||||
type: double
|
||||
description: |
|
||||
Arriba estimates the number of fusions with a given number of supporting
|
||||
reads which one would expect to see by random chance. If the expected number
|
||||
of fusions (e-value) is higher than this threshold, the fusion is
|
||||
discarded by the 'relative_support' filter. Note: Increasing this
|
||||
threshold can dramatically increase the number of false positives and may
|
||||
increase the runtime of resource-intensive steps. Fractional values are
|
||||
possible. Default: 0.300000
|
||||
required: false
|
||||
- name: --min_supporting_reads
|
||||
alternatives: -S
|
||||
type: integer
|
||||
description: |
|
||||
The 'min_support' filter discards all fusions with fewer than
|
||||
this many supporting reads (split reads and discordant mates
|
||||
combined). Default: 2
|
||||
required: false
|
||||
example: 2
|
||||
- name: --max_mismappers
|
||||
alternatives: -m
|
||||
type: double
|
||||
description: |
|
||||
When more than this fraction of supporting reads turns out to be
|
||||
mismappers, the 'mismappers' filter discards the fusion. Default:
|
||||
0.800000
|
||||
required: false
|
||||
example: 0.8
|
||||
- name: --max_homolog_identity
|
||||
alternatives: -L
|
||||
type: double
|
||||
description: |
|
||||
Genes with more than the given fraction of sequence identity are
|
||||
considered homologs and removed by the 'homologs' filter.
|
||||
Default: 0.300000
|
||||
required: false
|
||||
example: 0.3
|
||||
- name: --homopolymer_length
|
||||
alternatives: -H
|
||||
type: integer
|
||||
description: |
|
||||
The 'homopolymer' filter removes breakpoints adjacent to
|
||||
homopolymers of the given length or more. Default: 6
|
||||
required: false
|
||||
example: 6
|
||||
- name: --read_through_distance
|
||||
alternatives: -R
|
||||
type: integer
|
||||
description: |
|
||||
The 'read_through' filter removes read-through fusions
|
||||
where the breakpoints are less than the given distance away
|
||||
from each other. Default: 10000
|
||||
required: false
|
||||
example: 10000
|
||||
- name: --min_anchor_length
|
||||
alternatives: -A
|
||||
type: integer
|
||||
description: |
|
||||
Alignment artifacts are often characterized by split reads coming
|
||||
from only one gene and no discordant mates. Moreover, the split
|
||||
reads only align to a short stretch in one of the genes. The
|
||||
'short_anchor' filter removes these fusions. This parameter sets
|
||||
the threshold in bp for what the filter considers short. Default: 23
|
||||
required: false
|
||||
example: 23
|
||||
- name: --many_spliced_events
|
||||
alternatives: -M
|
||||
type: integer
|
||||
description: |
|
||||
The 'many_spliced' filter recovers fusions between genes that
|
||||
have at least this many spliced breakpoints. Default: 4
|
||||
required: false
|
||||
example: 4
|
||||
- name: --max_kmer_content
|
||||
alternatives: -K
|
||||
type: double
|
||||
description: |
|
||||
The 'low_entropy' filter removes reads with repetitive 3-mers. If
|
||||
the 3-mers make up more than the given fraction of the sequence, then
|
||||
the read is discarded. Default: 0.600000
|
||||
required: false
|
||||
example: 0.6
|
||||
- name: --max_mismatch_pvalue
|
||||
alternatives: -V
|
||||
type: double
|
||||
description: |
|
||||
The 'mismatches' filter uses a binomial model to calculate a
|
||||
p-value for observing a given number of mismatches in a read. If
|
||||
the number of mismatches is too high, the read is discarded.
|
||||
Default: 0.010000
|
||||
required: false
|
||||
example: 0.01
|
||||
- name: --fragment_length
|
||||
alternatives: -F
|
||||
type: integer
|
||||
description: |
|
||||
When paired-end data is given, the fragment length is estimated
|
||||
automatically and this parameter has no effect. But when single-end
|
||||
data is given, the mean fragment length should be specified to
|
||||
effectively filter fusions that arise from hairpin structures.
|
||||
Default: 200
|
||||
required: false
|
||||
example: 200
|
||||
- name: --max_reads
|
||||
alternatives: -U
|
||||
type: integer
|
||||
description: |
|
||||
Subsample fusions with more than the given number of supporting reads. This
|
||||
improves performance without compromising sensitivity, as long as the
|
||||
threshold is high. Counting of supporting reads beyond the threshold is
|
||||
inaccurate, obviously. Default: 300
|
||||
required: false
|
||||
example: 300
|
||||
- name: --quantile
|
||||
alternatives: -Q
|
||||
type: double
|
||||
description: |
|
||||
Highly expressed genes are prone to produce artifacts during library
|
||||
preparation. Genes with an expression above the given quantile are eligible
|
||||
for filtering by the 'in_vitro' filter. Default: 0.998000
|
||||
required: false
|
||||
example: 0.998
|
||||
- name: --exonic_fraction
|
||||
alternatives: -e
|
||||
type: double
|
||||
description: |
|
||||
The breakpoints of false-positive predictions of intragenic events
|
||||
are often both in exons. True predictions are more likely to have at
|
||||
least one breakpoint in an intron, because introns are larger. If the
|
||||
fraction of exonic sequence between two breakpoints is smaller than
|
||||
the given fraction, the 'intragenic_exonic' filter discards the
|
||||
event. Default: 0.330000
|
||||
required: false
|
||||
example: 0.33
|
||||
- name: --top_n
|
||||
alternatives: -T
|
||||
type: integer
|
||||
description: |
|
||||
Only report viral integration sites of the top N most highly expressed viral
|
||||
contigs. Default: 5
|
||||
required: false
|
||||
example: 5
|
||||
- name: --covered_fraction
|
||||
alternatives: -C
|
||||
type: double
|
||||
description: |
|
||||
Ignore virally associated events if the virus is not fully
|
||||
expressed, i.e., less than the given fraction of the viral contig is
|
||||
transcribed. Default: 0.050000
|
||||
required: false
|
||||
example: 0.05
|
||||
- name: --max_itd_length
|
||||
alternatives: -l
|
||||
type: integer
|
||||
description: |
|
||||
Maximum length of internal tandem duplications. Note: Increasing
|
||||
this value beyond the default can impair performance and lead to many
|
||||
false positives. Default: 100
|
||||
required: false
|
||||
example: 100
|
||||
- name: --min_itd_allele_fraction
|
||||
alternatives: -z
|
||||
type: double
|
||||
description: |
|
||||
Required fraction of supporting reads to report an internal
|
||||
tandem duplication. Default: 0.070000
|
||||
required: false
|
||||
example: 0.07
|
||||
- name: --min_itd_supporting_reads
|
||||
alternatives: -Z
|
||||
type: integer
|
||||
description: |
|
||||
Required absolute number of supporting reads to report an
|
||||
internal tandem duplication. Default: 10
|
||||
required: false
|
||||
example: 10
|
||||
- name: --skip_duplicate_marking
|
||||
alternatives: -u
|
||||
type: boolean_true
|
||||
description: |
|
||||
Instead of performing duplicate marking itself, Arriba relies on duplicate marking by a
|
||||
preceding program using the BAM_FDUP flag. This makes sense when unique molecular
|
||||
identifiers (UMI) are used.
|
||||
- name: --extra_information
|
||||
alternatives: -X
|
||||
type: boolean_true
|
||||
description: |
|
||||
To reduce the runtime and file size, by default, the columns 'fusion_transcript',
|
||||
'peptide_sequence', and 'read_identifiers' are left empty in the file containing
|
||||
discarded fusion candidates (see parameter -O). When this flag is set, this extra
|
||||
information is reported in the discarded fusions file.
|
||||
- name: --fill_gaps
|
||||
alternatives: -I
|
||||
type: boolean_true
|
||||
description: |
|
||||
If assembly of the fusion transcript sequence from the supporting reads is incomplete
|
||||
(denoted as '...'), fill the gaps using the assembly sequence wherever possible.
|
||||
resources:
|
||||
- type: bash_script
|
||||
path: script.sh
|
||||
test_resources:
|
||||
- type: bash_script
|
||||
path: test.sh
|
||||
- type: file
|
||||
path: test_data
|
||||
engines:
|
||||
- type: docker
|
||||
image: quay.io/biocontainers/arriba:2.4.0--h0033a41_2
|
||||
setup:
|
||||
- type: docker
|
||||
run: |
|
||||
arriba -h 2>&1 | grep 'Version:' | sed 's/Version:\s\(.*\)/arriba: "\1"/' > /var/software_versions.txt
|
||||
runners:
|
||||
- type: executable
|
||||
- type: nextflow
|
||||
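The config above declares `--disable_filters` with `multiple: true`, while the Arriba help text describes `-f` as taking a comma-/space-separated list. The following dry-run sketch shows one way several values could be joined into a single `-f` argument. It only prints the command instead of running Arriba, the helper `join_by_comma` is hypothetical, and how the component's own `script.sh` passes these values is not shown here, so this is an assumption rather than the actual wrapper logic.

```bash
#!/bin/bash
# Hypothetical helper: join all arguments with commas.
# The subshell keeps the IFS change from leaking into the caller.
join_by_comma() {
  ( IFS=','; printf '%s\n' "$*" )
}

# Two filter names taken from the documented list of valid values.
filters="$(join_by_comma homologs low_entropy)"

# Build the command line as text only (dry run); file names are the
# examples used in the Arriba usage line.
cmd="arriba -x Aligned.out.bam -a assembly.fa -g annotation.gtf -o fusions.tsv -f $filters"
echo "$cmd"
```

Printing the command first makes it easy to inspect how list-valued arguments are flattened before any real alignment files are involved.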
198
src/arriba/help.txt
Normal file
@@ -0,0 +1,198 @@
|
||||
```bash
|
||||
arriba -h
|
||||
```
|
||||
|
||||
Arriba gene fusion detector
|
||||
---------------------------
|
||||
Version: 2.4.0
|
||||
|
||||
Arriba is a fast tool to search for aberrant transcripts such as gene fusions.
|
||||
It is based on chimeric alignments found by the STAR RNA-Seq aligner.
|
||||
|
||||
Usage: arriba [-c Chimeric.out.sam] -x Aligned.out.bam \
|
||||
-g annotation.gtf -a assembly.fa [-b blacklists.tsv] [-k known_fusions.tsv] \
|
||||
[-t tags.tsv] [-p protein_domains.gff3] [-d structural_variants_from_WGS.tsv] \
|
||||
-o fusions.tsv [-O fusions.discarded.tsv] \
|
||||
[OPTIONS]
|
||||
|
||||
-c FILE File in SAM/BAM/CRAM format with chimeric alignments as generated by STAR
|
||||
(Chimeric.out.sam). This parameter is only required, if STAR was run with the
|
||||
parameter '--chimOutType SeparateSAMold'. When STAR was run with the parameter
|
||||
'--chimOutType WithinBAM', it suffices to pass the parameter -x to Arriba and -c
|
||||
can be omitted.
|
||||
|
||||
-x FILE File in SAM/BAM/CRAM format with main alignments as generated by STAR
|
||||
(Aligned.out.sam). Arriba extracts candidate reads from this file.
|
||||
|
||||
-g FILE GTF file with gene annotation. The file may be gzip-compressed.
|
||||
|
||||
-G GTF_FEATURES Comma-/space-separated list of names of GTF features.
|
||||
                  Default: gene_name=gene_name|gene_id gene_id=gene_id
                  transcript_id=transcript_id feature_exon=exon feature_CDS=CDS

 -a FILE  FastA file with genome sequence (assembly). The file may be gzip-compressed. An
          index with the file extension .fai must exist only if CRAM files are processed.

 -b FILE  File containing blacklisted events (recurrent artifacts and transcripts
          observed in healthy tissue).

 -k FILE  File containing known/recurrent fusions. Some cancer entities are often
          characterized by fusions between the same pair of genes. In order to boost
          sensitivity, a list of known fusions can be supplied using this parameter. The list
          must contain two columns with the names of the fused genes, separated by tabs.

 -o FILE  Output file with fusions that have passed all filters.

 -O FILE  Output file with fusions that were discarded due to filtering.

 -t FILE  Tab-separated file containing fusions to annotate with tags in the 'tags' column.
          The first two columns specify the genes; the third column specifies the tag. The
          file may be gzip-compressed.

 -p FILE  File in GFF3 format containing coordinates of the protein domains of genes. The
          protein domains retained in a fusion are listed in the column
          'retained_protein_domains'. The file may be gzip-compressed.

 -d FILE  Tab-separated file with coordinates of structural variants found using
          whole-genome sequencing data. These coordinates serve to increase sensitivity
          towards weakly expressed fusions and to eliminate fusions with low evidence.

 -D MAX_GENOMIC_BREAKPOINT_DISTANCE  When a file with genomic breakpoints obtained via
                                     whole-genome sequencing is supplied via the -d
                                     parameter, this parameter determines how far a
                                     genomic breakpoint may be away from a
                                     transcriptomic breakpoint to consider it as a
                                     related event. For events inside genes, the
                                     distance is added to the end of the gene; for
                                     intergenic events, the distance threshold is
                                     applied as is. Default: 100000

 -s STRANDEDNESS  Whether a strand-specific protocol was used for library preparation,
                  and if so, the type of strandedness (auto/yes/no/reverse). When
                  unstranded data is processed, the strand can sometimes be inferred from
                  splice-patterns. But in unclear situations, stranded data helps
                  resolve ambiguities. Default: auto

 -i CONTIGS  Comma-/space-separated list of interesting contigs. Fusions between genes
             on other contigs are ignored. Contigs can be specified with or without the
             prefix "chr". Asterisks (*) are treated as wild-cards.
             Default: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X Y AC_* NC_*

 -v CONTIGS  Comma-/space-separated list of viral contigs. Asterisks (*) are treated as
             wild-cards.
             Default: AC_* NC_*

 -f FILTERS  Comma-/space-separated list of filters to disable. By default all filters are
             enabled. Valid values: homologs, low_entropy, isoforms,
             top_expressed_viral_contigs, viral_contigs, uninteresting_contigs,
             non_coding_neighbors, mismatches, duplicates, no_genomic_support,
             genomic_support, intronic, end_to_end, relative_support,
             low_coverage_viral_contigs, merge_adjacent, mismappers, multimappers,
             same_gene, long_gap, internal_tandem_duplication, small_insert_size,
             read_through, inconsistently_clipped, intragenic_exonic,
             marginal_read_through, spliced, hairpin, blacklist, min_support,
             select_best, in_vitro, short_anchor, known_fusions, no_coverage,
             homopolymer, many_spliced

 -E MAX_E-VALUE  Arriba estimates the number of fusions with a given number of supporting
                 reads which one would expect to see by random chance. If the expected number
                 of fusions (e-value) is higher than this threshold, the fusion is
                 discarded by the 'relative_support' filter. Note: Increasing this
                 threshold can dramatically increase the number of false positives and may
                 increase the runtime of resource-intensive steps. Fractional values are
                 possible. Default: 0.300000

 -S MIN_SUPPORTING_READS  The 'min_support' filter discards all fusions with fewer than
                          this many supporting reads (split reads and discordant mates
                          combined). Default: 2

 -m MAX_MISMAPPERS  When more than this fraction of supporting reads turns out to be
                    mismappers, the 'mismappers' filter discards the fusion. Default:
                    0.800000

 -L MAX_HOMOLOG_IDENTITY  Genes with more than the given fraction of sequence identity are
                          considered homologs and removed by the 'homologs' filter.
                          Default: 0.300000

 -H HOMOPOLYMER_LENGTH  The 'homopolymer' filter removes breakpoints adjacent to
                        homopolymers of the given length or more. Default: 6

 -R READ_THROUGH_DISTANCE  The 'read_through' filter removes read-through fusions
                           where the breakpoints are less than the given distance away
                           from each other. Default: 10000

 -A MIN_ANCHOR_LENGTH  Alignment artifacts are often characterized by split reads coming
                       from only one gene and no discordant mates. Moreover, the split
                       reads only align to a short stretch in one of the genes. The
                       'short_anchor' filter removes these fusions. This parameter sets
                       the threshold in bp for what the filter considers short. Default: 23

 -M MANY_SPLICED_EVENTS  The 'many_spliced' filter recovers fusions between genes that
                         have at least this many spliced breakpoints. Default: 4

 -K MAX_KMER_CONTENT  The 'low_entropy' filter removes reads with repetitive 3-mers. If
                      the 3-mers make up more than the given fraction of the sequence, then
                      the read is discarded. Default: 0.600000

 -V MAX_MISMATCH_PVALUE  The 'mismatches' filter uses a binomial model to calculate a
                         p-value for observing a given number of mismatches in a read. If
                         the number of mismatches is too high, the read is discarded.
                         Default: 0.010000

 -F FRAGMENT_LENGTH  When paired-end data is given, the fragment length is estimated
                     automatically and this parameter has no effect. But when single-end
                     data is given, the mean fragment length should be specified to
                     effectively filter fusions that arise from hairpin structures.
                     Default: 200

 -U MAX_READS  Subsample fusions with more than the given number of supporting reads. This
               improves performance without compromising sensitivity, as long as the
               threshold is high. Counting of supporting reads beyond the threshold is
               inaccurate, obviously. Default: 300

 -Q QUANTILE  Highly expressed genes are prone to produce artifacts during library
              preparation. Genes with an expression above the given quantile are eligible
              for filtering by the 'in_vitro' filter. Default: 0.998000

 -e EXONIC_FRACTION  The breakpoints of false-positive predictions of intragenic events
                     are often both in exons. True predictions are more likely to have at
                     least one breakpoint in an intron, because introns are larger. If the
                     fraction of exonic sequence between two breakpoints is smaller than
                     the given fraction, the 'intragenic_exonic' filter discards the
                     event. Default: 0.330000

 -T TOP_N  Only report viral integration sites of the top N most highly expressed viral
           contigs. Default: 5

 -C COVERED_FRACTION  Ignore virally associated events if the virus is not fully
                      expressed, i.e., less than the given fraction of the viral contig is
                      transcribed. Default: 0.050000

 -l MAX_ITD_LENGTH  Maximum length of internal tandem duplications. Note: Increasing
                    this value beyond the default can impair performance and lead to many
                    false positives. Default: 100

 -z MIN_ITD_ALLELE_FRACTION  Required fraction of supporting reads to report an internal
                             tandem duplication. Default: 0.070000

 -Z MIN_ITD_SUPPORTING_READS  Required absolute number of supporting reads to report an
                              internal tandem duplication. Default: 10

 -u  Instead of performing duplicate marking itself, Arriba relies on duplicate marking by a
     preceding program using the BAM_FDUP flag. This makes sense when unique molecular
     identifiers (UMI) are used.

 -X  To reduce the runtime and file size, by default, the columns 'fusion_transcript',
     'peptide_sequence', and 'read_identifiers' are left empty in the file containing
     discarded fusion candidates (see parameter -O). When this flag is set, this extra
     information is reported in the discarded fusions file.

 -I  If assembly of the fusion transcript sequence from the supporting reads is incomplete
     (denoted as '...'), fill the gaps using the assembly sequence wherever possible.

 -h  Print help and exit.

Code repository: https://github.com/suhrig/arriba
Get help/report bugs: https://github.com/suhrig/arriba/issues
User manual: https://arriba.readthedocs.io/
Please cite: https://doi.org/10.1101/gr.257246.119
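The help text above states that asterisks in the contig lists are treated as wild-cards. A wrapper that forwards such values through a shell therefore has to keep the shell from expanding them against file names first. The following is a minimal sketch of that concern, not Arriba's or this repository's implementation; the helper name `to_comma_list` and the scratch file name are assumptions made for illustration.

```bash
#!/bin/bash
# Hypothetical helper (not part of Arriba or this repository): rewrite a
# ';'-separated list into the ','-separated form.
to_comma_list() {
  # "$1" is quoted, so the shell passes it to tr unchanged.
  printf '%s\n' "$1" | tr ';' ','
}

# Scratch directory with a file name that matches the pattern AC_*.
scratch=$(mktemp -d)
touch "$scratch/AC_000001.fa"

pattern='AC_*'
# Quoted: the pattern survives as typed.
quoted=$(cd "$scratch" && echo "$pattern")
# Unquoted on purpose: the shell replaces the pattern with the file name.
unquoted=$(cd "$scratch" && echo $pattern)

echo "list:     $(to_comma_list '1;2;X;AC_*;NC_*')"
echo "quoted:   $quoted"
echo "unquoted: $unquoted"
rm -rf "$scratch"
```

Quoting every expansion of a list variable is the cheap way to guarantee that the tool, rather than the shell, interprets the wild-cards.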
54
src/arriba/script.sh
Normal file
@@ -0,0 +1,54 @@
#!/bin/bash

## VIASH START
## VIASH END

# unset flags
[[ "$par_skip_duplicate_marking" == "false" ]] && unset par_skip_duplicate_marking
[[ "$par_extra_information" == "false" ]] && unset par_extra_information
[[ "$par_fill_gaps" == "false" ]] && unset par_fill_gaps

# replace ';' with ',' (quoted so that wild-cards such as AC_* are not glob-expanded)
par_interesting_contigs=$(echo "$par_interesting_contigs" | tr ';' ',')
par_viral_contigs=$(echo "$par_viral_contigs" | tr ';' ',')
par_disable_filters=$(echo "$par_disable_filters" | tr ';' ',')

# run arriba
arriba \
  -x "$par_bam" \
  -a "$par_genome" \
  -g "$par_gene_annotation" \
  -o "$par_fusions" \
  ${par_known_fusions:+-k "${par_known_fusions}"} \
  ${par_blacklist:+-b "${par_blacklist}"} \
  ${par_structural_variants:+-d "${par_structural_variants}"} \
  ${par_tags:+-t "${par_tags}"} \
  ${par_protein_domains:+-p "${par_protein_domains}"} \
  ${par_fusions_discarded:+-O "${par_fusions_discarded}"} \
  ${par_max_genomic_breakpoint_distance:+-D "${par_max_genomic_breakpoint_distance}"} \
  ${par_strandedness:+-s "${par_strandedness}"} \
  ${par_interesting_contigs:+-i "${par_interesting_contigs}"} \
  ${par_viral_contigs:+-v "${par_viral_contigs}"} \
  ${par_disable_filters:+-f "${par_disable_filters}"} \
  ${par_max_e_value:+-E "${par_max_e_value}"} \
  ${par_min_supporting_reads:+-S "${par_min_supporting_reads}"} \
  ${par_max_mismappers:+-m "${par_max_mismappers}"} \
  ${par_max_homolog_identity:+-L "${par_max_homolog_identity}"} \
  ${par_homopolymer_length:+-H "${par_homopolymer_length}"} \
  ${par_read_through_distance:+-R "${par_read_through_distance}"} \
  ${par_min_anchor_length:+-A "${par_min_anchor_length}"} \
  ${par_many_spliced_events:+-M "${par_many_spliced_events}"} \
  ${par_max_kmer_content:+-K "${par_max_kmer_content}"} \
  ${par_max_mismatch_pvalue:+-V "${par_max_mismatch_pvalue}"} \
  ${par_fragment_length:+-F "${par_fragment_length}"} \
  ${par_max_reads:+-U "${par_max_reads}"} \
  ${par_quantile:+-Q "${par_quantile}"} \
  ${par_exonic_fraction:+-e "${par_exonic_fraction}"} \
  ${par_top_n:+-T "${par_top_n}"} \
  ${par_covered_fraction:+-C "${par_covered_fraction}"} \
  ${par_max_itd_length:+-l "${par_max_itd_length}"} \
  ${par_min_itd_allele_fraction:+-z "${par_min_itd_allele_fraction}"} \
  ${par_min_itd_supporting_reads:+-Z "${par_min_itd_supporting_reads}"} \
  ${par_skip_duplicate_marking:+-u} \
  ${par_extra_information:+-X} \
  ${par_fill_gaps:+-I}
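The script above relies on the `${var:+word}` parameter expansion to emit an option only when the corresponding variable is set and non-empty. The sketch below shows what argument list such a call produces. The helper `show_args` and the example values are hypothetical and exist only for illustration.

```bash
#!/bin/bash
# Hypothetical helper, not part of the component: prints every argument it
# receives on its own line so the resulting argument list is visible.
show_args() { printf '<%s>\n' "$@"; }

par_blacklist="my blacklist.tsv"   # set, and contains a space
par_tags=""                        # empty: ':+' treats this like unset
par_fill_gaps="true"               # flag that takes no value
unset par_known_fusions            # never provided

result=$(show_args \
  ${par_known_fusions:+-k "${par_known_fusions}"} \
  ${par_blacklist:+-b "${par_blacklist}"} \
  ${par_tags:+-t "${par_tags}"} \
  ${par_fill_gaps:+-I})
echo "$result"
```

Only `-b`, the blacklist path (kept as a single argument despite the space) and `-I` reach the command. This is also why the script unsets boolean parameters whose value is `false`: a non-empty `false` would otherwise still emit the flag.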
45
src/arriba/test.sh
Normal file
@@ -0,0 +1,45 @@
#!/bin/bash

set -e

dir_in="$meta_resources_dir/test_data"

echo "> Run arriba with blacklist"
"$meta_executable" \
  --bam "$dir_in/A.bam" \
  --genome "$dir_in/genome.fasta" \
  --gene_annotation "$dir_in/annotation.gtf" \
  --blacklist "$dir_in/blacklist.tsv" \
  --fusions "fusions.tsv" \
  --fusions_discarded "fusions_discarded.tsv" \
  --interesting_contigs "1,2"

echo ">> Checking output"
[ ! -f "fusions.tsv" ] && echo "Output file fusions.tsv does not exist" && exit 1
[ ! -f "fusions_discarded.tsv" ] && echo "Output file fusions_discarded.tsv does not exist" && exit 1

echo ">> Check if output is empty"
[ ! -s "fusions.tsv" ] && echo "Output file fusions.tsv is empty" && exit 1
[ ! -s "fusions_discarded.tsv" ] && echo "Output file fusions_discarded.tsv is empty" && exit 1

rm fusions.tsv fusions_discarded.tsv

echo "> Run arriba without blacklist"
"$meta_executable" \
  --bam "$dir_in/A.bam" \
  --genome "$dir_in/genome.fasta" \
  --gene_annotation "$dir_in/annotation.gtf" \
  --fusions "fusions.tsv" \
  --fusions_discarded "fusions_discarded.tsv" \
  --interesting_contigs "1,2" \
  --disable_filters blacklist

echo ">> Checking output"
[ ! -f "fusions.tsv" ] && echo "Output file fusions.tsv does not exist" && exit 1
[ ! -f "fusions_discarded.tsv" ] && echo "Output file fusions_discarded.tsv does not exist" && exit 1

echo ">> Check if output is empty"
[ ! -s "fusions.tsv" ] && echo "Output file fusions.tsv is empty" && exit 1
[ ! -s "fusions_discarded.tsv" ] && echo "Output file fusions_discarded.tsv is empty" && exit 1

echo "> Test successful"
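The test above repeats the same existence and non-empty checks for each output file and for each run. One possible way to factor them out is sketched here. The function `assert_file_not_empty` is a hypothetical helper, not something the repository defines, and the files are created in a scratch directory only to exercise it.

```bash
#!/bin/bash
# Hypothetical helper, not part of the repository's test scripts.
# Returns 0 when the file exists and is non-empty; otherwise prints a
# message and returns 1.
assert_file_not_empty() {
  if [ ! -f "$1" ]; then
    echo "Output file $1 does not exist"
    return 1
  fi
  if [ ! -s "$1" ]; then
    echo "Output file $1 is empty"
    return 1
  fi
}

workdir=$(mktemp -d)
printf 'gene1\tgene2\n' > "$workdir/fusions.tsv"
: > "$workdir/fusions_discarded.tsv"   # exists but is empty

assert_file_not_empty "$workdir/fusions.tsv" && echo "fusions.tsv ok"
assert_file_not_empty "$workdir/fusions_discarded.tsv" || echo "check failed as expected"
```

In a script that runs under `set -e`, calling the helper on its own line is enough: a non-zero return aborts the test at the first failing file.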
BIN
src/arriba/test_data/A.bam
Normal file
Binary file not shown.
6
src/arriba/test_data/annotation.gtf
Normal file
@@ -0,0 +1,6 @@
1 havana gene 1 80 . + . gene_id "ENSG00000000000"; gene_version "5"; gene_name "A"; gene_source "havana"; gene_biotype "gene";
1 havana transcript 1 80 . + . gene_id "ENSG00000000000"; gene_version "5"; transcript_id "ENST00000000000"; transcript_version "2"; gene_name "A"; gene_source "havana"; gene_biotype "gene"; transcript_name "A-202"; transcript_source "havana"; transcript_biotype "processed_transcript"; tag "basic"; transcript_support_level "1";
1 havana exon 1 80 . + . gene_id "ENSG00000000000"; gene_version "5"; transcript_id "ENST00000000000"; transcript_version "2"; exon_number "1"; gene_name "A"; gene_source "havana"; gene_biotype "gene"; transcript_name "A-202"; transcript_source "havana"; transcript_biotype "processed_transcript"; exon_id "ENSE00000000000"; exon_version "1"; tag "basic"; transcript_support_level "1";
2 havana gene 1 80 . + . gene_id "ENSG00000000001"; gene_version "5"; gene_name "B"; gene_source "havana"; gene_biotype "gene";
2 havana transcript 1 80 . + . gene_id "ENSG00000000001"; gene_version "5"; transcript_id "ENST00000000001"; transcript_version "2"; gene_name "B"; gene_source "havana"; gene_biotype "gene"; transcript_name "B-202"; transcript_source "havana"; transcript_biotype "processed_transcript"; tag "basic"; transcript_support_level "1";
2 havana exon 1 80 . + . gene_id "ENSG00000000001"; gene_version "5"; transcript_id "ENST00000000001"; transcript_version "2"; exon_number "1"; gene_name "B"; gene_source "havana"; gene_biotype "gene"; transcript_name "B-202"; transcript_source "havana"; transcript_biotype "processed_transcript"; exon_id "ENSE00000000001"; exon_version "1"; tag "basic"; transcript_support_level "1";
0
src/arriba/test_data/blacklist.tsv
Normal file
4
src/arriba/test_data/genome.fasta
Normal file
@@ -0,0 +1,4 @@
|
||||
>1
|
||||
GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
|
||||
>2
|
||||
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
|
||||
10
src/arriba/test_data/script.sh
Executable file
@@ -0,0 +1,10 @@
# arriba test data

# Test data was obtained from https://github.com/snakemake/snakemake-wrappers/tree/master/bio/arriba/test

if [ ! -d /tmp/snakemake-wrappers ]; then
  git clone --depth 1 --single-branch --branch master https://github.com/snakemake/snakemake-wrappers /tmp/snakemake-wrappers
fi

cp -r /tmp/snakemake-wrappers/bio/arriba/test/* src/arriba/test_data
170
src/bcl_convert/config.vsh.yaml
Normal file
@@ -0,0 +1,170 @@
name: bcl_convert
description: |
  Convert bcl files to fastq files using bcl-convert.
  Information about upgrading from bcl2fastq via
  [Upgrading from bcl2fastq to BCL Convert](https://emea.support.illumina.com/bulletins/2020/10/upgrading-from-bcl2fastq-to-bcl-convert.html)
  and [BCL Convert Compatible Products](https://support.illumina.com/sequencing/sequencing_software/bcl-convert/compatibility.html)
keywords: [demultiplex, fastq, bcl, illumina]
links:
  homepage: https://support.illumina.com/sequencing/sequencing_software/bcl-convert.html
  documentation: https://support.illumina.com/downloads/bcl-convert-user-guide.html
license: Proprietary
authors:
  - __merge__: /src/_authors/toni_verbeiren.yaml
    roles: [ author, maintainer ]
  - __merge__: /src/_authors/dorien_roosen.yaml
    roles: [ author ]

argument_groups:
  - name: Input arguments
    arguments:
      - name: "--bcl_input_directory"
        alternatives: ["-i"]
        type: file
        required: true
        description: Input run directory
        example: bcl_dir
      - name: "--sample_sheet"
        alternatives: ["-s"]
        type: file
        description: Path to SampleSheet.csv file (default searched for in --bcl_input_directory)
        example: bcl_dir/sample_sheet.csv
      - name: --run_info
        type: file
        description: Path to RunInfo.xml file (default root of BCL input directory)
        example: bcl_dir/RunInfo.xml

  - name: Lane and tile settings
    arguments:
      - name: "--bcl_only_lane"
        type: integer
        description: Convert only specified lane number (default all lanes)
        example: 1
      - name: --first_tile_only
        type: boolean
        description: Only convert first tile of input (for testing & debugging)
        example: true
      - name: --tiles
        type: string
        description: Process only a subset of tiles by a regular expression
        example: "s_[0-9]+_1"
      - name: --exclude_tiles
        type: string
        description: Exclude set of tiles by a regular expression
        example: "s_[0-9]+_1"

  - name: Resource arguments
    arguments:
      - name: --shared_thread_odirect_output
        type: boolean
        description: Use linux native asynchronous io (io_submit) for file output (Default=false)
        example: true
      - name: --bcl_num_parallel_tiles
        type: integer
        description: "\\# of tiles to process in parallel (default 1)"
        example: 1
      - name: --bcl_num_conversion_threads
        type: integer
        description: "\\# of threads for conversion (per tile, default # cpu threads)"
        example: 1
      - name: --bcl_num_compression_threads
        type: integer
        description: "\\# of threads for fastq.gz output compression (per tile, default # cpu threads, or HW+12)"
        example: 1
      - name: --bcl_num_decompression_threads
        type: integer
        description:
          "\\# of threads for bcl/cbcl input decompression (per tile, default half # cpu threads, or HW+8).
          Only applies when preloading files"
        example: 1

  - name: Run arguments
    arguments:
      - name: --bcl_only_matched_reads
        type: boolean
        description: For pure BCL conversion, do not output files for 'Undetermined' [unmatched] reads (output by default)
        example: true
      - name: --no_lane_splitting
        type: boolean
        description: Do not split FASTQ file by lane (false by default)
        example: true
      - name: --num_unknown_barcodes_reported
        type: integer
        description: "\\# of Top Unknown Barcodes to output (1000 by default)"
        example: 1000
      - name: --bcl_validate_sample_sheet_only
        type: boolean
        description: Only validate RunInfo.xml & SampleSheet files (produce no FASTQ files)
        example: true
      - name: --strict_mode
        type: boolean
        description: Abort if any files are missing (false by default)
        example: true
      - name: --sample_name_column_enabled
        type: boolean
        description: Use sample sheet 'Sample_Name' column when naming fastq files & subdirectories
        example: true

  - name: Output arguments
    arguments:
      - name: "--output_directory"
        alternatives: ["-o"]
        type: file
        direction: output
        required: true
        description: Output directory containing fastq files
        example: fastq_dir
      - name: --bcl_sampleproject_subdirectories
        type: boolean
        description: Output to subdirectories based upon sample sheet 'Sample_Project' column
        example: true
      - name: --fastq_gzip_compression_level
        type: integer
        description: Set fastq output compression level 0-9 (default 1)
        example: 1
      - name: "--reports"
        type: file
        direction: output
        required: false
        description: Reports directory
        example: reports_dir
      - name: "--logs"
        type: file
        direction: output
        required: false
        description: Logs directory
        example: logs_dir

# bcl-convert arguments not taken into account
# --force
# --output-legacy-stats arg Also output stats in legacy (bcl2fastq2) format (false by default)
# --no-sample-sheet arg Enable legacy no-sample-sheet operation (No demux or trimming. No settings

resources:
  - type: bash_script
    path: script.sh

test_resources:
  - type: bash_script
    path: test.sh

engines:
  - type: docker
    image: debian:trixie-slim
    # https://support.illumina.com/sequencing/sequencing_software/bcl-convert/downloads.html
    setup:
      - type: apt
        packages: [wget, gdb, which, hostname, alien, procps]
      - type: docker
        run: |
          wget https://s3.amazonaws.com/webdata.illumina.com/downloads/software/bcl-convert/bcl-convert-4.2.7-2.el8.x86_64.rpm -O /tmp/bcl-convert.rpm && \
          alien -i /tmp/bcl-convert.rpm && \
          rm -rf /var/lib/apt/lists/* && \
          rm /tmp/bcl-convert.rpm
      - type: docker
        run: |
          echo "bcl-convert: \"$(bcl-convert -V 2>&1 >/dev/null | sed -n '/Version/ s/^bcl-convert\ Version //p')\"" > /var/software_versions.txt

runners:
  - type: executable
  - type: nextflow
38
src/bcl_convert/help.txt
Normal file
@@ -0,0 +1,38 @@
bcl-convert Version 00.000.000.4.2.7
Copyright (c) 2014-2022 Illumina, Inc.

Run BCL Conversion (BCL directory to *.fastq.gz)
bcl-convert --bcl-input-directory <BCL_ROOT_DIR> --output-directory <PATH> [options]

Options:
-h [ --help ] Print this help message
-V [ --version ] Print the version and exit
--output-directory arg Output BCL directory for BCL conversion (must be specified)
-f [ --force ] Force: allow destination diretory to already exist
--bcl-input-directory arg Input BCL directory for BCL conversion (must be specified)
--sample-sheet arg Path to SampleSheet.csv file (default searched for in --bcl-input-directory)
--bcl-only-lane arg Convert only specified lane number (default all lanes)
--strict-mode arg Abort if any files are missing (false by default)
--first-tile-only arg Only convert first tile of input (for testing & debugging)
--tiles arg Process only a subset of tiles by a regular expression
--exclude-tiles arg Exclude set of tiles by a regular expression
--bcl-sampleproject-subdirectories arg Output to subdirectories based upon sample sheet 'Sample_Project' column
--sample-name-column-enabled arg Use sample sheet 'Sample_Name' column when naming fastq files & subdirectories
--fastq-gzip-compression-level arg Set fastq output compression level 0-9 (default 1)
--shared-thread-odirect-output arg Use linux native asynchronous io (io_submit) for file output (Default=false)
--bcl-num-parallel-tiles arg # of tiles to process in parallel (default 1)
--bcl-num-conversion-threads arg # of threads for conversion (per tile, default # cpu threads)
--bcl-num-compression-threads arg # of threads for fastq.gz output compression (per tile, default # cpu threads,
    or HW+12)
--bcl-num-decompression-threads arg # of threads for bcl/cbcl input decompression (per tile, default half # cpu
    threads, or HW+8. Only applies when preloading files)
--bcl-only-matched-reads arg For pure BCL conversion, do not output files for 'Undetermined' [unmatched]
    reads (output by default)
--run-info arg Path to RunInfo.xml file (default root of BCL input directory)
--no-lane-splitting arg Do not split FASTQ file by lane (false by default)
--num-unknown-barcodes-reported arg # of Top Unknown Barcodes to output (1000 by default)
--bcl-validate-sample-sheet-only arg Only validate RunInfo.xml & SampleSheet files (produce no FASTQ files)
--output-legacy-stats arg Also output stats in legacy (bcl2fastq2) format (false by default)
--no-sample-sheet arg Enable legacy no-sample-sheet operation (No demux or trimming. No settings
    supported. False by default, not recommended
40
src/bcl_convert/script.sh
Normal file
@@ -0,0 +1,40 @@
#!/bin/bash

set -eo pipefail

$(which bcl-convert) \
  --bcl-input-directory "$par_bcl_input_directory" \
  --output-directory "$par_output_directory" \
  ${par_sample_sheet:+ --sample-sheet "$par_sample_sheet"} \
  ${par_run_info:+ --run-info "$par_run_info"} \
  ${par_bcl_only_lane:+ --bcl-only-lane "$par_bcl_only_lane"} \
  ${par_first_tile_only:+ --first-tile-only "$par_first_tile_only"} \
  ${par_tiles:+ --tiles "$par_tiles"} \
  ${par_exclude_tiles:+ --exclude-tiles "$par_exclude_tiles"} \
  ${par_shared_thread_odirect_output:+ --shared-thread-odirect-output "$par_shared_thread_odirect_output"} \
  ${par_bcl_num_parallel_tiles:+ --bcl-num-parallel-tiles "$par_bcl_num_parallel_tiles"} \
  ${par_bcl_num_conversion_threads:+ --bcl-num-conversion-threads "$par_bcl_num_conversion_threads"} \
  ${par_bcl_num_compression_threads:+ --bcl-num-compression-threads "$par_bcl_num_compression_threads"} \
  ${par_bcl_num_decompression_threads:+ --bcl-num-decompression-threads "$par_bcl_num_decompression_threads"} \
  ${par_bcl_only_matched_reads:+ --bcl-only-matched-reads "$par_bcl_only_matched_reads"} \
  ${par_no_lane_splitting:+ --no-lane-splitting "$par_no_lane_splitting"} \
  ${par_num_unknown_barcodes_reported:+ --num-unknown-barcodes-reported "$par_num_unknown_barcodes_reported"} \
  ${par_bcl_validate_sample_sheet_only:+ --bcl-validate-sample-sheet-only "$par_bcl_validate_sample_sheet_only"} \
  ${par_strict_mode:+ --strict-mode "$par_strict_mode"} \
  ${par_sample_name_column_enabled:+ --sample-name-column-enabled "$par_sample_name_column_enabled"} \
  ${par_bcl_sampleproject_subdirectories:+ --bcl-sampleproject-subdirectories "$par_bcl_sampleproject_subdirectories"} \
  ${par_fastq_gzip_compression_level:+ --fastq-gzip-compression-level "$par_fastq_gzip_compression_level"}

if [ ! -z "$par_reports" ]; then
  echo "Moving reports to their own location"
  mv "${par_output_directory}/Reports" "$par_reports"
else
  echo "Leaving reports alone"
fi

if [ ! -z "$par_logs" ]; then
  echo "Moving logs to their own location"
  mv "${par_output_directory}/Logs" "$par_logs"
else
  echo "Leaving logs alone"
fi
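The two blocks at the end of the script above apply the same rule to `Reports` and `Logs`: move the subdirectory only when a destination was requested. A possible generalisation is sketched below. The function `move_if_requested` and the scratch directory layout are assumptions for illustration, and the extra check for a missing source directory is an addition that the original script does not make.

```bash
#!/bin/bash
# Hypothetical helper, not part of the component's script.
# $1: directory that the tool is expected to have produced
# $2: requested destination; empty means "leave it where it is"
move_if_requested() {
  src="$1"
  dest="$2"
  if [ -z "$dest" ]; then
    echo "Leaving $(basename "$src") alone"
    return 0
  fi
  if [ ! -d "$src" ]; then
    echo "Expected directory $src was not produced" >&2
    return 1
  fi
  echo "Moving $(basename "$src") to $dest"
  mv "$src" "$dest"
}

# Scratch layout standing in for a finished conversion.
tmp=$(mktemp -d)
mkdir -p "$tmp/fastq/Reports" "$tmp/fastq/Logs"
par_output_directory="$tmp/fastq"
par_reports="$tmp/reports"   # requested
par_logs=""                  # not requested

move_if_requested "$par_output_directory/Reports" "$par_reports"
move_if_requested "$par_output_directory/Logs" "$par_logs"
```

After these two calls the reports live in the requested location while the logs stay inside the output directory.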
70
src/bcl_convert/test.sh
Normal file
@@ -0,0 +1,70 @@
#!/bin/bash

# Tests are sourced from:
# https://www.10xgenomics.com/support/software/cell-ranger/latest/analysis/inputs/cr-direct-demultiplexing-bcl-convert
# Test input files are fetched from:
# https://cf.10xgenomics.com/supp/spatial-exp/demultiplexing/iseq-DI.tar.gz
# https://cf.10xgenomics.com/supp/spatial-exp/demultiplexing/bcl_convert_samplesheet.csv

set -eo pipefail

echo ">> Fetching and preparing test data"
data_src="https://cf.10xgenomics.com/supp/spatial-exp/demultiplexing/iseq-DI.tar.gz"
sample_sheet_src="https://cf.10xgenomics.com/supp/spatial-exp/demultiplexing/bcl_convert_samplesheet.csv"
test_data_dir="test_data"

mkdir $test_data_dir
wget -q $data_src -O $test_data_dir/data.tar.gz
wget -q $sample_sheet_src -O $test_data_dir/sample_sheet.csv
tar xzf $test_data_dir/data.tar.gz -C $test_data_dir
rm $test_data_dir/data.tar.gz

echo ">> Execute and verify output"

$meta_executable \
  --bcl_input_directory "$test_data_dir/iseq-DI" \
  --sample_sheet "$test_data_dir/sample_sheet.csv" \
  --output_directory fastq \
  --reports reports \
  --logs logs

echo ">>> Checking whether the output dir exists"
[[ ! -d fastq ]] && echo "Output dir could not be found!" && exit 1

echo ">>> Checking whether output fastq files are created"
[[ ! -f fastq/Undetermined_S0_L001_R1_001.fastq.gz ]] && echo "Output fastq files could not be found!" && exit 1
[[ ! -f fastq/iseq-DI_S1_L001_R1_001.fastq.gz ]] && echo "Output fastq files could not be found!" && exit 1

echo ">>> Checking whether the report dir exists"
[[ ! -d reports ]] && echo "Reports dir could not be found!" && exit 1

echo ">>> Checking whether the log dir exists"
[[ ! -d logs ]] && echo "Logs dir could not be found!" && exit 1

# print final message
echo ">>> Test finished successfully"

echo ">> Execute with additional arguments and verify output"

$meta_executable \
  --bcl_input_directory "$test_data_dir/iseq-DI" \
  --sample_sheet "$test_data_dir/sample_sheet.csv" \
  --output_directory fastq1 \
  --bcl_only_matched_reads true \
  --bcl_num_compression_threads 1 \
  --no_lane_splitting false \
  --fastq_gzip_compression_level 9

echo ">> Checking whether the output dir exists"
[[ ! -d fastq1 ]] && echo "Output dir could not be found!" && exit 1

echo ">> Checking whether output fastq files are created"
[[ -f fastq1/Undetermined_S0_L001_R1_001.fastq.gz ]] && echo "Undetermined should not be generated!" && exit 1
[[ ! -f fastq1/iseq-DI_S1_L001_R1_001.fastq.gz ]] && echo "Output fastq files could not be found!" && exit 1

# print final message
echo ">> Test finished successfully"

# do not remove this
# as otherwise your test might exit with a different exit code
exit 0
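The closing comment in the test above warns that removing `exit 0` can change the exit code. The reason is that a check written as `[[ condition ]] && echo ... && exit 1` returns a non-zero status precisely when the check passes, and the status of the last command becomes the status of the script. The sketch below reproduces that effect with two one-line scripts. The scratch file and the variable names are illustrative only.

```bash
#!/bin/bash
# Scratch file standing in for an output that the test expects to exist.
out=$(mktemp)

# Case 1: the final command is a check that passes. Because the test
# '[ ! -f ... ]' is false, the '&&' list itself returns 1, and that
# becomes the exit status of the script.
status_without=0
sh -c '[ ! -f "$1" ] && echo "missing" && exit 1' _ "$out" || status_without=$?

# Case 2: same check, followed by an explicit 'exit 0'.
status_with=0
sh -c '[ ! -f "$1" ] && echo "missing" && exit 1; exit 0' _ "$out" || status_with=$?

echo "without exit 0: $status_without"
echo "with exit 0:    $status_with"
rm -f "$out"
```

An alternative that avoids the pitfall altogether is an `if` statement around each check, at the cost of a few more lines per assertion.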
143
src/bd_rhapsody/bd_rhapsody_make_reference/config.vsh.yaml
Normal file
@@ -0,0 +1,143 @@
|
||||
name: bd_rhapsody_make_reference
|
||||
namespace: bd_rhapsody
|
||||
description: |
|
||||
The Reference Files Generator creates an archive containing Genome Index
|
||||
and Transcriptome annotation files needed for the BD Rhapsody Sequencing
|
||||
Analysis Pipeline. The app takes as input one or more FASTA and GTF files
|
||||
and produces a compressed archive in the form of a tar.gz file. The
|
||||
archive contains:
|
||||
|
||||
- STAR index
|
||||
- Filtered GTF file
|
||||
keywords: [genome, reference, index, align]
|
||||
links:
|
||||
repository: https://bitbucket.org/CRSwDev/cwl/src/master/v2.2.1/Extra_Utilities/
|
||||
documentation: https://bd-rhapsody-bioinfo-docs.genomics.bd.com/resources/extra_utilities.html#make-rhapsody-reference
|
||||
license: Unknown
|
||||
authors:
|
||||
- __merge__: /src/_authors/robrecht_cannoodt.yaml
|
||||
roles: [ author, maintainer ]
|
||||
- __merge__: /src/_authors/weiwei_schultz.yaml
|
||||
roles: [ contributor ]
|
||||
|
||||
argument_groups:
|
||||
- name: Inputs
|
||||
arguments:
|
||||
- type: file
|
||||
name: --genome_fasta
|
||||
required: true
|
||||
description: Reference genome file in FASTA or FASTA.GZ format. The BD Rhapsody Sequencing Analysis Pipeline uses GRCh38 for Human and GRCm39 for Mouse.
|
||||
example: genome_sequence.fa.gz
|
||||
multiple: true
|
||||
info:
|
||||
config_key: Genome_fasta
|
||||
- type: file
|
||||
name: --gtf
|
||||
required: true
|
||||
description: |
|
||||
File path to the transcript annotation files in GTF or GTF.GZ format. The Sequence Analysis Pipeline requires the 'gene_name' or
|
||||
'gene_id' attribute to be set on each gene and exon feature. Gene and exon feature lines must have the same attribute, and exons
|
||||
must have a corresponding gene with the same value. For TCR/BCR assays, the TCR or BCR gene segments must have the 'gene_type' or
|
||||
'gene_biotype' attribute set, and the value should begin with 'TR' or 'IG', respectively.
|
||||
example: transcriptome_annotation.gtf.gz
|
||||
multiple: true
|
||||
info:
|
||||
config_key: Gtf
|
||||
- type: file
|
||||
name: --extra_sequences
|
||||
description: |
|
||||
File path to additional sequences in FASTA format to use when building the STAR index (e.g. transgenes or CRISPR guide barcodes).
|
||||
GTF lines for these sequences will be automatically generated and combined with the main GTF.
|
||||
required: false
|
||||
multiple: true
|
||||
info:
|
||||
config_key: Extra_sequences
|
||||
- name: Outputs
|
||||
arguments:
|
||||
- type: file
|
||||
name: --reference_archive
|
||||
direction: output
|
||||
required: true
|
||||
description: |
|
||||
A compressed archive containing the Reference Genome Index and annotation GTF files. This archive is meant to be used as an
|
||||
input in the BD Rhapsody Sequencing Analysis Pipeline.
|
||||
example: star_index.tar.gz
|
||||
- name: Arguments
|
||||
arguments:
|
||||
- type: string
|
||||
name: --mitochondrial_contigs
|
||||
description: |
|
||||
Names of the Mitochondrial contigs in the provided Reference Genome. Fragments originating from contigs other than these are
|
||||
identified as 'nuclear fragments' in the ATACseq analysis pipeline.
|
||||
required: false
|
||||
multiple: true
|
||||
default: [chrM, chrMT, M, MT]
|
||||
info:
|
||||
config_key: Mitochondrial_Contigs
|
||||
- type: boolean_true
|
||||
name: --filtering_off
|
||||
description: |
|
||||
By default the input Transcript Annotation files are filtered based on the gene_type/gene_biotype attribute. Only features
|
||||
having the following attribute values are kept:
|
||||
|
||||
- protein_coding
|
||||
- lncRNA (lincRNA and antisense for Gencode < v31/M22/Ensembl97)
|
||||
- IG_LV_gene
|
||||
- IG_V_gene
|
||||
- IG_V_pseudogene
|
||||
- IG_D_gene
|
||||
- IG_J_gene
|
||||
- IG_J_pseudogene
|
||||
- IG_C_gene
|
||||
- IG_C_pseudogene
|
||||
- TR_V_gene
|
||||
- TR_V_pseudogene
|
||||
- TR_D_gene
|
||||
- TR_J_gene
|
||||
- TR_J_pseudogene
|
||||
- TR_C_gene
|
||||
|
||||
If you have already pre-filtered the input Annotation files and/or wish to turn off the filtering, please set this option to True.
|
||||
info:
|
||||
config_key: Filtering_off
|
||||
- type: boolean_true
|
||||
name: --wta_only_index
|
||||
description: Build a WTA only index, otherwise builds a WTA + ATAC index.
|
||||
info:
|
||||
config_key: WTA_Only
|
||||
- type: string
|
||||
name: --extra_star_params
|
||||
description: Additional parameters to pass to STAR when building the genome index. Specify them exactly as you would on the command line.
|
||||
example: --limitGenomeGenerateRAM 48000 --genomeSAindexNbases 11
|
||||
required: false
|
||||
info:
|
||||
config_key: Extra_STAR_params
|
||||
|
||||
resources:
|
||||
- type: python_script
|
||||
path: script.py
|
||||
- path: make_rhap_reference_2.2.1_nodocker.cwl
|
||||
|
||||
test_resources:
|
||||
- type: bash_script
|
||||
path: test.sh
|
||||
- path: test_data
|
||||
|
||||
requirements:
|
||||
commands: [ "cwl-runner" ]
|
||||
|
||||
engines:
|
||||
- type: docker
|
||||
image: bdgenomics/rhapsody:2.2.1
|
||||
setup:
|
||||
- type: apt
|
||||
packages: [procps]
|
||||
- type: python
|
||||
packages: [cwlref-runner, cwl-runner]
|
||||
- type: docker
|
||||
run: |
|
||||
echo "bdgenomics/rhapsody: 2.2.1" > /var/software_versions.txt
|
||||
|
||||
runners:
|
||||
- type: executable
|
||||
- type: nextflow
|
||||
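The `info.config_key` entries in the config above name the CWL job key that each viash argument feeds. As a rough illustration, the Python sketch below collects that mapping from an already-parsed config. The `config` dictionary is a trimmed, hand-written stand-in for the real file, and `config_key_map` is a hypothetical helper that is not part of the component.

```python
# Hand-written, trimmed stand-in for the parsed config.vsh.yaml (an assumption for illustration).
config = {
    "argument_groups": [
        {"name": "Inputs", "arguments": [
            {"name": "--genome_fasta", "type": "file", "info": {"config_key": "Genome_fasta"}},
            {"name": "--gtf", "type": "file", "info": {"config_key": "Gtf"}},
        ]},
        {"name": "Outputs", "arguments": [
            # No config_key: the output path is handled by the wrapper, not by the CWL job file.
            {"name": "--reference_archive", "type": "file", "direction": "output"},
        ]},
        {"name": "Arguments", "arguments": [
            {"name": "--extra_star_params", "type": "string", "info": {"config_key": "Extra_STAR_params"}},
        ]},
    ]
}


def config_key_map(config):
    """Map each argument name (leading dashes removed) to its CWL job key."""
    mapping = {}
    for group in config["argument_groups"]:
        for arg in group["arguments"]:
            key = (arg.get("info") or {}).get("config_key")
            if key:  # arguments without a config_key are skipped
                mapping[arg["name"].lstrip("-")] = key
    return mapping


print(config_key_map(config))
```

Because the keys are matched by exact spelling, a mapping like this is also a convenient place to compare against the input names declared in the CWL file.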
66
src/bd_rhapsody/bd_rhapsody_make_reference/help.txt
Normal file
@@ -0,0 +1,66 @@
|
||||
```bash
|
||||
cwl-runner src/bd_rhapsody/bd_rhapsody_make_reference/make_rhap_reference_2.2.1_nodocker.cwl --help
|
||||
```
|
||||
|
||||
usage: src/bd_rhapsody/bd_rhapsody_make_reference/make_rhap_reference_2.2.1_nodocker.cwl
|
||||
[-h] [--Archive_prefix ARCHIVE_PREFIX]
|
||||
[--Extra_STAR_params EXTRA_STAR_PARAMS]
|
||||
[--Extra_sequences EXTRA_SEQUENCES] [--Filtering_off] --Genome_fasta
|
||||
GENOME_FASTA --Gtf GTF [--Maximum_threads MAXIMUM_THREADS]
|
||||
[--Mitochondrial_Contigs MITOCHONDRIAL_CONTIGS] [--WTA_Only]
|
||||
[job_order]
|
||||
|
||||
The Reference Files Generator creates an archive containing Genome Index and
|
||||
Transcriptome annotation files needed for the BD Rhapsody™ Sequencing
|
||||
Analysis Pipeline. The app takes as input one or more FASTA and GTF files and
|
||||
produces a compressed archive in the form of a tar.gz file. The archive
|
||||
contains:\n - STAR index\n - Filtered GTF file
|
||||
|
||||
positional arguments:
|
||||
job_order Job input json file
|
||||
|
||||
options:
|
||||
-h, --help show this help message and exit
|
||||
--Archive_prefix ARCHIVE_PREFIX
|
||||
A prefix for naming the compressed archive file
|
||||
containing the Reference genome index and annotation
|
||||
files. The default value is constructed based on the
|
||||
input Reference files.
|
||||
--Extra_STAR_params EXTRA_STAR_PARAMS
|
||||
Additional parameters to pass to STAR when building
|
||||
the genome index. Specify exactly like how you would
|
||||
on the command line. Example: --limitGenomeGenerateRAM
|
||||
48000 --genomeSAindexNbases 11
|
||||
--Extra_sequences EXTRA_SEQUENCES
|
||||
Additional sequences in FASTA format to use when
|
||||
building the STAR index. (E.g. phiX genome)
|
||||
--Filtering_off By default the input Transcript Annotation files are
|
||||
filtered based on the gene_type/gene_biotype
|
||||
attribute. Only features having the following
|
||||
attribute values are are kept: - protein_coding -
|
||||
lncRNA (lincRNA and antisense for Gencode <
|
||||
v31/M22/Ensembl97) - IG_LV_gene - IG_V_gene -
|
||||
IG_V_pseudogene - IG_D_gene - IG_J_gene -
|
||||
IG_J_pseudogene - IG_C_gene - IG_C_pseudogene -
|
||||
TR_V_gene - TR_V_pseudogene - TR_D_gene - TR_J_gene -
|
||||
TR_J_pseudogene - TR_C_gene If you have already pre-
|
||||
filtered the input Annotation files and/or wish to
|
||||
turn-off the filtering, please set this option to
|
||||
True.
|
||||
--Genome_fasta GENOME_FASTA
|
||||
Reference genome file in FASTA format. The BD
|
||||
Rhapsody™ Sequencing Analysis Pipeline uses GRCh38
|
||||
for Human and GRCm39 for Mouse.
|
||||
--Gtf GTF Transcript annotation files in GTF format. The BD
|
||||
Rhapsody™ Sequencing Analysis Pipeline uses Gencode
|
||||
v42 for Human and M31 for Mouse.
|
||||
--Maximum_threads MAXIMUM_THREADS
|
||||
The maximum number of threads to use in the pipeline.
|
||||
By default, all available cores are used.
|
||||
--Mitochondrial_Contigs MITOCHONDRIAL_CONTIGS
|
||||
Names of the Mitochondrial contigs in the provided
|
||||
Reference Genome. Fragments originating from contigs
|
||||
other than these are identified as 'nuclear fragments'
|
||||
in the ATACseq analysis pipeline.
|
||||
--WTA_Only Build a WTA only index, otherwise builds a WTA + ATAC
|
||||
index.
|
||||
@@ -0,0 +1,115 @@
|
||||
requirements:
|
||||
InlineJavascriptRequirement: {}
|
||||
class: CommandLineTool
|
||||
label: Reference Files Generator for BD Rhapsody™ Sequencing Analysis Pipeline
|
||||
cwlVersion: v1.2
|
||||
doc: >-
|
||||
The Reference Files Generator creates an archive containing Genome Index and Transcriptome annotation files needed for the BD Rhapsody™ Sequencing Analysis Pipeline. The app takes as input one or more FASTA and GTF files and produces a compressed archive in the form of a tar.gz file. The archive contains:\n - STAR index\n - Filtered GTF file
|
||||
|
||||
|
||||
baseCommand: run_reference_generator.sh
|
||||
inputs:
|
||||
Genome_fasta:
|
||||
type: File[]
|
||||
label: Reference Genome
|
||||
doc: |-
|
||||
Reference genome file in FASTA format. The BD Rhapsody™ Sequencing Analysis Pipeline uses GRCh38 for Human and GRCm39 for Mouse.
|
||||
inputBinding:
|
||||
prefix: --reference-genome
|
||||
shellQuote: false
|
||||
Gtf:
|
||||
type: File[]
|
||||
label: Transcript Annotations
|
||||
doc: |-
|
||||
Transcript annotation files in GTF format. The BD Rhapsody™ Sequencing Analysis Pipeline uses Gencode v42 for Human and M31 for Mouse.
|
||||
inputBinding:
|
||||
prefix: --gtf
|
||||
shellQuote: false
|
||||
Extra_sequences:
|
||||
type: File[]?
|
||||
label: Extra Sequences
|
||||
doc: |-
|
||||
Additional sequences in FASTA format to use when building the STAR index. (E.g. phiX genome)
|
||||
inputBinding:
|
||||
prefix: --extra-sequences
|
||||
shellQuote: false
|
||||
Mitochondrial_Contigs:
|
||||
type: string[]?
|
||||
default: ["chrM", "chrMT", "M", "MT"]
|
||||
label: Mitochondrial Contig Names
|
||||
doc: |-
|
||||
Names of the Mitochondrial contigs in the provided Reference Genome. Fragments originating from contigs other than these are identified as 'nuclear fragments' in the ATACseq analysis pipeline.
|
||||
inputBinding:
|
||||
prefix: --mitochondrial-contigs
|
||||
shellQuote: false
|
||||
Filtering_off:
|
||||
type: boolean?
|
||||
label: Turn off filtering
|
||||
doc: |-
|
||||
By default the input Transcript Annotation files are filtered based on the gene_type/gene_biotype attribute. Only features having the following attribute values are kept:
|
||||
- protein_coding
|
||||
- lncRNA (lincRNA and antisense for Gencode < v31/M22/Ensembl97)
|
||||
- IG_LV_gene
|
||||
- IG_V_gene
|
||||
- IG_V_pseudogene
|
||||
- IG_D_gene
|
||||
- IG_J_gene
|
||||
- IG_J_pseudogene
|
||||
- IG_C_gene
|
||||
- IG_C_pseudogene
|
||||
- TR_V_gene
|
||||
- TR_V_pseudogene
|
||||
- TR_D_gene
|
||||
- TR_J_gene
|
||||
- TR_J_pseudogene
|
||||
- TR_C_gene
|
||||
If you have already pre-filtered the input Annotation files and/or wish to turn-off the filtering, please set this option to True.
|
||||
inputBinding:
|
||||
prefix: --filtering-off
|
||||
shellQuote: false
|
||||
WTA_Only:
|
||||
type: boolean?
|
||||
label: WTA only index
|
||||
doc: Build a WTA only index, otherwise builds a WTA + ATAC index.
|
||||
inputBinding:
|
||||
prefix: --wta-only-index
|
||||
shellQuote: false
|
||||
Archive_prefix:
|
||||
type: string?
|
||||
label: Archive Prefix
|
||||
doc: |-
|
||||
A prefix for naming the compressed archive file containing the Reference genome index and annotation files. The default value is constructed based on the input Reference files.
|
||||
inputBinding:
|
||||
prefix: --archive-prefix
|
||||
shellQuote: false
|
||||
Extra_STAR_params:
|
||||
type: string?
|
||||
label: Extra STAR Params
|
||||
doc: |-
|
||||
Additional parameters to pass to STAR when building the genome index. Specify exactly like how you would on the command line.
|
||||
Example:
|
||||
--limitGenomeGenerateRAM 48000 --genomeSAindexNbases 11
|
||||
inputBinding:
|
||||
prefix: --extra-star-params
|
||||
shellQuote: true
|
||||
|
||||
Maximum_threads:
|
||||
type: int?
|
||||
label: Maximum Number of Threads
|
||||
doc: |-
|
||||
The maximum number of threads to use in the pipeline. By default, all available cores are used.
|
||||
inputBinding:
|
||||
prefix: --maximum-threads
|
||||
shellQuote: false
|
||||
|
||||
outputs:
|
||||
|
||||
Archive:
|
||||
type: File
|
||||
doc: |-
|
||||
A compressed archive containing the Reference Genome Index and annotation GTF files. This archive is meant to be used as an input in the BD Rhapsody™ Sequencing Analysis Pipeline.
|
||||
id: Reference_Archive
|
||||
label: Reference Files Archive
|
||||
outputBinding:
|
||||
glob: '*.tar.gz'
|
||||
|
||||
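Each input of the CWL tool above carries an `inputBinding.prefix`, so a job file is essentially turned into a `run_reference_generator.sh` command line. The Python sketch below approximates that translation under simplified assumptions: it follows declaration order rather than the ordering a real CWL runner applies, it does not fill in defaults such as the mitochondrial contig names, and it ignores `shellQuote`. `PREFIXES` is copied from the tool definition; `sketch_command` is a hypothetical helper.

```python
# Prefixes copied from the inputBinding entries of the CWL tool definition.
PREFIXES = {
    "Genome_fasta": "--reference-genome",
    "Gtf": "--gtf",
    "Extra_sequences": "--extra-sequences",
    "Mitochondrial_Contigs": "--mitochondrial-contigs",
    "Filtering_off": "--filtering-off",
    "WTA_Only": "--wta-only-index",
    "Archive_prefix": "--archive-prefix",
    "Extra_STAR_params": "--extra-star-params",
    "Maximum_threads": "--maximum-threads",
}


def sketch_command(job):
    """Approximate the argv for a job dict (simplified; not a CWL implementation)."""
    # Deliberately strict: reject keys that are not declared inputs, so that a key
    # differing only in letter case is reported instead of silently dropped here.
    unknown = sorted(set(job) - set(PREFIXES))
    if unknown:
        raise KeyError(f"job keys not declared as CWL inputs: {unknown}")

    argv = ["run_reference_generator.sh"]
    for key, prefix in PREFIXES.items():
        value = job.get(key)
        if value is None or value is False or value == []:
            continue  # unset optional input or disabled flag
        if value is True:
            argv.append(prefix)  # boolean inputs contribute only the flag
        elif isinstance(value, list):
            argv.extend([prefix, *map(str, value)])  # prefix followed by every item
        else:
            argv.extend([prefix, str(value)])
    return argv


print(sketch_command({"Genome_fasta": ["genome.fa"], "Gtf": ["genes.gtf"], "WTA_Only": True}))
```

The file names in the usage line are placeholders. The point of the sketch is that job keys are case-sensitive identifiers: `WTA_Only` is accepted, while a differently cased spelling raises an error in this strict variant.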
161
src/bd_rhapsody/bd_rhapsody_make_reference/script.py
Normal file
@@ -0,0 +1,161 @@
|
||||
import os
import re
import subprocess
import tempfile
from typing import Any
import yaml
import shutil

## VIASH START
par = {
    "genome_fasta": [],
    "gtf": [],
    "extra_sequences": [],
    "mitochondrial_contigs": ["chrM", "chrMT", "M", "MT"],
    "filtering_off": False,
    "wta_only_index": False,
    "extra_star_params": None,
    "reference_archive": "output.tar.gz",
}
meta = {
    "config": "target/nextflow/reference/build_bdrhap_2_reference/.config.vsh.yaml",
    "resources_dir": os.path.abspath("src/reference/build_bdrhap_2_reference"),
    "temp_dir": os.getenv("VIASH_TEMP"),
    "memory_mb": None,
    "cpus": None
}
## VIASH END

def clean_arg(argument):
    # strip the leading dashes from the argument name, e.g. "--gtf" -> "gtf"
    argument["clean_name"] = re.sub("^-*", "", argument["name"])
    return argument

def read_config(path: str) -> dict[str, Any]:
    with open(path, "r") as f:
        config = yaml.safe_load(f)

    # flatten all argument groups into a single list
    config["all_arguments"] = [
        clean_arg(arg)
        for grp in config["argument_groups"]
        for arg in grp["arguments"]
    ]

    return config

def strip_margin(text: str) -> str:
    # remove leading whitespace and the "|" margin marker;
    # raw strings avoid the invalid "\|" escape sequence
    return re.sub(r"(\n?)[ \t]*\|", r"\1", text)

def process_params(par: dict[str, Any], config) -> dict[str, Any]:
    # check input parameters
    assert par["genome_fasta"], "Pass at least one set of inputs to --genome_fasta."
    assert par["gtf"], "Pass at least one set of inputs to --gtf."
    assert par["reference_archive"].endswith(".tar.gz"), "Output reference_archive must end with .tar.gz."

    # make paths absolute
    for argument in config["all_arguments"]:
        if par[argument["clean_name"]] and argument["type"] == "file":
            if isinstance(par[argument["clean_name"]], list):
                par[argument["clean_name"]] = [ os.path.abspath(f) for f in par[argument["clean_name"]] ]
            else:
                par[argument["clean_name"]] = os.path.abspath(par[argument["clean_name"]])

    return par

def generate_config(par: dict[str, Any], meta, config) -> str:
    content_list = [strip_margin(f"""\
        |#!/usr/bin/env cwl-runner
        |
        |""")]

    # collect (CWL job key, argument type, value) for every argument that is set
    config_key_value_pairs = []
    for argument in config["all_arguments"]:
        config_key = (argument.get("info") or {}).get("config_key")
        arg_type = argument["type"]
        par_value = par[argument["clean_name"]]
        if par_value and config_key:
            config_key_value_pairs.append((config_key, arg_type, par_value))

    if meta["cpus"]:
        config_key_value_pairs.append(("Maximum_threads", "integer", meta["cpus"]))

    # print(config_key_value_pairs)

    for config_key, arg_type, par_value in config_key_value_pairs:
        if arg_type == "file":
            # "entry" instead of "str", to avoid shadowing the builtin
            entry = strip_margin(f"""\
                |{config_key}:
                |""")
            if isinstance(par_value, list):
                for path in par_value:
                    entry += strip_margin(f"""\
                        | - class: File
                        |   location: "{path}"
                        |""")
            else:
                entry += strip_margin(f"""\
                    | class: File
                    | location: "{par_value}"
                    |""")
            content_list.append(entry)
        else:
            content_list.append(strip_margin(f"""\
                |{config_key}: {par_value}
                |"""))

    # join the fragments into the content of the job file
    return "".join(content_list)

def get_cwl_file(meta: dict[str, Any]) -> str:
    # locate the cwl file shipped with the component
    cwl_file=os.path.join(meta["resources_dir"], "make_rhap_reference_2.2.1_nodocker.cwl")

    return cwl_file

def main(par: dict[str, Any], meta: dict[str, Any]):
    config = read_config(meta["config"])

    # Preprocess params
    par = process_params(par, config)

    # fetch cwl file
    cwl_file = get_cwl_file(meta)

    # Create output dir if not exists
    outdir = os.path.dirname(par["reference_archive"])
    if not os.path.exists(outdir):
        os.makedirs(outdir)

    ## Run pipeline
    with tempfile.TemporaryDirectory(prefix="cwl-bd_rhapsody_wta-", dir=meta["temp_dir"]) as temp_dir:
        # Create params file
        config_file = os.path.join(temp_dir, "config.yml")
        config_content = generate_config(par, meta, config)
        with open(config_file, "w") as f:
            f.write(config_content)

        cmd = [
            "cwl-runner",
            "--no-container",
            "--preserve-entire-environment",
            "--outdir",
            temp_dir,
            cwl_file,
            config_file
        ]

        env = dict(os.environ)
        env["TMPDIR"] = temp_dir

        print("> " + " ".join(cmd), flush=True)
        _ = subprocess.check_call(
            cmd,
            cwd=os.path.dirname(config_file),
            env=env
        )

        # move the archive out of the temporary directory before it is deleted
        shutil.move(os.path.join(temp_dir, "Rhap_reference.tar.gz"), par["reference_archive"])

if __name__ == "__main__":
    main(par, meta)
|
||||
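The `strip_margin` helper in script.py lets the job-file fragments be written as indented, `|`-prefixed text. The standalone Python sketch below shows what the substitution does to such a block. It also shows a side effect that may matter for unusual inputs: the pattern removes every `|` in the text, not only the margin marker. `strip_margin_strict` is a hypothetical alternative and is not part of the component; the file paths are placeholders.

```python
import re


def strip_margin(text: str) -> str:
    # Same substitution as in script.py: drop optional blanks plus a "|" marker.
    return re.sub(r"(\n?)[ \t]*\|", r"\1", text)


def strip_margin_strict(text: str) -> str:
    # Hypothetical variant: only remove a "|" that is the first non-blank character of a line.
    return re.sub(r"^[ \t]*\|", "", text, flags=re.MULTILINE)


block = """\
    |Gtf:
    |  - class: File
    |    location: "/data/a.gtf"
    |"""

print(strip_margin(block))

# A value that itself contains "|" is altered by the original pattern
# but preserved by the stricter one.
tricky = '  |location: "/data/a|b.fa"\n'
print(strip_margin(tricky))
print(strip_margin_strict(tricky))
```

For the paths used in the tests this difference is irrelevant; it would only show up for values that contain a literal pipe character.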
65
src/bd_rhapsody/bd_rhapsody_make_reference/test.sh
Normal file
@@ -0,0 +1,65 @@
|
||||
#!/bin/bash
|
||||
|
||||
set -e
|
||||
|
||||
#############################################
|
||||
# helper functions
|
||||
assert_file_exists() {
|
||||
[ -f "$1" ] || { echo "File '$1' does not exist" && exit 1; }
|
||||
}
|
||||
assert_file_doesnt_exist() {
|
||||
[ ! -f "$1" ] || { echo "File '$1' exists but shouldn't" && exit 1; }
|
||||
}
|
||||
assert_file_empty() {
|
||||
[ ! -s "$1" ] || { echo "File '$1' is not empty but should be" && exit 1; }
|
||||
}
|
||||
assert_file_not_empty() {
|
||||
[ -s "$1" ] || { echo "File '$1' is empty but shouldn't be" && exit 1; }
|
||||
}
|
||||
assert_file_contains() {
|
||||
grep -q "$2" "$1" || { echo "File '$1' does not contain '$2'" && exit 1; }
|
||||
}
|
||||
assert_file_not_contains() {
|
||||
! grep -q "$2" "$1" || { echo "File '$1' contains '$2' but shouldn't" && exit 1; }
|
||||
}
|
||||
#############################################
|
||||
|
||||
in_fa="$meta_resources_dir/test_data/reference_small.fa"
|
||||
in_gtf="$meta_resources_dir/test_data/reference_small.gtf"
|
||||
|
||||
echo "#############################################"
|
||||
echo "> Simple run"
|
||||
|
||||
mkdir simple_run
|
||||
cd simple_run
|
||||
|
||||
out_tar="myreference.tar.gz"
|
||||
|
||||
echo "> Running $meta_name."
|
||||
$meta_executable \
|
||||
--genome_fasta "$in_fa" \
|
||||
--gtf "$in_gtf" \
|
||||
--reference_archive "$out_tar" \
|
||||
--extra_star_params "--genomeSAindexNbases 6" \
|
||||
---cpus 2
|
||||
|
||||
exit_code=$?
|
||||
[[ $exit_code != 0 ]] && echo "Non zero exit code: $exit_code" && exit 1
|
||||
|
||||
assert_file_exists "$out_tar"
|
||||
assert_file_not_empty "$out_tar"
|
||||
|
||||
echo ">> Checking whether output contains the expected files"
|
||||
tar -xvf "$out_tar" > /dev/null
|
||||
assert_file_exists "BD_Rhapsody_Reference_Files/star_index/genomeParameters.txt"
|
||||
assert_file_exists "BD_Rhapsody_Reference_Files/bwa-mem2_index/reference_small.ann"
|
||||
assert_file_exists "BD_Rhapsody_Reference_Files/reference_small-processed.gtf"
|
||||
assert_file_exists "BD_Rhapsody_Reference_Files/mitochondrial_contigs.txt"
|
||||
assert_file_contains "BD_Rhapsody_Reference_Files/reference_small-processed.gtf" "chr1.*HAVANA.*ENSG00000243485"
|
||||
assert_file_contains "BD_Rhapsody_Reference_Files/mitochondrial_contigs.txt" 'chrMT'
|
||||
|
||||
cd ..
|
||||
|
||||
echo "#############################################"
|
||||
|
||||
echo "> Tests succeeded!"
|
||||
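The test above unpacks the archive and then checks for individual files. As an alternative sketch, the same check can be done in Python without extracting anything, using the standard `tarfile` module. `missing_members` and `write_demo_archive` are hypothetical helpers; the demo archive is a one-byte placeholder, not real pipeline output.

```python
import io
import os
import tarfile
import tempfile


def missing_members(archive_path, expected):
    """Return the expected member names that are absent from a .tar.gz archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        names = set()
        for name in tar.getnames():
            # Normalise a leading "./" so both naming styles compare equal.
            names.add(name[2:] if name.startswith("./") else name)
    return [name for name in expected if name not in names]


def write_demo_archive(archive_path, member_names):
    """Create a tiny .tar.gz with one-byte members, for demonstration only."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for name in member_names:
            info = tarfile.TarInfo(name)
            info.size = 1
            tar.addfile(info, io.BytesIO(b"x"))


with tempfile.TemporaryDirectory() as tmp:
    archive = os.path.join(tmp, "myreference.tar.gz")
    write_demo_archive(archive, ["BD_Rhapsody_Reference_Files/mitochondrial_contigs.txt"])
    # Lists the star_index entry, since the demo archive does not contain it.
    print(missing_members(archive, [
        "BD_Rhapsody_Reference_Files/mitochondrial_contigs.txt",
        "BD_Rhapsody_Reference_Files/star_index/genomeParameters.txt",
    ]))
```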
@@ -0,0 +1,27 @@
|
||||
>chr1 1
|
||||
TGGGGAAGCAAGGCGGAGTTGGGCAGCTCGTGTTCAATGGGTAGAGTTTCAGGCTGGGGT
|
||||
GATGGAAGGGTGCTGGAAATGAGTGGTAGTGATGGCGGCACAACAGTGTGAATCTACTTA
|
||||
ATCCCACTGAACTGTATGCTGAAAAATGGTTTAGACGGTGAATTTTAGGTTATGTATGTT
|
||||
TTACCACAATTTTTAAAAAGCTAGTGAAAAGCTGGTAAAAAGAAAGAAAAGAGGCTTTTT
|
||||
TAAAAAGTTAAATATATAAAAAGAGCATCATCAGTCCAAAGTCCAGCAGTTGTCCCTCCT
|
||||
GGAATCCGTTGGCTTGCCTCCGGCATTTTTGGCCCTTGCCTTTTAGGGTTGCCAGATTAA
|
||||
AAGACAGGATGCCCAGCTAGTTTGAATTTTAGATAAACAACGAATAATTTCGTAGCATAA
|
||||
ATATGTCCCAAGCTTAGTTTGGGACATACTTATGCTAAAAAACATTATTGGTTGTTTATC
|
||||
TGAGATTCAGAATTAAGCATTTTATATTTTATTTGCTGCCTCTGGCCACCCTACTCTCTT
|
||||
CCTAACACTCTCTCCCTCTCCCAGTTTTGTCCGCCTTCCCTGCCTCCTCTTCTGGGGGAG
|
||||
TTAGATCGAGTTGTAACAAGAACATGCCACTGTCTCGCTGGCTGCAGCGTGTGGTCCCCT
|
||||
TACCAGAGGTAAAGAAGAGATGGATCTCCACTCATGTTGTAGACAGAATGTTTATGTCCT
|
||||
CTCCAAATGCTTATGTTGAAACCCTAACCCCTAATGTGATGGTATGTGGAGATGGGCCTT
|
||||
TGGTAGGTAATTACGGTTAGATGAGGTCATGGGGTGGGGCCCTCATTATAGATCTGGTAA
|
||||
GAAAAGAGAGCATTGTCTCTGTGTCTCCCTCTCTCTCTCTCTCTCTCTCTCTCATTTCTC
|
||||
TCTATCTCATTTCTCTCTCTCTCGCTATCTCATTTTTCTCTCTCTCTCTTTCTCTCCTCT
|
||||
GTCTTTTCCCACCAAGTGAGGATGCGAAGAGAAGGTGGCTGTCTGCAAACCAGGAAGAGA
|
||||
GCCCTCACCGGGAACCCGTCCAGCTGCCACCTTGAACTTGGACTTCCAAGCCTCCAGAAC
|
||||
TGTGAGGGATAAATGTATGATTTTAAAGTCGCCCAGTGTGTGGTATTTTGTTTTGACTAA
|
||||
TACAACCTGAAAACATTTTCCCCTCACTCCACCTGAGCAATATCTGAGTGGCTTAAGGTA
|
||||
CTCAGGACACAACAAAGGAGAAATGTCCCATGCACAAGGTGCACCCATGCCTGGGTAAAG
|
||||
CAGCCTGGCACAGAGGGAAGCACACAGGCTCAGGGATCTGCTATTCATTCTTTGTGTGAC
|
||||
CCTGGGCAAGCCATGAATGGAGCTTCAGTCACCCCATTTGTAATGGGATTTAATTGTGCT
|
||||
TGCCCTGCCTCCTTTTGAGGGCTGTAGAGAAAAGATGTCAAAGTATTTTGTAATCTGGCT
|
||||
GGGCGTGGTGGCTCATGCCTGTAATCCTAGCACTTTGGTAGGCTGACGCGAGAGGACTGC
|
||||
T
|
||||
@@ -0,0 +1,8 @@
|
||||
chr1 HAVANA exon 565 668 . + . gene_id "ENSG00000243485.5"; transcript_id "ENST00000473358.1"; gene_type "lncRNA"; gene_name "MIR1302-2HG"; transcript_type "lncRNA"; transcript_name "MIR1302-2HG-202"; exon_number 2; exon_id "ENSE00001922571.1"; level 2; transcript_support_level "5"; hgnc_id "HGNC:52482"; tag "not_best_in_genome_evidence"; tag "dotter_confirmed"; tag "basic"; tag "Ensembl_canonical"; havana_gene "OTTHUMG00000000959.2"; havana_transcript "OTTHUMT00000002840.1";
|
||||
chr1 HAVANA exon 977 1098 . + . gene_id "ENSG00000243485.5"; transcript_id "ENST00000473358.1"; gene_type "lncRNA"; gene_name "MIR1302-2HG"; transcript_type "lncRNA"; transcript_name "MIR1302-2HG-202"; exon_number 3; exon_id "ENSE00001827679.1"; level 2; transcript_support_level "5"; hgnc_id "HGNC:52482"; tag "not_best_in_genome_evidence"; tag "dotter_confirmed"; tag "basic"; tag "Ensembl_canonical"; havana_gene "OTTHUMG00000000959.2"; havana_transcript "OTTHUMT00000002840.1";
|
||||
chr1 HAVANA transcript 268 1110 . + . gene_id "ENSG00000243485.5"; transcript_id "ENST00000469289.1"; gene_type "lncRNA"; gene_name "MIR1302-2HG"; transcript_type "lncRNA"; transcript_name "MIR1302-2HG-201"; level 2; transcript_support_level "5"; hgnc_id "HGNC:52482"; tag "not_best_in_genome_evidence"; tag "basic"; havana_gene "OTTHUMG00000000959.2"; havana_transcript "OTTHUMT00000002841.2";
|
||||
chr1 HAVANA exon 268 668 . + . gene_id "ENSG00000243485.5"; transcript_id "ENST00000469289.1"; gene_type "lncRNA"; gene_name "MIR1302-2HG"; transcript_type "lncRNA"; transcript_name "MIR1302-2HG-201"; exon_number 1; exon_id "ENSE00001841699.1"; level 2; transcript_support_level "5"; hgnc_id "HGNC:52482"; tag "not_best_in_genome_evidence"; tag "basic"; havana_gene "OTTHUMG00000000959.2"; havana_transcript "OTTHUMT00000002841.2";
|
||||
chr1 HAVANA exon 977 1110 . + . gene_id "ENSG00000243485.5"; transcript_id "ENST00000469289.1"; gene_type "lncRNA"; gene_name "MIR1302-2HG"; transcript_type "lncRNA"; transcript_name "MIR1302-2HG-201"; exon_number 2; exon_id "ENSE00001890064.1"; level 2; transcript_support_level "5"; hgnc_id "HGNC:52482"; tag "not_best_in_genome_evidence"; tag "basic"; havana_gene "OTTHUMG00000000959.2"; havana_transcript "OTTHUMT00000002841.2";
|
||||
chr1 ENSEMBL gene 367 504 . + . gene_id "ENSG00000284332.1"; gene_type "miRNA"; gene_name "MIR1302-2"; level 3; hgnc_id "HGNC:35294";
|
||||
chr1 ENSEMBL transcript 367 504 . + . gene_id "ENSG00000284332.1"; transcript_id "ENST00000607096.1"; gene_type "miRNA"; gene_name "MIR1302-2"; transcript_type "miRNA"; transcript_name "MIR1302-2-201"; level 3; transcript_support_level "NA"; hgnc_id "HGNC:35294"; tag "basic"; tag "Ensembl_canonical";
|
||||
chr1 ENSEMBL exon 367 504 . + . gene_id "ENSG00000284332.1"; transcript_id "ENST00000607096.1"; gene_type "miRNA"; gene_name "MIR1302-2"; transcript_type "miRNA"; transcript_name "MIR1302-2-201"; exon_number 1; exon_id "ENSE00003695741.1"; level 3; transcript_support_level "NA"; hgnc_id "HGNC:35294"; tag "basic"; tag "Ensembl_canonical";
|
||||
@@ -0,0 +1,47 @@
|
||||
#!/bin/bash
|
||||
|
||||
TMP_DIR=/tmp/bd_rhapsody_make_reference
|
||||
OUT_DIR=src/bd_rhapsody/bd_rhapsody_make_reference/test_data
|
||||
|
||||
# check if seqkit is installed
|
||||
if ! command -v seqkit &> /dev/null; then
|
||||
echo "seqkit could not be found"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
# create temporary directory and clean up on exit
|
||||
mkdir -p $TMP_DIR
|
||||
function clean_up {
|
||||
rm -rf "$TMP_DIR"
|
||||
}
|
||||
trap clean_up EXIT
|
||||
|
||||
# fetch reference
|
||||
ORIG_FA=$TMP_DIR/reference.fa.gz
|
||||
if [ ! -f $ORIG_FA ]; then
|
||||
wget https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_41/GRCh38.primary_assembly.genome.fa.gz \
|
||||
-O $ORIG_FA
|
||||
fi
|
||||
|
||||
ORIG_GTF=$TMP_DIR/reference.gtf.gz
|
||||
if [ ! -f $ORIG_GTF ]; then
|
||||
wget https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_41/gencode.v41.annotation.gtf.gz \
|
||||
-O $ORIG_GTF
|
||||
fi
|
||||
|
||||
# create small reference
|
||||
START=30000
|
||||
END=31500
|
||||
CHR=chr1
|
||||
|
||||
# subset to small region
|
||||
seqkit grep -r -p "^$CHR\$" "$ORIG_FA" | \
|
||||
seqkit subseq -r "$START:$END" > $OUT_DIR/reference_small.fa
|
||||
|
||||
zcat "$ORIG_GTF" | \
|
||||
awk -v FS='\t' -v OFS='\t' "
|
||||
\$1 == \"$CHR\" && \$4 >= $START && \$5 <= $END {
|
||||
\$4 = \$4 - $START + 1;
|
||||
\$5 = \$5 - $START + 1;
|
||||
print;
|
||||
}" > $OUT_DIR/reference_small.gtf
|
||||
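The awk filter in the script above keeps only features that lie entirely inside the window on the chosen chromosome and rewrites their coordinates relative to the window start. A Python sketch of the same rule follows; `shift_gtf_line` is a hypothetical helper, and the input coordinates in the usage line are back-calculated from the first exon in `reference_small.gtf` rather than taken from the original annotation.

```python
START, END, CHR = 30000, 31500, "chr1"


def shift_gtf_line(line, chrom=CHR, start=START, end=END):
    """Return the GTF line with window-relative coordinates, or None if it is filtered out."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 9 or fields[0] != chrom:
        return None  # header/comment lines and other chromosomes
    feat_start, feat_end = int(fields[3]), int(fields[4])
    if feat_start < start or feat_end > end:
        return None  # features that only partly overlap the window are dropped, not clipped
    fields[3] = str(feat_start - start + 1)  # GTF coordinates are 1-based and inclusive
    fields[4] = str(feat_end - start + 1)
    return "\t".join(fields)


exon = 'chr1\tHAVANA\texon\t30564\t30667\t.\t+\t.\tgene_id "ENSG00000243485.5";'
print(shift_gtf_line(exon))
```

With `START=30000`, positions 30564 to 30667 map to 565 to 668, which matches the first exon line of the test annotation.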
106
src/bedtools/bedtools_getfasta/config.vsh.yaml
Normal file
@@ -0,0 +1,106 @@
|
||||
name: bedtools_getfasta
|
||||
namespace: bedtools
|
||||
description: Extract sequences from a FASTA file for each of the intervals defined in a BED/GFF/VCF file.
|
||||
keywords: [sequencing, fasta, BED, GFF, VCF]
|
||||
links:
|
||||
documentation: https://bedtools.readthedocs.io/en/latest/content/tools/getfasta.html
|
||||
repository: https://github.com/arq5x/bedtools2
|
||||
references:
|
||||
doi: 10.1093/bioinformatics/btq033
|
||||
license: GPL-2.0
|
||||
requirements:
|
||||
commands: [bedtools]
|
||||
authors:
|
||||
- __merge__: /src/_authors/dries_schaumont.yaml
|
||||
roles: [ author, maintainer ]
|
||||
|
||||
argument_groups:
|
||||
- name: Input arguments
|
||||
arguments:
|
||||
- name: --input_fasta
|
||||
type: file
|
||||
description: |
|
||||
FASTA file containing sequences for each interval specified in the input BED file.
|
||||
The headers in the input FASTA file must exactly match the chromosome column in the BED file.
|
||||
- name: "--input_bed"
|
||||
type: file
|
||||
description: |
|
||||
BED file containing intervals to extract from the FASTA file.
|
||||
BED files containing a single region require a newline character
|
||||
at the end of the line, otherwise a blank output file is produced.
|
||||
- name: --rna
|
||||
type: boolean_true
|
||||
description: |
|
||||
The FASTA is RNA, not DNA. Reverse complementation is handled accordingly.
|
||||
|
||||
- name: Run arguments
|
||||
arguments:
|
||||
- name: "--strandedness"
|
||||
type: boolean_true
|
||||
alternatives: ["-s"]
|
||||
description: |
|
||||
Force strandedness. If the feature occupies the antisense strand, the output sequence will
|
||||
be reverse complemented. By default strandedness is not taken into account.
|
||||
|
||||
- name: Output arguments
|
||||
arguments:
|
||||
- name: --output
|
||||
alternatives: [-o]
|
||||
required: true
|
||||
type: file
|
||||
direction: output
|
||||
description: |
|
||||
Output file where the output from the 'bedtools getfasta' command will
|
||||
be written to.
|
||||
- name: --tab
|
||||
type: boolean_true
|
||||
description: |
|
||||
Report extracted sequences in a tab-delimited format instead of in FASTA format.
|
||||
- name: --bed_out
|
||||
type: boolean_true
|
||||
description: |
|
||||
Report extracted sequences in a tab-delimited BED format instead of in FASTA format.
|
||||
- name: "--name"
|
||||
type: boolean_true
|
||||
description: |
|
||||
Set the FASTA header for each extracted sequence to be the "name" and coordinate columns from the BED feature.
|
||||
- name: "--name_only"
|
||||
type: boolean_true
|
||||
description: |
|
||||
Set the FASTA header for each extracted sequence to be the "name" column from the BED feature.
|
||||
- name: "--split"
|
||||
type: boolean_true
|
||||
description: |
|
||||
When --input_bed is in BED12 format, create a separate fasta entry for each block in a BED12 record,
|
||||
blocks being described in the 11th and 12th column of the BED.
|
||||
- name: "--full_header"
|
||||
type: boolean_true
|
||||
description: |
|
||||
Use full fasta header. By default, only the word before the first space or tab is used.
|
||||
|
||||
# Arguments not taken into account:
|
||||
#
|
||||
# -fo Specify an output file name. By default, output goes to stdout.
|
||||
#
|
||||
|
||||
resources:
|
||||
- type: bash_script
|
||||
path: script.sh
|
||||
|
||||
test_resources:
|
||||
- type: bash_script
|
||||
path: test.sh
|
||||
|
||||
engines:
|
||||
- type: docker
|
||||
image: debian:stable-slim
|
||||
setup:
|
||||
- type: apt
|
||||
packages: [bedtools, procps]
|
||||
- type: docker
|
||||
run: |
|
||||
echo "bedtools: \"$(bedtools --version | sed -n 's/^bedtools //p')\"" > /var/software_versions.txt
|
||||
|
||||
runners:
|
||||
- type: executable
|
||||
- type: nextflow
|
||||
22
src/bedtools/bedtools_getfasta/script.sh
Normal file
@@ -0,0 +1,22 @@
|
||||
#!/usr/bin/env bash
set -eo pipefail

unset_if_false=( par_rna par_strandedness par_tab par_bed_out par_name par_name_only par_split par_full_header )

# unset boolean flags that are "false" so the ${var:+...} expansions below drop them
for par in "${unset_if_false[@]}"; do
    test_val="${!par}"
    [[ "$test_val" == "false" ]] && unset "$par"
done

bedtools getfasta \
    -fi "$par_input_fasta" \
    -bed "$par_input_bed" \
    ${par_rna:+-rna} \
    ${par_name:+-name} \
    ${par_name_only:+-nameOnly} \
    ${par_tab:+-tab} \
    ${par_bed_out:+-bedOut} \
    ${par_strandedness:+-s} \
    ${par_split:+-split} \
    ${par_full_header:+-fullHeader} > "$par_output"
|
||||
|
||||
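The wrapper above delegates interval extraction to `bedtools getfasta`, which reads BED coordinates as 0-based with an exclusive end. The Python sketch below illustrates that convention on the sequence used in the component's test; `extract_interval` is a hypothetical helper for illustration and is not how bedtools is implemented.

```python
# Complement table for DNA; RNA input would need U instead of T.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def extract_interval(sequence, start, end, strand="+"):
    """Return sequence[start:end] (0-based, end-exclusive), reverse-complemented for '-'."""
    subseq = sequence[start:end]
    if strand == "-":
        subseq = subseq.translate(COMPLEMENT)[::-1]
    return subseq


chr1 = "AAAAAAAACCCCCCCCCCCCCGCTACTGGGGGGGGGGGGGGGGGG"
print(extract_interval(chr1, 5, 10))       # AAACC, as in the test's expected.fasta
print(extract_interval(chr1, 5, 10, "-"))  # GGTTT, the reverse complement
```

The interval `chr1 5 10` therefore covers five bases, the sixth through the tenth of the sequence, which is why the expected output in the test is `AAACC`.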
119
src/bedtools/bedtools_getfasta/test.sh
Normal file
@@ -0,0 +1,119 @@
|
||||
#!/usr/bin/env bash
|
||||
set -eo pipefail
|
||||
|
||||
TMPDIR=$(mktemp -d)
|
||||
function clean_up {
|
||||
[[ -d "$TMPDIR" ]] && rm -r "$TMPDIR"
|
||||
}
|
||||
trap clean_up EXIT
|
||||
|
||||
# Create dummy test fasta file
|
||||
cat > "$TMPDIR/test.fa" <<EOF
|
||||
>chr1
|
||||
AAAAAAAACCCCCCCCCCCCCGCTACTGGGGGGGGGGGGGGGGGG
|
||||
EOF
|
||||
|
||||
TAB="$(printf '\t')"
|
||||
|
||||
# Create dummy bed file
|
||||
cat > "$TMPDIR/test.bed" <<EOF
|
||||
chr1${TAB}5${TAB}10${TAB}myseq
|
||||
EOF
|
||||
|
||||
# Create expected fasta file
|
||||
cat > "$TMPDIR/expected.fasta" <<EOF
|
||||
>chr1:5-10
|
||||
AAACC
|
||||
EOF
|
||||
|
||||
"$meta_executable" \
|
||||
--input_bed "$TMPDIR/test.bed" \
|
||||
--input_fasta "$TMPDIR/test.fa" \
|
||||
--output "$TMPDIR/output.fasta"
|
||||
|
||||
cmp --silent "$TMPDIR/output.fasta" "$TMPDIR/expected.fasta" || { echo "Files are different."; exit 1; }
|
||||
|
||||
|
||||
# Create expected fasta file for --name
|
||||
cat > "$TMPDIR/expected_with_name.fasta" <<EOF
|
||||
>myseq::chr1:5-10
|
||||
AAACC
|
||||
EOF
|
||||
|
||||
"$meta_executable" \
|
||||
--input_bed "$TMPDIR/test.bed" \
|
||||
--input_fasta "$TMPDIR/test.fa" \
|
||||
--name \
|
||||
--output "$TMPDIR/output_with_name.fasta"
|
||||
|
||||
|
||||
cmp --silent "$TMPDIR/output_with_name.fasta" "$TMPDIR/expected_with_name.fasta" || { echo "Files when using --name are different."; exit 1; }
|
||||
|
||||
# Create expected fasta file for --name_only
|
||||
cat > "$TMPDIR/expected_with_name_only.fasta" <<EOF
|
||||
>myseq
|
||||
AAACC
|
||||
EOF
|
||||
|
||||
"$meta_executable" \
|
||||
--input_bed "$TMPDIR/test.bed" \
|
||||
--input_fasta "$TMPDIR/test.fa" \
|
||||
--name_only \
|
||||
--output "$TMPDIR/output_with_name_only.fasta"
|
||||
|
||||
cmp --silent "$TMPDIR/output_with_name_only.fasta" "$TMPDIR/expected_with_name_only.fasta" || { echo "Files when using --name_only are different."; exit 1; }
|
||||
|
||||
|
||||
# Create expected tab-delimited file for --tab
|
||||
cat > "$TMPDIR/expected_tab.out" <<EOF
|
||||
myseq${TAB}AAACC
|
||||
EOF
|
||||
|
||||
"$meta_executable" \
|
||||
--input_bed "$TMPDIR/test.bed" \
|
||||
--input_fasta "$TMPDIR/test.fa" \
|
||||
--name_only \
|
||||
--tab \
|
||||
--output "$TMPDIR/tab.out"
|
||||
|
||||
cmp --silent "$TMPDIR/expected_tab.out" "$TMPDIR/tab.out" || { echo "Files when using --tab are different."; exit 1; }
|
||||
|
||||
|
||||
# Create expected tab-delimited file for --bed_out
|
||||
cat > "$TMPDIR/expected.bed" <<EOF
|
||||
chr1${TAB}5${TAB}10${TAB}myseq${TAB}AAACC
|
||||
EOF
|
||||
|
||||
"$meta_executable" \
|
||||
--input_bed "$TMPDIR/test.bed" \
|
||||
--input_fasta "$TMPDIR/test.fa" \
|
||||
--bed_out \
|
||||
--output "$TMPDIR/output.bed"
|
||||
|
||||
|
||||
cmp --silent "$TMPDIR/expected.bed" "$TMPDIR/output.bed" || { echo "Files when using --bed_out are different."; exit 1; }
|
||||
|
||||
# Create dummy bed file for strandedness
|
||||
cat > "$TMPDIR/test_strandedness.bed" <<EOF
|
||||
chr1${TAB}20${TAB}25${TAB}forward${TAB}1${TAB}+
|
||||
chr1${TAB}20${TAB}25${TAB}reverse${TAB}1${TAB}-
|
||||
EOF
|
||||
|
||||
# Create expected fasta file for -s (strandedness)
|
||||
cat > "$TMPDIR/expected_strandedness.fasta" <<EOF
|
||||
>forward(+)
|
||||
CGCTA
|
||||
>reverse(-)
|
||||
TAGCG
|
||||
EOF
|
||||
|
||||
"$meta_executable" \
|
||||
--input_bed "$TMPDIR/test_strandedness.bed" \
|
||||
--input_fasta "$TMPDIR/test.fa" \
|
||||
-s \
|
||||
--name_only \
|
||||
--output "$TMPDIR/output_strandedness.fasta"
|
||||
|
||||
|
||||
cmp --silent "$TMPDIR/expected_strandedness.fasta" "$TMPDIR/output_strandedness.fasta" || { echo "Files when using -s are different."; exit 1; }
|
||||
|
||||
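The expected outputs in the test above follow from two bedtools conventions: BED intervals are 0-based and half-open (start inclusive, end exclusive), and `-s` reverse-complements records on the `-` strand. The sketch below reproduces both results in plain Bash; `extract` and `revcomp` are hypothetical helper names used only for illustration, not part of bedtools or of the test.

```bash
#!/usr/bin/env bash
# Sequence copied from the dummy test FASTA above.
seq="AAAAAAAACCCCCCCCCCCCCGCTACTGGGGGGGGGGGGGGGGGG"

# Extract a 0-based, half-open interval [start, end) from a string.
extract() {
  local s="$1" start="$2" end="$3"
  echo "${s:start:end-start}"
}

# Reverse-complement a DNA string: reverse it, then swap A<->T and C<->G.
revcomp() {
  local s="$1" out="" i
  for (( i=${#s}-1; i>=0; i-- )); do
    out+="${s:i:1}"
  done
  echo "$out" | tr 'ACGT' 'TGCA'
}

extract "$seq" 5 10                  # the "myseq" record
revcomp "$(extract "$seq" 20 25)"    # the "reverse" record on the - strand
```

The first call corresponds to the `chr1 5 10 myseq` record and the second to the `reverse` record in `test_strandedness.bed`.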
50
src/busco/busco_download_datasets/config.vsh.yaml
Normal file
@@ -0,0 +1,50 @@
name: busco_download_datasets
namespace: busco
description: Downloads available busco datasets
keywords: [lineage datasets]
links:
  homepage: https://busco.ezlab.org/
  documentation: https://busco.ezlab.org/busco_userguide.html
  repository: https://gitlab.com/ezlab/busco
references:
  doi: 10.1007/978-1-4939-9173-0_14
license: MIT
authors:
  - __merge__: /src/_authors/dorien_roosen.yaml
    roles: [ author, maintainer ]
argument_groups:
  - name: Inputs
    arguments:
      - name: --download
        type: string
        description: |
          Download dataset. Possible values are a specific dataset name, "all", "prokaryota", "eukaryota", or "virus".
          The full list of available datasets can be viewed [here](https://busco-data.ezlab.org/v5/data/lineages/) or by running the busco/busco_list_datasets component.
        required: true
        example: stramenopiles_odb10
  - name: Outputs
    arguments:
      - name: --download_path
        direction: output
        type: file
        description: |
          Local filepath for storing BUSCO dataset downloads
        required: false
        default: busco_downloads
        example: busco_downloads
resources:
  - type: bash_script
    path: script.sh
test_resources:
  - type: bash_script
    path: test.sh
engines:
  - type: docker
    image: quay.io/biocontainers/busco:5.7.1--pyhdfd78af_0
    setup:
      - type: docker
        run: |
          busco --version | sed 's/BUSCO\s\(.*\)/busco: "\1"/' > /var/software_versions.txt
runners:
  - type: executable
  - type: nextflow
14
src/busco/busco_download_datasets/script.sh
Normal file
@@ -0,0 +1,14 @@
#!/bin/bash

## VIASH START
## VIASH END

if [ ! -d "$par_download_path" ]; then
  mkdir -p "$par_download_path"
fi

busco \
  --download_path "$par_download_path" \
  --download "$par_download"
15
src/busco/busco_download_datasets/test.sh
Normal file
@@ -0,0 +1,15 @@
echo "> Downloading busco stramenopiles_odb10 dataset"

"$meta_executable" \
  --download stramenopiles_odb10 \
  --download_path downloads

echo ">> Checking output"
[ ! -f "downloads/file_versions.tsv" ] && echo "file_versions.tsv does not exist" && exit 1
[ ! -f "downloads/lineages/stramenopiles_odb10/dataset.cfg" ] && echo "dataset.cfg does not exist" && exit 1

echo ">> Checking if output is empty"
[ ! -s "downloads/file_versions.tsv" ] && echo "file_versions.tsv is empty" && exit 1
[ ! -s "downloads/lineages/stramenopiles_odb10/dataset.cfg" ] && echo "dataset.cfg is empty" && exit 1

rm -r downloads
42
src/busco/busco_list_datasets/config.vsh.yaml
Normal file
@@ -0,0 +1,42 @@
name: busco_list_datasets
namespace: busco
description: Lists the available busco datasets
keywords: [lineage datasets]
links:
  homepage: https://busco.ezlab.org/
  documentation: https://busco.ezlab.org/busco_userguide.html
  repository: https://gitlab.com/ezlab/busco
references:
  doi: 10.1007/978-1-4939-9173-0_14
license: MIT
authors:
  - __merge__: /src/_authors/dorien_roosen.yaml
    roles: [ author, maintainer ]
argument_groups:
  - name: Outputs
    arguments:
      - name: --output
        alternatives: ["-o"]
        direction: output
        type: file
        description: |
          Output file of the available busco datasets
        required: false
        default: busco_dataset_list.txt
        example: file.txt
resources:
  - type: bash_script
    path: script.sh
test_resources:
  - type: bash_script
    path: test.sh
engines:
  - type: docker
    image: quay.io/biocontainers/busco:5.7.1--pyhdfd78af_0
    setup:
      - type: docker
        run: |
          busco --version | sed 's/BUSCO\s\(.*\)/busco: "\1"/' > /var/software_versions.txt
runners:
  - type: executable
  - type: nextflow
6
src/busco/busco_list_datasets/script.sh
Normal file
@@ -0,0 +1,6 @@
#!/bin/bash

## VIASH START
## VIASH END

busco --list-datasets | awk '/^#{40}/{flag=1; next} flag{print}' > "$par_output"
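The awk program in the script above uses a flag variable: once a line starting with 40 `#` characters is seen, `flag` is set and every later line is printed, which drops the log lines that precede the listing. Below is a minimal sketch of the same technique on made-up input. The sample text is an assumption and not real `busco --list-datasets` output, and the sketch compares against a literal separator with `index()` instead of the `{40}` interval expression so that it also runs on awk builds without interval support.

```bash
#!/usr/bin/env bash
# Build a row of 40 '#' characters to act as the separator line.
sep="$(printf '#%.0s' {1..40})"

# Hypothetical input: a log line, the separator, then the listing.
sample="$(printf '%s\n' "INFO: downloading file" "$sep" "bacteria_odb10" "  - archaea_odb10")"

# index($0, sep) == 1 is true when the line starts with the separator.
# That line sets the flag and is skipped; later lines are printed.
listing="$(printf '%s\n' "$sample" \
  | awk -v sep="$sep" 'index($0, sep) == 1 {flag=1; next} flag {print}')"

echo "$listing"
```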
15
src/busco/busco_list_datasets/test.sh
Normal file
@@ -0,0 +1,15 @@
#!/bin/bash

## VIASH START
## VIASH END

"$meta_executable" \
  --output datasets.txt

echo ">> Checking output"
[ ! -f "datasets.txt" ] && echo "datasets.txt does not exist" && exit 1

echo ">> Checking if output is empty"
[ ! -s "datasets.txt" ] && echo "datasets.txt is empty" && exit 1

rm datasets.txt
221
src/busco/busco_run/config.vsh.yaml
Normal file
@@ -0,0 +1,221 @@
|
||||
name: busco_run
|
||||
namespace: busco
|
||||
description: Assessment of genome assembly and annotation completeness with single copy orthologs
|
||||
keywords: [Genome assembly, quality control]
|
||||
links:
|
||||
homepage: https://busco.ezlab.org/
|
||||
documentation: https://busco.ezlab.org/busco_userguide.html
|
||||
repository: https://gitlab.com/ezlab/busco
|
||||
references:
|
||||
doi: 10.1007/978-1-4939-9173-0_14
|
||||
license: MIT
|
||||
authors:
|
||||
- __merge__: /src/_authors/dorien_roosen.yaml
|
||||
roles: [ author, maintainer ]
|
||||
argument_groups:
|
||||
- name: Inputs
|
||||
arguments:
|
||||
- name: --input
|
||||
alternatives: ["-i"]
|
||||
type: file
|
||||
description: |
|
||||
Input sequence file in FASTA format. Can be an assembled genome or transcriptome (DNA), or protein sequences from an annotated gene set. Also possible to use a path to a directory containing multiple input files.
|
||||
required: true
|
||||
example: file.fasta
|
||||
- name: --mode
|
||||
alternatives: ["-m"]
|
||||
type: string
|
||||
choices: [genome, geno, transcriptome, tran, proteins, prot]
|
||||
required: true
|
||||
description: |
|
||||
Specify which BUSCO analysis mode to run. There are three valid modes:
|
||||
- geno or genome, for genome assemblies (DNA)
|
||||
- tran or transcriptome, for transcriptome assemblies (DNA)
|
||||
- prot or proteins, for annotated gene sets (protein)
|
||||
example: proteins
|
||||
- name: --lineage_dataset
|
||||
alternatives: ["-l"]
|
||||
type: string
|
||||
required: false
|
||||
description: |
|
||||
Specify a BUSCO lineage dataset that is most closely related to the assembly or gene set being assessed.
|
||||
The full list of available datasets can be viewed [here](https://busco-data.ezlab.org/v5/data/lineages/) or by running the busco/busco_list_datasets component.
|
||||
When unsure, the "--auto_lineage" flag can be set to automatically find the optimal lineage path.
|
||||
BUSCO will automatically download the requested dataset if it is not already present in the download folder.
|
||||
You can optionally provide a path to a local dataset instead of a name, e.g. path/to/dataset.
|
||||
Datasets can be downloaded using the busco/busco_download_datasets component.
|
||||
example: stramenopiles_odb10
|
||||
|
||||
- name: Outputs
|
||||
arguments:
|
||||
- name: --short_summary_json
|
||||
required: false
|
||||
direction: output
|
||||
type: file
|
||||
example: short_summary.json
|
||||
description: |
|
||||
Output file for short summary in JSON format.
|
||||
- name: --short_summary_txt
|
||||
required: false
|
||||
direction: output
|
||||
type: file
|
||||
example: short_summary.txt
|
||||
description: |
|
||||
Output file for short summary in TXT format.
|
||||
- name: --full_table
|
||||
required: false
|
||||
direction: output
|
||||
type: file
|
||||
example: full_table.tsv
|
||||
description: |
|
||||
Full table output in TSV format.
|
||||
- name: --missing_busco_list
|
||||
required: false
|
||||
direction: output
|
||||
type: file
|
||||
example: missing_busco_list.tsv
|
||||
description: |
|
||||
Missing list output in TSV format.
|
||||
- name: --output_dir
|
||||
required: false
|
||||
direction: output
|
||||
type: file
|
||||
example: output_dir/
|
||||
description: |
|
||||
The full output directory, if so desired.
|
||||
|
||||
- name: Resource and Run Settings
|
||||
arguments:
|
||||
- name: --force
|
||||
type: boolean_true
|
||||
description: |
|
||||
Force rewriting of existing files. Must be used when output files with the provided name already exist.
|
||||
- name: --quiet
|
||||
alternatives: ["-q"]
|
||||
type: boolean_true
|
||||
description: |
|
||||
Disable the info logs, displays only errors.
|
||||
- name: --restart
|
||||
alternatives: ["-r"]
|
||||
type: boolean_true
|
||||
description: |
|
||||
Continue a run that had already partially completed. Restarting skips calls to tools that have completed but performs all pre- and post-processing steps.
|
||||
- name: --tar
|
||||
type: boolean_true
|
||||
description: |
|
||||
Compress some subdirectories with many files to save space.
|
||||
|
||||
- name: Lineage Dataset Settings
|
||||
arguments:
|
||||
- name: --auto_lineage
|
||||
type: boolean_true
|
||||
description: |
|
||||
Run auto-lineage pipeline to automatically determine the BUSCO lineage dataset that is most closely related to the assembly or gene set being assessed.
|
||||
- name: --auto_lineage_euk
|
||||
type: boolean_true
|
||||
description: |
|
||||
Run auto-placement just on eukaryota tree to find optimal lineage path.
|
||||
- name: --auto_lineage_prok
|
||||
type: boolean_true
|
||||
description: |
|
||||
Run auto_lineage just on prokaryota trees to find optimum lineage path.
|
||||
- name: --datasets_version
|
||||
type: string
|
||||
required: false
|
||||
description: |
|
||||
Specify the version of BUSCO datasets
|
||||
example: odb10
|
||||
|
||||
- name: Augustus Settings
|
||||
arguments:
|
||||
- name: --augustus
|
||||
type: boolean_true
|
||||
description: |
|
||||
Use augustus gene predictor for eukaryote runs.
|
||||
- name: --augustus_parameters
|
||||
type: string
|
||||
required: false
|
||||
description: |
|
||||
Additional parameters to be passed to Augustus (see Augustus documentation: https://github.com/Gaius-Augustus/Augustus/blob/master/docs/RUNNING-AUGUSTUS.md).
|
||||
Parameters should be contained within a single string, without whitespace and separated by commas.
|
||||
example: "--PARAM1=VALUE1,--PARAM2=VALUE2"
|
||||
- name: --augustus_species
|
||||
type: string
|
||||
required: false
|
||||
description: |
|
||||
Specify the augustus species
|
||||
- name: --long
|
||||
type: boolean_true
|
||||
description: |
|
||||
Optimize Augustus self-training mode. This adds considerably to the run time, but can improve results for some non-model organisms.
|
||||
|
||||
- name: BBTools Settings
|
||||
arguments:
|
||||
- name: --contig_break
|
||||
type: integer
|
||||
required: false
|
||||
description: |
|
||||
Number of contiguous Ns to signify a break between contigs in BBTools analysis.
|
||||
- name: --limit
|
||||
type: integer
|
||||
required: false
|
||||
description: |
|
||||
Number of candidate regions (contig or transcript) from the BLAST output to consider per BUSCO.
|
||||
This option is only effective in pipelines using BLAST, i.e. the genome pipeline (see --augustus) or the prokaryota transcriptome pipeline.
|
||||
- name: --scaffold_composition
|
||||
type: boolean_true
|
||||
description: |
|
||||
Writes ACGTN content per scaffold to a file scaffold_composition.txt.
|
||||
|
||||
- name: BLAST Settings
|
||||
arguments:
|
||||
- name: --e_value
|
||||
type: double
|
||||
required: false
|
||||
description: |
|
||||
E-value cutoff for BLAST searches.
|
||||
|
||||
- name: Protein Gene Prediction settings
|
||||
arguments:
|
||||
- name: --miniprot
|
||||
type: boolean_true
|
||||
description: |
|
||||
Use Miniprot gene predictor.
|
||||
|
||||
- name: MetaEuk Settings
|
||||
arguments:
|
||||
- name: --metaeuk
|
||||
type: boolean_true
|
||||
description: |
|
||||
Use Metaeuk gene predictor.
|
||||
- name: --metaeuk_parameters
|
||||
type: string
|
||||
description: |
|
||||
Pass additional arguments to Metaeuk for the first run (see Metaeuk documentation https://github.com/soedinglab/metaeuk).
|
||||
All parameters should be contained within a single string with no white space, with each parameter separated by a comma.
|
||||
example: "--max-overlap=15,--min-exon-aa=15"
|
||||
- name: --metaeuk_rerun_parameters
|
||||
type: string
|
||||
description: |
|
||||
Pass additional arguments to Metaeuk for the second run (see Metaeuk documentation https://github.com/soedinglab/metaeuk).
|
||||
All parameters should be contained within a single string with no white space, with each parameter separated by a comma.
|
||||
example: "--max-overlap=15,--min-exon-aa=15"
|
||||
|
||||
resources:
|
||||
- type: bash_script
|
||||
path: script.sh
|
||||
test_resources:
|
||||
- type: bash_script
|
||||
path: test.sh
|
||||
- type: file
|
||||
path: test_data
|
||||
engines:
|
||||
- type: docker
|
||||
image: quay.io/biocontainers/busco:5.7.1--pyhdfd78af_0
|
||||
setup:
|
||||
- type: docker
|
||||
run: |
|
||||
busco --version | sed 's/BUSCO\s\(.*\)/busco: "\1"/' > /var/software_versions.txt
|
||||
runners:
|
||||
- type: executable
|
||||
- type: nextflow
|
||||
63
src/busco/busco_run/help.txt
Normal file
@@ -0,0 +1,63 @@
|
||||
```bash
|
||||
busco -h
|
||||
```
|
||||
|
||||
usage: busco -i [SEQUENCE_FILE] -l [LINEAGE] -o [OUTPUT_NAME] -m [MODE] [OTHER OPTIONS]
|
||||
|
||||
Welcome to BUSCO 5.7.1: the Benchmarking Universal Single-Copy Ortholog assessment tool.
|
||||
For more detailed usage information, please review the README file provided with this distribution and the BUSCO user guide. Visit this page https://gitlab.com/ezlab/busco#how-to-cite-busco to see how to cite BUSCO
|
||||
|
||||
optional arguments:
|
||||
-i SEQUENCE_FILE, --in SEQUENCE_FILE
|
||||
Input sequence file in FASTA format. Can be an assembled genome or transcriptome (DNA), or protein sequences from an annotated gene set. Also possible to use a path to a directory containing multiple input files.
|
||||
-o OUTPUT, --out OUTPUT
|
||||
Give your analysis run a recognisable short name. Output folders and files will be labelled with this name. The path to the output folder is set with --out_path.
|
||||
-m MODE, --mode MODE Specify which BUSCO analysis mode to run.
|
||||
There are three valid modes:
|
||||
- geno or genome, for genome assemblies (DNA)
|
||||
- tran or transcriptome, for transcriptome assemblies (DNA)
|
||||
- prot or proteins, for annotated gene sets (protein)
|
||||
-l LINEAGE, --lineage_dataset LINEAGE
|
||||
Specify the name of the BUSCO lineage to be used.
|
||||
--augustus Use augustus gene predictor for eukaryote runs
|
||||
--augustus_parameters "--PARAM1=VALUE1,--PARAM2=VALUE2"
|
||||
Pass additional arguments to Augustus. All arguments should be contained within a single string with no white space, with each argument separated by a comma.
|
||||
--augustus_species AUGUSTUS_SPECIES
|
||||
Specify a species for Augustus training.
|
||||
--auto-lineage Run auto-lineage to find optimum lineage path
|
||||
--auto-lineage-euk Run auto-placement just on eukaryote tree to find optimum lineage path
|
||||
--auto-lineage-prok Run auto-lineage just on non-eukaryote trees to find optimum lineage path
|
||||
-c N, --cpu N Specify the number (N=integer) of threads/cores to use.
|
||||
--config CONFIG_FILE Provide a config file
|
||||
--contig_break n Number of contiguous Ns to signify a break between contigs. Default is n=10.
|
||||
--datasets_version DATASETS_VERSION
|
||||
Specify the version of BUSCO datasets, e.g. odb10
|
||||
--download [dataset [dataset ...]]
|
||||
Download dataset. Possible values are a specific dataset name, "all", "prokaryota", "eukaryota", or "virus". If used together with other command line arguments, make sure to place this last.
|
||||
--download_base_url DOWNLOAD_BASE_URL
|
||||
Set the url to the remote BUSCO dataset location
|
||||
--download_path DOWNLOAD_PATH
|
||||
Specify local filepath for storing BUSCO dataset downloads
|
||||
-e N, --evalue N E-value cutoff for BLAST searches. Allowed formats, 0.001 or 1e-03 (Default: 1e-03)
|
||||
-f, --force Force rewriting of existing files. Must be used when output files with the provided name already exist.
|
||||
-h, --help Show this help message and exit
|
||||
--limit N How many candidate regions (contig or transcript) to consider per BUSCO (default: 3)
|
||||
--list-datasets Print the list of available BUSCO datasets
|
||||
--long Optimization Augustus self-training mode (Default: Off); adds considerably to the run time, but can improve results for some non-model organisms
|
||||
--metaeuk Use Metaeuk gene predictor
|
||||
--metaeuk_parameters "--PARAM1=VALUE1,--PARAM2=VALUE2"
|
||||
Pass additional arguments to Metaeuk for the first run. All arguments should be contained within a single string with no white space, with each argument separated by a comma.
|
||||
--metaeuk_rerun_parameters "--PARAM1=VALUE1,--PARAM2=VALUE2"
|
||||
Pass additional arguments to Metaeuk for the second run. All arguments should be contained within a single string with no white space, with each argument separated by a comma.
|
||||
--miniprot Use Miniprot gene predictor
|
||||
--skip_bbtools Skip BBTools for assembly statistics
|
||||
--offline To indicate that BUSCO cannot attempt to download files
|
||||
--opt-out-run-stats Opt out of data collection. Information on the data collected is available in the user guide.
|
||||
--out_path OUTPUT_PATH
|
||||
Optional location for results folder, excluding results folder name. Default is current working directory.
|
||||
-q, --quiet Disable the info logs, displays only errors
|
||||
-r, --restart Continue a run that had already partially completed.
|
||||
--scaffold_composition
|
||||
Writes ACGTN content per scaffold to a file scaffold_composition.txt
|
||||
--tar Compress some subdirectories with many files to save space
|
||||
-v, --version Show this version and exit
|
||||
72
src/busco/busco_run/script.sh
Normal file
@@ -0,0 +1,72 @@
#!/bin/bash

## VIASH START
## VIASH END

# Boolean flags arrive as the string "false" when not set; unset them so
# the ${var:+...} expansions below produce nothing.
[[ "$par_tar" == "false" ]] && unset par_tar
[[ "$par_force" == "false" ]] && unset par_force
[[ "$par_quiet" == "false" ]] && unset par_quiet
[[ "$par_restart" == "false" ]] && unset par_restart
[[ "$par_auto_lineage" == "false" ]] && unset par_auto_lineage
[[ "$par_auto_lineage_euk" == "false" ]] && unset par_auto_lineage_euk
[[ "$par_auto_lineage_prok" == "false" ]] && unset par_auto_lineage_prok
[[ "$par_augustus" == "false" ]] && unset par_augustus
[[ "$par_long" == "false" ]] && unset par_long
[[ "$par_scaffold_composition" == "false" ]] && unset par_scaffold_composition
[[ "$par_miniprot" == "false" ]] && unset par_miniprot
[[ "$par_metaeuk" == "false" ]] && unset par_metaeuk

tmp_dir=$(mktemp -d -p "$meta_temp_dir" busco_XXXXXXXXX)
prefix=$(openssl rand -hex 8)

busco \
  --in "$par_input" \
  --mode "$par_mode" \
  --out "$prefix" \
  --out_path "$tmp_dir" \
  --opt-out-run-stats \
  ${meta_cpus:+--cpu "${meta_cpus}"} \
  ${par_lineage_dataset:+--lineage_dataset "$par_lineage_dataset"} \
  ${par_augustus:+--augustus} \
  ${par_augustus_parameters:+--augustus_parameters "$par_augustus_parameters"} \
  ${par_augustus_species:+--augustus_species "$par_augustus_species"} \
  ${par_auto_lineage:+--auto-lineage} \
  ${par_auto_lineage_euk:+--auto-lineage-euk} \
  ${par_auto_lineage_prok:+--auto-lineage-prok} \
  ${par_contig_break:+--contig_break "$par_contig_break"} \
  ${par_datasets_version:+--datasets_version "$par_datasets_version"} \
  ${par_e_value:+--evalue "$par_e_value"} \
  ${par_force:+--force} \
  ${par_limit:+--limit "$par_limit"} \
  ${par_long:+--long} \
  ${par_metaeuk:+--metaeuk} \
  ${par_metaeuk_parameters:+--metaeuk_parameters "$par_metaeuk_parameters"} \
  ${par_metaeuk_rerun_parameters:+--metaeuk_rerun_parameters "$par_metaeuk_rerun_parameters"} \
  ${par_miniprot:+--miniprot} \
  ${par_quiet:+--quiet} \
  ${par_restart:+--restart} \
  ${par_scaffold_composition:+--scaffold_composition} \
  ${par_tar:+--tar}

out_dir=$(find "$tmp_dir/$prefix" -maxdepth 1 -name 'run_*')

if [[ -n "$par_short_summary_json" ]]; then
  cp "$out_dir/short_summary.json" "$par_short_summary_json"
fi
if [[ -n "$par_short_summary_txt" ]]; then
  cp "$out_dir/short_summary.txt" "$par_short_summary_txt"
fi
if [[ -n "$par_full_table" ]]; then
  cp "$out_dir/full_table.tsv" "$par_full_table"
fi
if [[ -n "$par_missing_busco_list" ]]; then
  cp "$out_dir/missing_busco_list.tsv" "$par_missing_busco_list"
fi
if [[ -n "$par_output_dir" ]]; then
  if [[ -d "$par_output_dir" ]]; then
    rm -r "$par_output_dir"
  fi
  cp -r -L "$out_dir" "$par_output_dir"
fi
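The last part of the script above runs BUSCO into a scratch directory, locates the `run_*` results folder, and copies out only the files whose output argument was supplied. A minimal sketch of that copy step follows, using a simulated results folder instead of a real BUSCO run. The directory layout, the `demo` prefix, and the file contents are assumptions made purely for illustration.

```bash
#!/usr/bin/env bash
set -eo pipefail

# Simulate the assumed layout: <out_path>/<name>/run_<lineage>/
tmp_dir="$(mktemp -d)"
trap 'rm -rf "$tmp_dir"' EXIT
prefix="demo"
mkdir -p "$tmp_dir/$prefix/run_stramenopiles_odb10"
echo "C:100%" > "$tmp_dir/$prefix/run_stramenopiles_odb10/short_summary.txt"

# Locate the run_* directory directly under the prefix folder.
out_dir="$(find "$tmp_dir/$prefix" -maxdepth 1 -name 'run_*')"

# Hypothetical argument values: one output requested, one not.
par_short_summary_txt="$tmp_dir/summary_copy.txt"
par_full_table=""

# Copy each result only when its destination path is non-empty.
if [[ -n "$par_short_summary_txt" ]]; then
  cp "$out_dir/short_summary.txt" "$par_short_summary_txt"
fi
if [[ -n "$par_full_table" ]]; then
  cp "$out_dir/full_table.tsv" "$par_full_table"
fi

copied="$(cat "$par_short_summary_txt")"
echo "copied summary: $copied"
```

Because the destination checks use `-n`, an output argument that was never passed results in no copy at all instead of an error about a missing target path.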
88
src/busco/busco_run/test.sh
Normal file
@@ -0,0 +1,88 @@
|
||||
test_dir="$meta_resources_dir/test_data"
|
||||
|
||||
mkdir "run_prot_stramenopiles"
|
||||
cd "run_prot_stramenopiles"
|
||||
|
||||
echo "> Running busco with lineage dataset"
|
||||
|
||||
"$meta_executable" \
|
||||
--input "$test_dir/protein.fasta" \
|
||||
--mode proteins \
|
||||
--lineage_dataset stramenopiles_odb10 \
|
||||
--output_dir output \
|
||||
--short_summary_json short_summary.json \
|
||||
--short_summary_txt short_summary.txt \
|
||||
--full_table full_table.tsv \
|
||||
--missing_busco_list missing_busco_list.tsv
|
||||
|
||||
echo ">> Checking output"
|
||||
[ ! -f "output/full_table.tsv" ] && echo "full_table.tsv does not exist" && exit 1
|
||||
[ ! -f "output/missing_busco_list.tsv" ] && echo "missing_busco_list.tsv does not exist" && exit 1
|
||||
[ ! -f "output/short_summary.json" ] && echo "short_summary.json does not exist" && exit 1
|
||||
[ ! -f "output/short_summary.txt" ] && echo "short_summary.txt does not exist" && exit 1
|
||||
[ ! -f "full_table.tsv" ] && echo "full_table.tsv does not exist" && exit 1
|
||||
[ ! -f "missing_busco_list.tsv" ] && echo "missing_busco_list.tsv does not exist" && exit 1
|
||||
[ ! -f "short_summary.json" ] && echo "short_summary.json does not exist" && exit 1
|
||||
[ ! -f "short_summary.txt" ] && echo "short_summary.txt does not exist" && exit 1
|
||||
|
||||
echo ">> Checking if output is empty"
|
||||
[ ! -s "output/full_table.tsv" ] && echo "full_table.tsv is empty" && exit 1
|
||||
[ ! -s "output/missing_busco_list.tsv" ] && echo "missing_busco_list.tsv is empty" && exit 1
|
||||
[ ! -s "output/short_summary.json" ] && echo "short_summary.json is empty" && exit 1
|
||||
[ ! -s "output/short_summary.txt" ] && echo "short_summary.txt is empty" && exit 1
|
||||
[ ! -s "full_table.tsv" ] && echo "full_table.tsv is empty" && exit 1
|
||||
[ ! -s "missing_busco_list.tsv" ] && echo "missing_busco_list.tsv is empty" && exit 1
|
||||
[ ! -s "short_summary.json" ] && echo "short_summary.json is empty" && exit 1
|
||||
[ ! -s "short_summary.txt" ] && echo "short_summary.txt is empty" && exit 1
|
||||
|
||||
cd ..
|
||||
mkdir "run_prot_autolineage"
|
||||
cd "run_prot_autolineage"
|
||||
|
||||
echo "> Running busco with auto lineage"
|
||||
|
||||
"$meta_executable" \
|
||||
--input "$test_dir/protein.fasta" \
|
||||
--mode proteins \
|
||||
--auto_lineage \
|
||||
--output_dir output
|
||||
|
||||
echo ">> Checking output"
|
||||
[ ! -f "output/full_table.tsv" ] && echo "full_table.tsv does not exist in output folder" && exit 1
|
||||
[ ! -f "output/missing_busco_list.tsv" ] && echo "missing_busco_list.tsv does not exist in output folder" && exit 1
|
||||
[ ! -f "output/short_summary.json" ] && echo "short_summary.json does not exist in output folder" && exit 1
|
||||
[ ! -f "output/short_summary.txt" ] && echo "short_summary.txt does not exist in output folder" && exit 1
|
||||
|
||||
echo ">> Checking if output is empty"
|
||||
[ ! -s "output/full_table.tsv" ] && echo "full_table.tsv in output folder is empty" && exit 1
|
||||
[ ! -s "output/missing_busco_list.tsv" ] && echo "missing_busco_list.tsv in output folder is empty" && exit 1
|
||||
[ ! -s "output/short_summary.json" ] && echo "short_summary.json in output folder is empty" && exit 1
|
||||
[ ! -s "output/short_summary.txt" ] && echo "short_summary.txt in output folder is empty" && exit 1
|
||||
|
||||
rm -r output/
|
||||
|
||||
cd ..
|
||||
mkdir "run_genome"
|
||||
cd "run_genome"
|
||||
|
||||
echo "> Running busco with genome data"
|
||||
|
||||
"$meta_executable" \
|
||||
--input "$test_dir/genome.fna" \
|
||||
--mode genome \
|
||||
--lineage_dataset saccharomycetes_odb10 \
|
||||
--output_dir output
|
||||
|
||||
echo ">> Checking output"
|
||||
[ ! -f "output/full_table.tsv" ] && echo "full_table.tsv does not exist in output folder" && exit 1
|
||||
[ ! -f "output/missing_busco_list.tsv" ] && echo "missing_busco_list.tsv does not exist in output folder" && exit 1
|
||||
[ ! -f "output/short_summary.json" ] && echo "short_summary.json does not exist in output folder" && exit 1
|
||||
[ ! -f "output/short_summary.txt" ] && echo "short_summary.txt does not exist in output folder" && exit 1
|
||||
|
||||
echo ">> Checking if output is empty"
|
||||
[ ! -s "output/full_table.tsv" ] && echo "full_table.tsv in output folder is empty" && exit 1
|
||||
[ ! -s "output/missing_busco_list.tsv" ] && echo "missing_busco_list.tsv in output folder is empty" && exit 1
|
||||
[ ! -s "output/short_summary.json" ] && echo "short_summary.json in output folder is empty" && exit 1
|
||||
[ ! -s "output/short_summary.txt" ] && echo "short_summary.txt in output folder is empty" && exit 1
|
||||
|
||||
rm -r output/
|
||||
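The test above repeats the same two checks (file exists, file is non-empty) for every output. One way to express that pair once is a small helper function. The sketch below is only an illustration: `assert_file_not_empty` is a hypothetical name that does not appear in the repository, and the demo files are created in a scratch directory.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the file exists and is non-empty,
# otherwise print a message and return non-zero.
assert_file_not_empty() {
  local f="$1"
  if [ ! -f "$f" ]; then
    echo "$f does not exist"
    return 1
  fi
  if [ ! -s "$f" ]; then
    echo "$f is empty"
    return 1
  fi
}

# Demo on a scratch directory with one populated and one empty file.
demo_dir="$(mktemp -d)"
echo "data" > "$demo_dir/full_table.tsv"
: > "$demo_dir/short_summary.txt"

ok_status=0
assert_file_not_empty "$demo_dir/full_table.tsv" || ok_status=$?

empty_status=0
msg_empty="$(assert_file_not_empty "$demo_dir/short_summary.txt")" || empty_status=$?

missing_status=0
msg_missing="$(assert_file_not_empty "$demo_dir/nope.tsv")" || missing_status=$?

echo "$ok_status $empty_status $missing_status"
rm -r "$demo_dir"
```

In a test script the helper would typically be called as `assert_file_not_empty "output/full_table.tsv" || exit 1`, which keeps the behaviour of the original one-line checks.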
10000
src/busco/busco_run/test_data/genome.fna
Normal file
File diff suppressed because it is too large
Load Diff
64
src/busco/busco_run/test_data/protein.fasta
Normal file
@@ -0,0 +1,64 @@
|
||||
>341721at2759_1001832_1:000010
|
||||
MASRPVKKRKLTPPGDDEASSRKSGGKIQKAFLKNAANWDLEQDYETRARKGKKKEKESTRLPLKLPGGRVQHVSAPDNDFQAIESDEDWLDGAEDVSEDEESKDKKAPEEPEKPEHEQILEAKEELAKIALMLNESPDENTGAFKALAKIGQSRIITIKKLALATQLTVYKDVIPGYRIRPVAEDGPEEKLSKDVRKLRTYETCLISGYQAYVKELTKHAKTGHANGLASVAITCACNLLTAVPHFNFRSDLVKILVGKLSTRRVDDDFNKCLQALETLFEEDEEGRPSMEAVSLLSKMMKAREYQVNESVVNLFLHLRLLSDFSGKGSKDSVDRMDDGPSKKPKSKREFRTKRERKQIKEQKALQKDMAQADALVQHEERDRMEGETLKLVFGTYFRVLKMRVPHLMGAVLEGLSKYAHLINQNFFGDLLEALKDLIRHSDASEKDDAEEKEDEEADDDAPVRNPSREALLCTTTAFALLAGQDAHNARADLHLDLSFFTTHLYQSLFPLSLHPDLELGARSLHLPDPDKPSQNRKSNSSNKVNLQTTTVLLIRCLTAVLLPPWNVRSVPPVRLAAFAKQLMTAALHVPEKSAQALLALLADVAGTHGRRIAALWNTEERKGDGAFNPLAESAEASNPFAATVWEGEILRRHYCPAVRRGVGIVEKSLSLAER
|
||||
>296129at2759_1069680_1:000010
|
||||
MMKKKQIDSRIPTLIKNGVQEKKRTLFVIVGDRGRDQIVNLHWLLSQTRIASRPSVLWMYKKDLLGFTSHRKKREAKIKKEIKKGIRDPNEATTPFELFISVTNIRYTYYKESEKILGQTFGMLVLQDFEAITPNLLARTIETVEGGGIIVILFKTMENLKQLYTMTMDIHSRYRTEAHQDVVARFNGRFILSLGHCSSCLFVDDELNVLPISEAKKVKPLPKPQLEEPKKELEELKQKYEDKQLLRSLIDVAKTVDQARALITFVEAISEKTLRSTVALTAARGRGKSAALGLAISAAVAYGYSNIFITSPNPENLKTLFEFTFKGFNSLKYEEHIDYDIIQSLNPSFNKSIVRVNIFRNHRQTIQYIHPSDAYVLGQAELLVIDEAAAIPMPLVKKLLGPYLTFMASTVNGYEGTGRSLSLKLIQQLREQSRGFAHENTKSGNSEKSMINRSEKLNKESGINSIGGRKLREITLEEPIRYSYGDPVEEWLNKLLCLDINISLKQFLEQGCPHPSQCELYYVNRDTLFSYHPVSESFLQMMMSLYVASHYKNSPNDLQLMADAPAHQLFVLLPPVKEDDNKLPEPLCVIQVALEGEISRESVVNNLTRGYRTGGDLIPWVITEQFQDDKFASLSGARIVRIATNPEYIRMGYGSHALKLLENFYEGKYLNLSEETISESNENIKIINNNLESSLLTDDIKIKDLKIMPPLLLKLSEKKPGLIHYLGVSYGLTPQLYKFWKRAEFIPVYLRQTPNDLTGEHTCLMLKLLQDKSETWLNEFSNDFRKRFLSLLSFSFRSFPTILCLNIIESINNDLIQKDNVHVITKSEIDINLSPFDLKRLESYANNMLDYHTIIDMLPYIADLYFKGRFGKDLKMTGVQSAILLALGLQKRLLEDIEKELNLPSNQVLAMLVKILRKLSSFFKDIYYKAIDNTLPIERKNLKNQLQTHADENDNFRGFIPLKATLKEELDHLSSEMEDSIKEKQRELINSLDLQKYIIKGQEEDWDKAEQHIKNGIYSGKSSVVSIQSHSLKREHESLTDIPHIKKKHQKKHKRKV
|
||||
>1217666at2759_1073089_1:000010
|
||||
MPINQPSNQIKFTNVSVVRLKKGKKRFELACYKNKLLEYRSGAEKDLDNVLQVPTIFLSVSKAQTAPSAELTKAFGANIPADEIRQEILRKGEVQVGERERKEISERVEKELLDIVSGRLVDPTTKRVYTPGMISKALDQLSSASGQMQQTQGEGSGATDEKGAAQPRKPMWTGVAPNKSAKSQALDAMKALIAWQPIPVMRARMRLRVTCPVSILKHSVKAPSGGGASKEKEAPSGNSKSNKGKKGPKSRAARQQDSDAEDGKSDAEAAPKTPSNVKDKILGYIESIESQEVIGGDEWEVVGFAEPGAYKGLNEFVGNETRGRGRVEVLDMTVTHEE
|
||||
>513979at2759_1159556_1:000010
|
||||
MAVVDIQARFSPHHPLEPDLLYEIQSILRLHGLSVDDLFFKWDAYCIRMDLDAQAALSLANVRSLKQSIQDDLEKSHRSTTQVRSERKVAAAPKAVSGGDVYGMLDGLVPSTPAAGGKRSRGVAAGGGGSGLKKKMDSLKMNSSPAGMKEQLSAFNGLPATSFAERANAGDVVEILNAQLPPCEAPLAPFPEPRIKLTAASDQKKMAYKPLAVKLSEASEVLDDRIDEFAALVQDYHGLEDSAFGSAASQGTTEVVAVGRIASDAMEGKLNAAALVLETSRRTGMGLRVPLKMHKVPSWSFFPGQVVALRGTNATGGEFVVEQVLDVPLLPSAASTPSALEAHRARMSGVPPGGGAAAATTDSDAAAPAPAPAPLTILYAAGPYTADDNLDYEPLHALCSQAADALADALVLAGPFLDIDHPLVAAGDFDLPPEDEAALDPDTATMSAVFRHLVAPALNRACAANPHLTVVLVPSVRDVLARHVSWPQDAIARKELGLAKAARIVSNPMTLSMNEVVVGVSSQDVLHELRNEECSRACPPGDLMGRLCRYLVEQRHYFPLFPPTDRARLPRTGTQSGLATGAVLDPSYLRLGEMVNVRPDVMVVPSSLPPFAKASSVVESVLAINPGPLSKRKGAGTFARMTLHAPPVGGGSEMTSHRVFDRARVEIVRI
|
||||
>543764at2759_1165861_1:000010
|
||||
MALGRAARPVGWTDCCAAVEKKPNYKSGMTQPARTITAGDNLLLKLPSGQTRTIKNVTSDSSISLGKFGKFQTNELIDQPFGLTFDILEDGKLVRNEQINLALELNPMLDELNSFESIKGMANGISNVEDIEATNEMIKESDGAQKLTNVEIEELKKSGLSGREIILRQIQQHSAFELKSEFSKAKYIKRKEKKFLKMFTCIDPTIHNMSQYLFENHNFAIKGLRPDTLSQMLSLSNVRPGWKGIVVDDIGGLLVAAVLIRMGGEGTIFVLNNADSPPDLHLLELFNLPKSVLGPLKSLNWAQTEADWTTSDIEELLLLHRDPPQPLPILDSTLPDPQLKQLSQRTKKQPNNRSKSMRKFERVQELLSMRQEFLDTQFEGLLTCSEYEPESIVTKLVNKLSGSSTIVIYSCHLRPLSDLQTLLKKSSMPSTSSSSLGGSSSLVEQNELTKRMKENKTEFIQITISEPWLRAYQVLVGRTHPEMAGTHHGGFVFSAIKVFNSCS
|
||||
>1558822at2759_1266660_1:000010
|
||||
MSIAEILPLEIIDKTVGQPVLVMLTSHREFSGTLVGYDDFVNVVLEEVVEYDHDQEIKRHAGKMLLSGNNIAMLVPGGKRVQ
|
||||
>1287094at2759_1291522_0:000010
|
||||
MGNILVKKNRVTITEADRAILTLRTQRRKMEEHRRRVEALMERETTVARTLVAKQQRPAALLALKKKRLHETQLEGLDNCLLTLEETLTQVESAQRTARLMAALKQGADVLSALQRAMPLESVEQLMEQGAESREYEMRLQALLGESLGEDQSAAAERELDEMEAQLIEEDVLDLPKVPSHAVARPASARAIGQAASERQLEPEIAA
|
||||
>83779at2759_1296121_1:000010
|
||||
MCGLTLTIRPLSLSLSSPSVSDCSSSDSTEDADLALLDSFRSTNAQRGPDSQRTFKHTVTLDDDDNGVTTTTTTKSTTKSKVEICLTATVLGLRGDLTAQPLVGNRGVLGWNGQVFEGIDIGTEENDTRKIFERLEKGERVEDVLSGVEGPFAFIYLDLENDILHYQLDPLSRRSLLIHPAEVAVDSNPSVTRHFILSSSRSTLAREHGVDMRALLGGEGGTIDLRRIKVVQNQGFLTMDMSDALKHRHTLSPDQDASCSSSSGSWTKVAPINTALPPDNLPLDNPKIKEEVPKFIEQLKESVKRRVENIPNPEKGCSRVAVLFSGGIDCTFLAYLIHLCLPPEDPIDLINVAFSPAPKLSSLSSNGADKGKGKSPALPAAPTYDVPDRLSGRDALVELKQVCPDREWRFVEIDVPYDEARAHRQNVLDLMYPSSTEMDHSLALPLYFASRGYGSVRKEGSNHSEPYRVKAKVYISGLGADEQLGGYARHRHAYQREGWQGLISETQMDIARLPTRNLSRDDRMLSSHARDARYPYLSLSFISYLSSLPVHLKCDPRLGEGQGDKILLRKAVESVGLVRASGRVKRAMQFGTRSSKLGGRGSGVKGPKAGERQVE
|
||||
>1057950at2759_1314783_1:000010
|
||||
MSSRQATHADSWYVGDGRRLDSELSKNLAAVEGDANYSPPIKGCKAVIAPHAGYSYSGRAAAWAYKSIDTTGIKRIFILGPSHHVYLDGCALSKCEKYETPLGELPIDLDTVKELRATGEFQDMDIQTDEDEHSIEMHLPYVRKVFEGLDIAIVPILIGAINLNKENKFGTVLAPYLAKDDTFFVISSDFCHWGTRFQYTFYYPRPPPTSTPAIRLSKADPNPSTLATHPIHASISAIDHEAMDLMTMPPQTAQQAHIDFAEYLRTTKNTICGRHPIGVLLGALAVLQSQGRVPHLKFVRYEQSSQCQTVRDSSVSYASAYITV
|
||||
>453044at2759_1330018_1:000010
|
||||
MPAAPQDPFFKSIGSAAADTEALREQPDEQDEQETDLEPIDEDRPLQEVESLCMSCGEQGVTRMLLTSIPYFREVIVMSFRCEHCGNQNNEIQSASTIREHGAMYTVKILNQGDLNRQLVKSEAATVTIPEFELTIPPLRGQLTTVEGTLRDTIQDLAADQPLRRIQDPPTFDKIEALLAKLKEVVPDDEDEAAPTMKERHPEDPVRPFTVILDDPTGNSFIEFSGSMSDPKWSLREYARSMDQNITLGLSQPEDEEKEKVTQKGGPFTEEDEDGLPAEEVFIFPGICSSCGHPVDTRMKKVNIPYFKDIIIMSTNCSACGYRDNEVKSGGAISDKGKRITLKVEDAEDLSRDILKSETCGLEIPEIDLALHAGTLGGRFTTVEGILTQVYDELSEKVFRGDSVGSANSKDNQEFETFLGSMKEVMTAARPFTLILDDPLANSYLQNLYAPDPDPNMEIVTYDRTFDQNEDLGLNDMKVEGYEAPS
|
||||
>1323575at2759_1392248_1:000010
|
||||
MSQPQPPPLRYIRYEPSREDEYVAAMRQLISKDLSEPYSIYVYRYFLYQWGDLCFMTVDDSRPEDPIVGVVVSKLEPHRGGPMRGYIAMLAVREEYRGRGIATKLVRMAIDAMIARDADEIALETEITNTAAMKLYERLGFLRSKRLHRYYLNGNSAYRLVLYLKEGVGNMRTSFDPYAAPAEARPEMSGAAAVPAAPAPPPLLQGNGR
|
||||
>160593at2759_139723_0:000010
|
||||
MADAELAKALKDLPNRVLNVPVEERPELFQNVIAVLPNPGINATIVRGICKVIGTTLTKYKDPESQTLVKELLVAVLKQHPDLTYEHFNAVLKALLAKDLAGAPPIKAAQASALALGWANLIALHADHETAVGKKEFPKLLEVQAGLYQLSLTSGIQKISDKAYSFLRDFFASDESLAQRYFDKLLAMEPSSGVIVMLCTIVRYLHQEQGTVELLDQHKPKLLDHLVKGLITVKTKPHASDIVACSILLKAITKDELRTIIVPALQRSMLRSAEVILRAVGAIVNEIELDVSDYALDLGKPLVQNLASKEETVRQEAVESLKQVALKCGTPNAIETLLKEVFAVLNGSGGKITVAELRINLLQGAGNLSYNKIPSQKIQTILPAACDHFTKVIEAEIQEKVVCHALEMFGLWTVNHRGEIPAKIVQLFKKGLDAKAQTIRTSYLQWFLSCLHDGKLPNGIDFTTTLSKIVERAAQSPTQTPVVSEGVGAACILLLTNPSVSEKLKDFWNIVLDTNKSPFLSERFLSTTNAETRCYVMVICEQLLIKHRNELKGSSTTDPLIRAATVCVMSAQAKVRRYCLPLVTKIVNSEDGVSLAKFLLAELTRYVECTKILSEGEPAEEGIAPAQALVDAVCTVCNVEKVANPDAQSLALSALLCSHHPAAVSVRGDLWESILERYGLYGKQFIALNTAQIEEVFFNSYKATAMYENTLATLSRISPELILSVLVKNVTDQLNNSRMSNVTDEEYFTYLTPDGELYDKSVIPNTDEQVQTAHLKRENKAYSYKEQLEELQLRRELEEKRRKEGKWKPPQLTPKQKEVIDKQREKENAIKARLQALHDTITTLISQIEGAAKGTPKQLPLFFPALLPAILRVFSSPLAAPAMVKLYYRLKDICFGEERVELGRDIAIATIRLSKPHCDLEESWCTANLVELVSDILVALYDETIDMYNVHREEEASKRYLLDAPAFSYTFEFLKRALTLPEAKKDESLLINGVQIIAYHAQLKGDTVDGKDLGDVYHPLYMPRLEMIRLLLRLIQQHRGRVQTQAVAALLDVAESCSGREYTTRAEQREIEALLVALQEELDAVRDVALRALAIMIDVLPSIADDYEFGLRLTRRLWVAKHDLSADIKQLATGIWQDGAYEVPIVMADELMKDIIHPELCVQKAAAAALVSILVEDSSTIDGVVEQLLEIYREKVVMIPAKLDQFDREVEPAIDPWGPRRGVAITLGSISPFLTPELVKSVIQFMVRSGLRDRQEIVHKEMLAASLAIVEHHGKDSVTYLLPTFEYFLDKAPSKGAYDNIRQAVVILMGSLARHLDREDERIQPIIDRLLAALETPSQQVQEAVANCIPHLIPSVKDKAPEIVKKLLQQLVKSEKYGVRRGAAYGIAGVVKGLGILSLKQLDIMSKLTHYIQDKKNYKSREGALFAFEMLCSTLGRLFEPYIVHVLPHLLQCFGDSSVYVRQAADECAKTVMAKLSAHGVKLVLPSLLNALDEDSWRTKTASVELLGSMAFCAPKQLSSCLPSIVPKLMEVLGDSHIKVQEAGANALRVIGSVIKNPEIQAIVPVLLTALEDPSSKTSACLQSLLETKFVHFIDAPSLALIMPVVQRAFMDRSTETRKMAAQIIGNMYSLTDQKDLTPYLPNIIPGLKTSLLDPVPEVRGVSARALGAMVRGMGESSFEDLLPWLMQTLTSESSSVDRSGAAQGLSEVVGGLGVEKLHKLMPEIIATAERTDIAPHVKDGYIMMFIYMPSAFPNDFTPYIGQIINPILKALADENEYVRDTALKAGQRIVNLYAESAITLLLPELEKGLFDDNWRIRYSSVQLLGDLLYKISGVSGKMTTQTASEDDNFGTEQSHKAIIRSLGADRRNRVLAGLYMGRSDVSLMVRQAALHVWKVVVTNTPRTLREILPTLFSLLLGCLASTSYDKRQVAARTLGDLVRKLGERVLPEIIPILERGLSSDQADQRQGVCIGLSEIMASTSRDMVLTFVNSLVPTVRKAL
ADPLPEVRHAAAKTFDSLHTTVGARALEDILPSMLESLADPDPDVAEWTLDGLRQVMAIKSRVVLPYLIPQLTAKPVNTKALSILASVAGEALTKYLPKILPALLAALAAAQGTPEEVQELEYCQAVILSVSDEVGIRTIMDTVMESTKSEIPETRRAAATLLCAFCTHSPGDYSQYVPQLLRGLLWLLSDGDREVLQRSWDALNAVTKTLDSAQQIAHVTDVRQAVKFASSDLPKGGELPGFCLPKGITPLLPVFREAILNGLPEEKENAAQGLGEVIKLTSPASLQPSVVHITGPLIRILGDRFNAGVKAAVLETLAILLHKVGIMLKQFLPQLQTTFLKALHDPSRTVRIKAGHALAELIVIHTRPDPLFVEMHNGIKSADDSAVRETMLQALRGIVTPAGDKMTEPLRKQIYATLAGMLAHPEDVSRAAAAGCFGALCRWLTPEQVDDALTSHMLNEDYGDDATLRHGRTAALFVALKEHPGGIVTTKYEPKICKVITGALVSDKISVAMNGVRAGGYLLQYGMTDGTAKLSTAVIGPFVKSMNHSSNEVKQLLAKTCTYLARVVPAERIAPEYLKLAIPMLVNGTKEKNGYVRSNSEIALVHVLRLRDGEEFHQRCITLLEPGARESLSEVVSKVLRKVAMQAVGKEEELDDTILT
|
||||
>1346432at2759_1447883_1:000010
|
||||
MSSMRNAVQRRVHRERAQPANREKWGILEKHKDYSLRARDYSVKKAKLQRLREKADTRNPDEFAFGMMSGKSRTQGKHGARDTESAALSLETVKLLKTQDAGYLRVVGERIRRQMMAVDEEVRVQEGISGVSANGAAAGGGGGGGRKVVFVDSVEEQRERALEDEGKSDDDEEQGDFDEVDEEEQRQQKTQPKSKKQLEAEKLAQKEMLKARKLKIKAAEARSKKLQALTDQHKNIVAAEQELDWQRGKMENSVGGVNKHGLRWKVRERKR
|
||||
>761109at2759_198730_1:000010
|
||||
MAMTFTEDSIKELRLRLEDAVVKCSERCLYQSAKWAAEMLNSLVSTDGNDTDAESPMETDLQPTVNPFSLQSDPTEATLELQEAHKYLLAKSYFDTREYDRCAAVFLPPTIPPVPLSTVSPNVKSRASLTPQKGKRKSFIRPGLKSGQALPRNPYPNLSQKSLFLALYAKYLAGEKRRDEETEMVLGPADGGMTVNRELPDLARGLEGWFEERRERGLQDQGQGWLEYLYAVILIKGKNEEEAKKWLVRSVHLFPFHWGAWQELNDLLPSVDDLKQVAETLPQNIMSFIFQVHCSQELYQATDETHQTLNGLESIFPTSAFLKTERALLYYHSRDFEDASAIFADILIDSPHRLDSLDHYSNILYVMGARPQLAFVAQLATATDKFRPETCCVVGNYYSLKSEHEKAVMYFRRALTLDRNFLSAWTLMGHEYIEMKNTHAAIESYRRAVDVNRKDYRAWYGLGQAYEVLDMCFYALYYYQRTAALKPYDPKMWQAVGTCYAKMNQIPQSIKAMKRALVAGAYYEQRADAATADHPAAGRKILDPDLLHQIALLYEKMNNEDEAAAYMELTLQQESGEIERTETDSDDDDGDDNSDDGTTQRRSRRQRRRQKSRDDDNEIEAVGGTGVTATTSKARLWLARWALKHGDLNRADQLAGELCQDGVEVEEAKALMRDVRARREGGGG
|
||||
>1617752at2759_2004952_1:000010
|
||||
MPSSFVTPGQQRYLRACMVCSIVMTYSRFRDEGCPNCDEFLHLAGSQDQIESCTSQVFEGLITLANPAKSWIAKWQRLDGYVGGVYAIKVSGQLPDEIRTTLEDEYRIQYIPRDGTQTEADA
|
||||
>1588798at2759_215358_0:000010
|
||||
MTLPPTQQEPHTPEAFSLFVSFNHREPQNDDVMADLGIKAGDKVMMVWTQPSAPEGLKQHAEELAAIVGADGKVSVENLERLLLSSHSASSFDCVLSCLLADSSPVHTSETLEELARVLKPGGKLVLDEAVTGAETSQVRTAEKLISALKLSGFMSVTEVSKAELTAEALSALRTATGYQGNTLSRVRVSASKPNFEVGSSSQIKLSFGKKTPKPAEKPALDPNTVKMWTLSANDMGDDDVDLVDSDALLDEEDLKKPDPASLKVSCRDSGKKKACKNCSCGLAEELEQESTGKQKTNLPKSACGSCYLGDAFRCASCPYAGMPAFKPGEKIVLDKKTLTDA
|
||||
>1275837at2759_28005_1:000010
|
||||
MSSRDKASPSSPKETKGEHHLNEESDNDNNERRDEQQVTASAYLPSASRVDVHPLVLLSLVDHFARMNTKVRQKKRVVGLLLGRYKTDAAGTQVLDINNSFAVPFDEDPHNSDVWFFDTNYAEEMFVMHRRVHPKTKIVGWYASGPTVQQNDMLLHLLVADRFCANPVYCVVNTDPSHKGVPVLAYTTVQGREGARSLEFRNIPTHVGAEEAEEIGVEHLLRDLTDSTVTTLSSQLEERERSLEHMARVLVQIEEYLSDVASGALPASEDVLEALQELISLQPETYLKKKSLELNRFTNDRTIATFLGSIARCIGGLHEVILNRRVLARELKEIKARRAEAEEQRMDNEKNKIAEASPERKQ
|
||||
>1264469at2759_29058_0:000010
|
||||
MRPPLAIVRTYCTTAAPKSSNFIDEMKRNFIATNTFQKTLLSCGSAAISLLNPHRGDMIACLGEVTGESAIKYMRQKMTETEEGTEILKEKPRINSGTVSFDKLSQMPDNTLGRVYADFMTENNITADSRLPVQFIEDPELAYVMQRYREVHDLVHATLFMRTSMLGEVTVKWVEGIQTRLPMCISGGIWGAARLKPKHRQMYLKYYLPWAIKTGNNAKFMQGIYFEKRWDQDIDDFHKEMNIVRLVKK
|
||||
>673132at2759_326594_0:000010
|
||||
MTLLTVFKQFKKFQDAGKSVARSLSIKDDQESKKTCLYDLHIENNGKMVNFSGWLLPIQYRDSITASHQHTRTHASLFDVGHMLQSHVSGCDSGEFLESLTTADLQNLAQGGAALTVFTNKSGGILDDLIITKDRNDRFFVVSNAGRRNEDIELMLGRQAEMKSQGKNVTIEFLDPLEQGLIALQGPSAATTLQTLVKIDLTKLKFMNSVETKINQKSVRISRCGYTGEDGFEISVNGKDARTISEMILEVPDIKLAGLGARDSLRLEAGFCLYGHDINESITPVEASLQWLIAKRRREAANFPGAEFILEQIKNGPKKKRVGLILGQGPPARENATILTSAGERVGIVTSGGPSPTLGKPIAMGYVPLEHVHTGTPVLTEIRGKTYKALITKMPFVKPHYYSDKR
|
||||
>887370at2759_331117_1:000010
|
||||
MVVRSFLPLLSLLIALATFTSAASDYHEALVLQPLPQSSLLASFNFRGNTSQEAFDQRHFRYFPRALGQILQHTHTKELHIRFTTGRWDAESWGTRPWNGTKEGNTGVELWAWIDAPDSESAFARWISLTQSLSGLFCASLNFIDSTRTTRPVVSFEPIGDHSPSSDLHLLHGTLPGEVVCTENLTPFLKLLPCKGKAGVSSLLDGHKLFDASWQSMSVDVRPVCPQGGECLMQIEQTVDIVLDIERSKRPRDNPIPRPVPNDQLNCDNSKPYHSDDTCYPLERGSGKGWSLNEIFGRTLNGVCSLDEGQRPGEEAICLRVPHEQGVYTTSGVEETKRPDGYTRCFTLQPSGTFDLVIPEQSHTSLAPRDEPVLSAERTIVGHGQERGGMRIIFDNPSDAHPVDFIYFETLPWFLRPYVHTLRATITGRDGATRSVPVSHIVKETFYRPAIDRERGTQLELALSVPAASIVTLTYDFEKAILRYTEYPPDANRGFNVAPAVIKLSSANGNTIAHDTPIYMRTTSLLLPLPTPDFSMPYNVIILTSTVIALAFGSIFNLLVRRFVAADQAAALTAQTLKGRLLGKIVALRDRISGKRSKVE
|
||||
>166920at2759_38123_0:000010
|
||||
MAFLDFVFPLSKDELLERSDSQYYVRDQVTTSELPEKLKGCFESLHDDGPLFILENFDTLYGLLAHFKSVDFNQLHKVYTKLLIKSITEFIPILENYFSKETPDDELQNKYLNVIKMTVYILTEFIISFESRLQKEYQKVVIDVRARKVKVRAAIKHKEKYNWDWDFHLSNGLNSIHQLLKAKINKLWDPPVVEEEFVNTIANCCYKIIEDPCIASVKHKELRIFIFQVIGYLIKKYNHGISCTVKIVQLLKNCDHLVSPLAQAVTMFIRNHGCKSLVREIVREISEMDDGNEAAGQGQDNSKMVAAFLNEIAAEGPEYVIPAMDELLLNLEKESYMMRNCTLTILTELLLQVYKKENLSSEAKDQRDEYLNSLMEHIYDVHTFVRTKVLQLFQKLVIEKALPLAFTLQLVDRAIGRLMDKSSNVVKYAVQLLRTMIVSNPFAAKLGVEELKKKLAEAKATLTELEKNLPETSAQLSLVDEWNNIHYPVLLKIIREILEDGMYGCFLFYFL
|
||||
>1275837at2759_402676_1:000010
|
||||
MESMNDMFKKINAREKLVGWYHTGPQLRSSDLEINNLFKKYIPNPVLVIIDVQSKAVGLPTSAYFAVDEIKDDGTKSSLTFVHLPSSIEAEEAEEIGVEHLLRDTRDITAGTLATRVTEQVQSLRALEQRLDEIAVYLRKVVDGQLPINHTILGELQGVFNLLPNIFKTSNENDPLGLENGDERSFNINSNDQLMTVYLSSIVRSVIALHDLLDSLAASKAAEQEQDKLDLKQESTDSEKRATTAAVDEDPFMPN
|
||||
>1284731at2759_42254_0:000010
|
||||
MAEAGAVAAEYPSGGRARAARTLLDQVVLPGEELLLPEQEDADGPGGAGERPLQARDPYLKWGVRRACCEIPYVPVRGDHVIGIVTAKSGDTFKVDVGGSEPASLSYLAFEGATKRNRPNVQVGDLIYGQFVVANKDMEPEMVCIDGCGRANGMGVIGQDGLLFKVTLGLIRKLLAPDCEIIQELGKLYPLEIVFGMNGRIWVKAKTIQQTLILANILEACEHMTTDQRKQIFSRLAES
|
||||
>1228942at2759_45354_1:000010
|
||||
MNHDPFQWGRPRDEIYGHYDHKIAQASTSEFPSMHTQQPIITGTSVLGLKFDTGVVIAADHMGSYGSLLRFNNLERLICVGSETIVGVSGDISDFQHIERLLHELETEEEVYDTDGGHNLRAPNIHEYLSRVLYNRRLKMDPLWNAILVAGFNDDRTPFIRYVDLLGVTYGALALATGFGAHLAIPLLRKLVPYDLDYVKVKEADAREAVVNAMRVLYYRDARASDKYTLAVLSFKDGKVDVHFDQELKVTNQSWKFAEKVIGYGSKQQ
|
||||
>759498at2759_502779_1:000010
|
||||
MDGSRGSRKRKAVTRDLGEEPGVVSGNELHLDSADGSLADHSEDLDGSSDSEIELADDLNSDDDEEEEEEEEEDEDEINSDEVPSDIEPKVVGKKSGPGGEVDIIVRGDDTASDDDDDDDDDFESDDRPNYRVVKDANGNERYVYDEINPDDNSDYSETDENANTIGNIPLSFYDQYPHIGYNINGKKIMRPAKGQALDALLDSIELPKGFTGLTDPATGKPLELTQDELELLRKVQMNEITEEGYDPYQPTIEYFTSKLEVMPLSAAPEPKRRFVPSKHEAKRVMKLVKAIREGRILPYKQPAEEDEAEEGVQTYDIWANETPRADHPMHIPAPKLPPPGYEESYHPPPEYLPDEKEKSAWLNTDPEDRETEYLPTDHDALRKVPGYESFVKEKFERCLDLYLAPRVRRSKLNIDPESLLPKLPSPEELKPFPSTCATLFRGHQGRVRTLAIDPTGVWLASGGDDGTVRVWDILTGRQFWSVALSGDDAINVVRWRPGKDAVVLAAAAGDSIFLMVPPVLDPEMEKASFEVVDAGWGYAKTSPSTFTSTDSTKTSPVQWTRPSSSLLDSGVQAVISLGYVAKSLSWHRRGDYFVTVCPGTSTPVSLAIAIHTLSKHLTQQPFRRRLKGGGPPQTAHFHPSKPILFVANQRTIRAYDLSRQTLVKILQPGARWISSFDIHPTSSSTSGGDNLIVGSYDRRLLWHDVDLSPRPYKTLRYHQKAIRAVRYHANYPLFADASDDGSLQIFHGSVTGDLLSNASIVPLKVLRGHKVTGELGVLDLDWHPKEAWCVSAGADGTCRLWM
|
||||
>375960at2759_51337_0:000010
|
||||
MFFREHIFNIIGAFDIPRFVYNSERKKFLPLLMTNHPAPNLLGTAKDKAELYRERYTLLHQRTHRHELFTPPVIGSYPNESGSKFQLKTIETLLGSTTKIGDVIVLGMITQLKEGKFFLEDPTRTVQLDLSQAQFHSGLYTEACFVLAEGKAYYGSINFFGGPSNTSVKTSTKLKQLEEENKDAMFVFVSDVWLDRAEVLEKLHIMFSGYSPAPPSCFILCGNFSSAPYGKNQIQALKDSLKTLADIICEYPNIHQSSRFVFVPGPKDPGFGSILPRPPLAESITSEFRQKIPFSVFTTNPCRIQYCTEEIIIFREDIVNKMCRNCVRFPSSNLDIPNHFVKTILSQGHLTPLPLYVCPVYWARFPSSNLDIPNHGSFPRSGFSFKVFYPSSKTVEDSKLQGF
|
||||
>919955at2759_5643_1:000010
|
||||
MAAPMAVDKAKAPKIDVDEFLTLAISETPAELHPFFESFRSLYSRKLWHQLTNKLFEFFDHPLSKPYRVDVFNKFVRDFGLRLNQLRLVEMGVKVSKEIDNPVTHLQFLTDLLERVNIEKSPEAHVLLLSSLAHAKLLYGDHEGTKNDIDAAWKVLDELSSVDPSVNAAYYGVAADYYKSKAEYAPYYKNSLLYLACIDPAKDLTAEERLLRAHDLGIAAFLGDTIYNFGELPILQENYPFLRQKICLMALIESVFKRGSYDRTMSFQTIAEETHLPLDEVEHLVMKALSLKLIKGSLDQVDQKAQITWVQPRVLSREQIGQLAQRLAAWNSKLHQVEERIAPEVLVNS
|
||||
>817008at2759_5849_1:000010
|
||||
MDKLKTIYIDSALSIIKGALCVILQIPTGRTTESIKKKQNNVGIITVKSIFKEPTISQYNDIKQLIKTKIEENCPFYNYQINRTIAEKIYGDTIYDNYGLSKEINEVNLIILEEWNINCNRNRVLKHSGLIKNIEINKFKYLNNKESLEVHFLVNPKYTFEELNTIYKNEEELNNFLLSPIIKVTNKKIYEIEDKKSEFSYLYEEDILPKNKVLPPSGIENVNYESSKVVTPWDVNIGEEGINYNKLIKEFGCSKISDEHIRKIEKLTNRKAHHFIRRGIFFSHRDLDFLLNYYEQNGYFYIYTGRGPSSLSMHLGHLIPFYFCKYLQDAFNVPLIIQLSDDEKFLFNQNYSLDDINRFTKENVKDIIAVGFNPELTFIFKNTEYANHLYPTVLAIHKKTTLNQSMNVFGFNNSDNIGKISYPSFQIAPCFSQCFPNFLKKNIPCLVPQGIDQDPYFRLSRDIAVKLALYKPVVIHSVFMPGLQGVNTKMSSTKKKDNKNMDSKQDINNSVIFLTDSPEQIKNKINKYAFSGGGATIAEHKEKGADLEKDISYQYLRYFLVDDEKLNEIGEKYKKGEMLSGEIKKILIDILTDLVQKHQEKRNSLTDEDILYFFNDNKSSLKKFKDM
|
||||
>1426075at2759_61621_0:000010
|
||||
MTASQPNPQLPQSLPALKTSGTCARLPSTGRKLHLRIARAHPRVSRELFRRSGCGCGAGLSSAETDIAFLFSASGYRSHILKTMSGSFYFVIVGHHDNPVLKWSFXPAGKAESKDDHRHLNQFIAHAALDLVDENMWLSNNMYLKTVDKFNEWFVSAFVTAGHMRFIMLHDIRQEDGIKNFFTDVYDLYIKFSMNPFYEPNSPIRSSAFDRKVQFLGKKHLLS
|
||||
>655400at2759_688394_1:000010
|
||||
MAASRSPRLSSLLLRTTPLSRPTWQRTLSTRGFATAISNKLDNVYDMVIVGGGIAGTALACSLATNPSMKDYRIALIEAMDLSNTNNWAPATGRYSNRVVSLTPASMQFFEKIGVADELYRDRIQPYNCMKVSDGVTNASIEFDTNLLSSSTNPDDLPIAYMIENVHLQHSILKTLQTSKGKGATVDILQKARVASIRMQEQDAKETKDTLDLSDWPIIEMENGQSLQARLLVGADGVNSPVRSFAKIESLGWDYNMHGVVATFKTDPSRKNDTAYQRFLPTGPIAMLPLGDGHASMVWSMPPDMAHKVKKIPAQAFCTLVNSAFRLSMEDLDYLRSKIDPTTFEPLCDFDSEYNWRQGVAKHGLGDMEMMERELAFPPIVESVDETSRASFPLRMRNSQQYFADRVVLVGDAAHTVHPLAGQGLNQGILDVACLSDILQRGASEGQDIGNLHLLREYASVRYLRNLLMISACDKLHRLYSTDFAPITWIRSLGLSSVNQLDFVKAEIMKYAMGIEQ
|
||||
>946128at2759_765440_1:000010
|
||||
MPTTVCTAKASYKKTPGQLELTETHLQWFADGKKAPSVRVLYAEAASLFCSKEGAAQIRLKLGLVGDDTGHNFTFTSPQSVAYKERETFKKELTNIISRNRSVPNVTTPRPPLNTSISSTTPAISNAPTPRSVVPPSRASTSRAPSVSSDGRTPIVPGSDPTSDFRLRKQVLVSNPELGALHRDLVMSGQITEAEFWEGREHLLLAQTATESQKRGRPGQLVDPRPETVEGGEVKIVITPQLVHDIFEEYPVVAKAYNDNVPNKLSEAEFWKRYFQSKLFNAHRASIRSSAAQHVVKDDKIFDKYLEKDDDELEPRRQRDEGINLFVNLGATREDHGETGNEQDITMQAGRQRGALPLIRKFNEHSERLLNSALGDEPTAKRRRIDAGKEDAYSQIDLDDLHDPEASAGIILEMQDRQRYFEGQMASAASAEAAAGKNLDIRAILGETKVNLHDWETNLAQLKINKKSGDAALLSMTENVSARLEIKMKKNDIPPELFSQMTTCQTAANEFLRQFWLSMYPPAADHQVLAPATPAQKAAKAAKMIGYLGKTHEKVDALIRTAQVEAVDAAKVEIVRAVCFVYIITVNFNANLQAMKPILDAVDRALAFYRSRKPPK
|
||||
>1287401at2759_870435_1:000010
|
||||
MSSSIVGSLTRGCRTPSVNINPHPFFRCRTSLYHGIGKPPSWLHSRTQLWRTIGTSSSKHTPPSSASVSARRPTAIPSYNASREQMYKTRNRNLLMYTSAVVILGVGITYAAVPLYRMFCSATGFAGTPSVVSTSSGRFDPSRLTPDTDARRIRVHFNADRAEALPWKFFPQQKYVEVLPGESSLAFYKARNESKKDIIGIATYNVTPDRVAPYFSKVECFCFEEQKLLAGEEVDMPLLFFIDKDILDDPSCRGVNDVVLSYTFFKARRNAQGHLEPDAEEDVVQRSLGFEGYEHSPRAETKKVEGSKANS
|
||||
12
src/busco/busco_run/test_data/script.sh
Normal file
@@ -0,0 +1,12 @@
|
||||
# busco test data
|
||||
|
||||
# Test data from https://github.com/snakemake/snakemake-wrappers/tree/master/bio/busco/test
|
||||
|
||||
if [ ! -d /tmp/snakemake-wrappers ]; then
|
||||
git clone --depth 1 --single-branch --branch master https://github.com/snakemake/snakemake-wrappers /tmp/snakemake-wrappers
|
||||
fi
|
||||
|
||||
cp -r /tmp/snakemake-wrappers/bio/busco/test/protein.fasta src/busco/test_data
|
||||
|
||||
# Test data from busco test data at https://gitlab.com/ezlab/busco/-/tree/master/test_data?ref_type=heads
|
||||
wget -O src/busco/test_data/genome.fna "https://gitlab.com/ezlab/busco/-/raw/master/test_data/eukaryota/genome.fna?ref_type=heads&inline=false"
|
||||
484
src/cutadapt/config.vsh.yaml
Normal file
@@ -0,0 +1,484 @@
|
||||
name: cutadapt
|
||||
description: |
|
||||
Cutadapt removes adapter sequences from high-throughput sequencing reads.
|
||||
keywords: [RNA-seq, scRNA-seq, high-throughput]
|
||||
links:
|
||||
homepage: https://cutadapt.readthedocs.io
|
||||
documentation: https://cutadapt.readthedocs.io
|
||||
repository: https://github.com/marcelm/cutadapt
|
||||
references:
|
||||
doi: 10.14806/ej.17.1.200
|
||||
license: MIT
|
||||
authors:
|
||||
- __merge__: /src/_authors/toni_verbeiren.yaml
|
||||
roles: [ author, maintainer ]
|
||||
argument_groups:
|
||||
####################################################################
|
||||
- name: Specify Adapters for R1
|
||||
arguments:
|
||||
- name: --adapter
|
||||
alternatives: [-a]
|
||||
type: string
|
||||
multiple: true
|
||||
description: |
|
||||
Sequence of an adapter ligated to the 3' end (paired data:
|
||||
of the first read). The adapter and subsequent bases are
|
||||
trimmed. If a '$' character is appended ('anchoring'), the
|
||||
adapter is only found if it is a suffix of the read.
|
||||
required: false
|
||||
- name: --front
|
||||
alternatives: [-g]
|
||||
type: string
|
||||
multiple: true
|
||||
description: |
|
||||
Sequence of an adapter ligated to the 5' end (paired data:
|
||||
of the first read). The adapter and any preceding bases
|
||||
are trimmed. Partial matches at the 5' end are allowed. If
|
||||
a '^' character is prepended ('anchoring'), the adapter is
|
||||
only found if it is a prefix of the read.
|
||||
required: false
|
||||
- name: --anywhere
|
||||
alternatives: [-b]
|
||||
type: string
|
||||
multiple: true
|
||||
description: |
|
||||
Sequence of an adapter that may be ligated to the 5' or 3'
|
||||
end (paired data: of the first read). Both types of
|
||||
matches as described under -a and -g are allowed. If the
|
||||
first base of the read is part of the match, the behavior
|
||||
is as with -g, otherwise as with -a. This option is mostly
|
||||
for rescuing failed library preparations - do not use if
|
||||
you know which end your adapter was ligated to!
|
||||
required: false
|
||||
|
||||
####################################################################
|
||||
- name: Specify Adapters using Fasta files for R1
|
||||
arguments:
|
||||
- name: --adapter_fasta
|
||||
type: file
|
||||
multiple: true
|
||||
description: |
|
||||
Fasta file containing sequences of an adapter ligated to the 3' end (paired data:
|
||||
of the first read). The adapter and subsequent bases are
|
||||
trimmed. If a '$' character is appended ('anchoring'), the
|
||||
adapter is only found if it is a suffix of the read.
|
||||
required: false
|
||||
- name: --front_fasta
|
||||
type: file
|
||||
description: |
|
||||
Fasta file containing sequences of an adapter ligated to the 5' end (paired data:
|
||||
of the first read). The adapter and any preceding bases
|
||||
are trimmed. Partial matches at the 5' end are allowed. If
|
||||
a '^' character is prepended ('anchoring'), the adapter is
|
||||
only found if it is a prefix of the read.
|
||||
required: false
|
||||
- name: --anywhere_fasta
|
||||
type: file
|
||||
description: |
|
||||
Fasta file containing sequences of an adapter that may be ligated to the 5' or 3'
|
||||
end (paired data: of the first read). Both types of
|
||||
matches as described under -a and -g are allowed. If the
|
||||
first base of the read is part of the match, the behavior
|
||||
is as with -g, otherwise as with -a. This option is mostly
|
||||
for rescuing failed library preparations - do not use if
|
||||
you know which end your adapter was ligated to!
|
||||
required: false
|
||||
|
||||
####################################################################
|
||||
- name: Specify Adapters for R2
|
||||
arguments:
|
||||
- name: --adapter_r2
|
||||
alternatives: [-A]
|
||||
type: string
|
||||
multiple: true
|
||||
description: |
|
||||
Sequence of an adapter ligated to the 3' end (paired data:
|
||||
of the second read). The adapter and subsequent bases are
|
||||
trimmed. If a '$' character is appended ('anchoring'), the
|
||||
adapter is only found if it is a suffix of the read.
|
||||
required: false
|
||||
- name: --front_r2
|
||||
alternatives: [-G]
|
||||
type: string
|
||||
multiple: true
|
||||
description: |
|
||||
Sequence of an adapter ligated to the 5' end (paired data:
|
||||
of the second read). The adapter and any preceding bases
|
||||
are trimmed. Partial matches at the 5' end are allowed. If
|
||||
a '^' character is prepended ('anchoring'), the adapter is
|
||||
only found if it is a prefix of the read.
|
||||
required: false
|
||||
- name: --anywhere_r2
|
||||
alternatives: [-B]
|
||||
type: string
|
||||
multiple: true
|
||||
description: |
|
||||
Sequence of an adapter that may be ligated to the 5' or 3'
|
||||
end (paired data: of the second read). Both types of
|
||||
matches as described under -a and -g are allowed. If the
|
||||
first base of the read is part of the match, the behavior
|
||||
is as with -g, otherwise as with -a. This option is mostly
|
||||
for rescuing failed library preparations - do not use if
|
||||
you know which end your adapter was ligated to!
|
||||
required: false
|
||||
|
||||
####################################################################
|
||||
- name: Specify Adapters using Fasta files for R2
|
||||
arguments:
|
||||
- name: --adapter_r2_fasta
|
||||
type: file
|
||||
description: |
|
||||
Fasta file containing sequences of an adapter ligated to the 3' end (paired data:
|
||||
of the second read). The adapter and subsequent bases are
|
||||
trimmed. If a '$' character is appended ('anchoring'), the
|
||||
adapter is only found if it is a suffix of the read.
|
||||
required: false
|
||||
- name: --front_r2_fasta
|
||||
type: file
|
||||
description: |
|
||||
Fasta file containing sequences of an adapter ligated to the 5' end (paired data:
|
||||
of the second read). The adapter and any preceding bases
|
||||
are trimmed. Partial matches at the 5' end are allowed. If
|
||||
a '^' character is prepended ('anchoring'), the adapter is
|
||||
only found if it is a prefix of the read.
|
||||
required: false
|
||||
- name: --anywhere_r2_fasta
|
||||
type: file
|
||||
description: |
|
||||
Fasta file containing sequences of an adapter that may be ligated to the 5' or 3'
|
||||
end (paired data: of the second read). Both types of
|
||||
matches as described under -a and -g are allowed. If the
|
||||
first base of the read is part of the match, the behavior
|
||||
is as with -g, otherwise as with -a. This option is mostly
|
||||
for rescuing failed library preparations - do not use if
|
||||
you know which end your adapter was ligated to!
|
||||
required: false
|
||||
|
||||
####################################################################
|
||||
- name: Paired-end options
|
||||
arguments:
|
||||
- name: --pair_adapters
|
||||
type: boolean_true
|
||||
description: |
|
||||
Treat adapters given with -a/-A etc. as pairs. Either both
|
||||
or none are removed from each read pair.
|
||||
- name: --pair_filter
|
||||
type: string
|
||||
choices: [any, both, first]
|
||||
description: |
|
||||
Which of the reads in a paired-end read have to match the
|
||||
filtering criterion in order for the pair to be filtered.
|
||||
- name: --interleaved
|
||||
type: boolean_true
|
||||
description: |
|
||||
Read and/or write interleaved paired-end reads.
|
||||
|
||||
####################################################################
|
||||
- name: Input parameters
|
||||
arguments:
|
||||
- name: --input
|
||||
type: file
|
||||
required: true
|
||||
description: |
|
||||
Input fastq file for single-end reads or R1 for paired-end reads.
|
||||
- name: --input_r2
|
||||
type: file
|
||||
required: false
|
||||
description: |
|
||||
Input fastq file for R2 in the case of paired-end reads.
|
||||
- name: --error_rate
|
||||
alternatives: [-E, --errors]
|
||||
type: double
|
||||
description: |
|
||||
Maximum allowed error rate (if 0 <= E < 1), or absolute
|
||||
number of errors for full-length adapter match (if E is an
|
||||
integer >= 1). Error rate = no. of errors divided by
|
||||
length of matching region. Default: 0.1 (10%).
|
||||
example: 0.1
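The relationship between the error rate and the number of tolerated errors can be sketched in shell. This is a minimal illustration, assuming the allowed count is the rate multiplied by the length of the matching region, rounded down. `max_errors` is a hypothetical helper and not part of cutadapt or this component.

```shell
#!/usr/bin/env bash
# Hypothetical helper: number of errors tolerated for a match of a given length.
# Assumption: allowed errors = floor(error_rate * match_length).
max_errors() {
  local rate=$1 match_len=$2
  awk -v e="$rate" -v n="$match_len" 'BEGIN { printf "%d\n", int(e * n) }'
}

max_errors 0.1 9    # a 9-base match tolerates no errors
max_errors 0.1 25   # a 25-base match tolerates 2 errors
```

With the default rate of 0.1, short partial matches must therefore be exact, which is one reason the minimum overlap setting matters.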
|
||||
- name: --no_indels
|
||||
type: boolean_false
|
||||
description: |
|
||||
Allow only mismatches in alignments.
|
||||
|
||||
- name: --times
|
||||
type: integer
|
||||
alternatives: [-n]
|
||||
description: |
|
||||
Remove up to COUNT adapters from each read. Default: 1.
|
||||
example: 1
|
||||
- name: --overlap
|
||||
alternatives: [-O]
|
||||
type: integer
|
||||
description: |
|
||||
Require MINLENGTH overlap between read and adapter for an
|
||||
adapter to be found. The default is 3.
|
||||
example: 3
|
||||
- name: --match_read_wildcards
|
||||
type: boolean_true
|
||||
description: |
|
||||
Interpret IUPAC wildcards in reads.
|
||||
- name: --no_match_adapter_wildcards
|
||||
type: boolean_false
|
||||
description: |
|
||||
Do not interpret IUPAC wildcards in adapters.
|
||||
- name: --action
|
||||
type: string
|
||||
choices:
|
||||
- trim
|
||||
- retain
|
||||
- mask
|
||||
- lowercase
|
||||
- none
|
||||
description: |
|
||||
What to do if a match was found. trim: trim adapter and
|
||||
up- or downstream sequence; retain: trim, but retain
|
||||
adapter; mask: replace with 'N' characters; lowercase:
|
||||
convert to lowercase; none: leave unchanged.
|
||||
The default is trim.
|
||||
example: trim
|
||||
- name: --revcomp
|
||||
alternatives: [--rc]
|
||||
type: boolean_true
|
||||
description: |
|
||||
Check both the read and its reverse complement for adapter
|
||||
matches. If match is on reverse-complemented version,
|
||||
output that one.
|
||||
|
||||
####################################################################
|
||||
- name: "Demultiplexing options"
|
||||
arguments:
|
||||
- name: "--demultiplex_mode"
|
||||
type: string
|
||||
choices: ["single", "unique_dual", "combinatorial_dual"]
|
||||
required: false
|
||||
description: |
|
||||
Enable demultiplexing and set the mode for it.
|
||||
With mode 'unique_dual', adapters from the first and second read are used,
|
||||
and the indexes from the reads are only used in pairs. This implies
|
||||
--pair_adapters.
|
||||
Enabling mode 'combinatorial_dual' allows all combinations of the sets of indexes
|
||||
on R1 and R2. It is necessary to write each read pair to an output
|
||||
file depending on the adapters found on both R1 and R2.
|
||||
Mode 'single' uses indexes or barcodes located at the 5'
|
||||
end of the R1 read (single).
|
||||
|
||||
####################################################################
|
||||
- name: Read modifications
|
||||
arguments:
|
||||
- name: --cut
|
||||
alternatives: [-u]
|
||||
type: integer
|
||||
multiple: true
|
||||
description: |
|
||||
Remove LEN bases from each read (or R1 if paired; use --cut_r2
|
||||
option for R2). If LEN is positive, remove bases from the
|
||||
beginning. If LEN is negative, remove bases from the end.
|
||||
Can be used twice if LENs have different signs. Applied
|
||||
*before* adapter trimming.
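The sign convention for LEN can be illustrated on a plain sequence string. The sketch below only mimics the described behaviour. `cut_read` is a hypothetical helper, and real reads would also need their quality strings trimmed in the same way.

```shell
#!/usr/bin/env bash
# Hypothetical helper mimicking the --cut semantics on a single sequence string.
cut_read() {
  local seq=$1 len=$2
  local n=${#seq}
  if (( len >= 0 )); then
    # Positive LEN: drop LEN bases from the beginning.
    if (( len >= n )); then echo ""; else echo "${seq:len}"; fi
  else
    # Negative LEN: drop |LEN| bases from the end.
    local keep=$(( n + len ))
    if (( keep <= 0 )); then echo ""; else echo "${seq:0:keep}"; fi
  fi
}

cut_read ACGTACGT 3    # prints TACGT
cut_read ACGTACGT -2   # prints ACGTAC
```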
|
||||
- name: --cut_r2
|
||||
type: integer
|
||||
multiple: true
|
||||
description: |
|
||||
Remove LEN bases from each read (for R2). If LEN is positive, remove bases from the
|
||||
beginning. If LEN is negative, remove bases from the end.
|
||||
Can be used twice if LENs have different signs. Applied
|
||||
*before* adapter trimming.
|
||||
- name: --nextseq_trim
|
||||
type: string
|
||||
description: |
|
||||
NextSeq-specific quality trimming (each read). Also trims
|
||||
dark cycles appearing as high-quality G bases.
|
||||
- name: --quality_cutoff
|
||||
alternatives: [-q]
|
||||
type: string
|
||||
description: |
|
||||
Trim low-quality bases from 5' and/or 3' ends of each read
|
||||
before adapter removal. Applied to both reads if data is
|
||||
paired. If one value is given, only the 3' end is trimmed.
|
||||
If two comma-separated cutoffs are given, the 5' end is
|
||||
trimmed with the first cutoff, the 3' end with the second.
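How a cutoff value maps onto the two read ends can be sketched as follows. `parse_quality_cutoff` is a hypothetical helper written only to illustrate the one-value and two-value forms described above. It does not perform any trimming itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split a --quality_cutoff value into per-end cutoffs.
parse_quality_cutoff() {
  local spec=$1
  if [[ $spec == *,* ]]; then
    # Two values: first applies to the 5' end, second to the 3' end.
    echo "five_prime=${spec%%,*} three_prime=${spec#*,}"
  else
    # One value: only the 3' end is trimmed.
    echo "five_prime=none three_prime=${spec}"
  fi
}

parse_quality_cutoff 20      # prints five_prime=none three_prime=20
parse_quality_cutoff 15,10   # prints five_prime=15 three_prime=10
```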
|
||||
- name: --quality_cutoff_r2
|
||||
alternatives: [-Q]
|
||||
type: string
|
||||
description: |
|
||||
Quality-trimming cutoff for R2. Default: same as for R1.
|
||||
- name: --quality_base
|
||||
type: integer
|
||||
description: |
|
||||
Assume that quality values in FASTQ are encoded as
|
||||
ascii(quality + N). This needs to be set to 64 for some
|
||||
old Illumina FASTQ files. The default is 33.
|
||||
example: 33
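The encoding ascii(quality + N) can be decoded with a short shell sketch. `phred_score` is a hypothetical helper used for illustration only. It converts a single quality character back into its numeric score for a given base.

```shell
#!/usr/bin/env bash
# Hypothetical helper: decode one FASTQ quality character into a Phred score.
# The character encodes ascii(quality + base), so quality = ascii(char) - base.
phred_score() {
  local char=$1 base=${2:-33}
  local code
  code=$(printf '%d' "'$char")   # numeric value of the character
  echo $(( code - base ))
}

phred_score I 33   # prints 40
phred_score h 64   # prints 40
```

The same score is written as a different character under each base, which is why a wrong base setting shifts every quality value by a constant.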
|
||||
- name: --poly_a
|
||||
type: boolean_true
|
||||
description: Trim poly-A tails
|
||||
- name: --length
|
||||
alternatives: [-l]
|
||||
type: integer
|
||||
description: |
|
||||
Shorten reads to LENGTH. Positive values remove bases at
|
||||
the end while negative ones remove bases at the beginning.
|
||||
This and the following modifications are applied after
|
||||
adapter trimming.
|
||||
- name: --trim_n
|
||||
type: boolean_true
|
||||
description: Trim N's on ends of reads.
|
||||
- name: --length_tag
|
||||
type: string
|
||||
description: |
|
||||
Search for TAG followed by a decimal number in the
|
||||
description field of the read. Replace the decimal number
|
||||
with the correct length of the trimmed read. For example,
|
||||
use --length_tag 'length=' to correct fields like
|
||||
'length=123'.
|
||||
example: "length="
|
||||
- name: --strip_suffix
|
||||
type: string
|
||||
description: |
|
||||
Remove this suffix from read names if present. Can be
|
||||
given multiple times.
|
||||
- name: --prefix
|
||||
alternatives: [-x]
|
||||
type: string
|
||||
description: |
|
||||
Add this prefix to read names. Use {name} to insert the
|
||||
name of the matching adapter.
|
||||
- name: --suffix
|
||||
alternatives: [-y]
|
||||
type: string
|
||||
description: |
|
||||
Add this suffix to read names; can also include {name}.
|
||||
- name: --rename
|
||||
type: string
|
||||
description: |
|
||||
Rename reads using TEMPLATE containing variables such as
|
||||
{id}, {adapter_name} etc. (see documentation)
|
||||
- name: --zero_cap
|
||||
alternatives: [-z]
|
||||
type: boolean_true
|
||||
description: Change negative quality values to zero.
|
||||
|
||||
####################################################################
|
||||
- name: Filtering of processed reads
|
||||
description: |
|
||||
Filters are applied after above read modifications. Paired-end reads are
|
||||
always discarded pairwise (see also --pair_filter).
|
||||
arguments:
|
||||
- name: --minimum_length
|
||||
alternatives: [-m]
|
||||
type: string
|
||||
description: |
|
||||
Discard reads shorter than LEN. Default is 0.
|
||||
When trimming paired-end reads, the minimum lengths for R1 and R2 can be specified separately by separating them with a colon (:).
|
||||
If the colon syntax is not used, the same minimum length applies to both reads.
|
||||
Also, one of the values can be omitted to impose no restrictions.
|
||||
For example, with -m 17:, the length of R1 must be at least 17, but the length of R2 is ignored.
|
||||
example: "0"
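The colon syntax can be illustrated with a small shell sketch. `parse_min_length` is a hypothetical helper that only shows how a value would be interpreted for R1 and R2. It is not how cutadapt or this component parses the option internally.

```shell
#!/usr/bin/env bash
# Hypothetical helper: expand a --minimum_length value into limits for R1 and R2.
parse_min_length() {
  local spec=$1 r1 r2
  if [[ $spec == *:* ]]; then
    # Colon syntax: "R1:R2"; an omitted side means no restriction for that read.
    r1=${spec%%:*}
    r2=${spec#*:}
  else
    # No colon: the same minimum applies to both reads.
    r1=$spec
    r2=$spec
  fi
  echo "R1=${r1:-none} R2=${r2:-none}"
}

parse_min_length 17:   # prints R1=17 R2=none
parse_min_length 20    # prints R1=20 R2=20
```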
|
||||
- name: --maximum_length
|
||||
alternatives: [-M]
|
||||
type: string
|
||||
description: |
|
||||
Discard reads longer than LEN. Default: no limit.
|
||||
For paired reads, see the remark for --minimum_length
|
||||
- name: --max_n
|
||||
type: string
|
||||
description: |
|
||||
Discard reads with more than COUNT 'N' bases. If COUNT is
|
||||
a number between 0 and 1, it is interpreted as a fraction
|
||||
of the read length.
|
||||
- name: --max_expected_errors
|
||||
alternatives: [--max_ee]
|
||||
type: double
|
||||
description: |
|
||||
Discard reads whose expected number of errors (computed
|
||||
from quality values) exceeds ERRORS.
|
||||
- name: --max_average_error_rate
|
||||
alternatives: [--max_aer]
|
||||
type: double
|
||||
description: |
|
||||
As --max_expected_errors (see above), but divided by
|
||||
length to account for reads of varying length.
|
||||
- name: --discard_trimmed
|
||||
alternatives: [--discard]
|
||||
type: boolean_true
|
||||
description: |
|
||||
Discard reads that contain an adapter. Use also -O to
|
||||
avoid discarding too many randomly matching reads.
|
||||
- name: --discard_untrimmed
|
||||
alternatives: [--trimmed_only]
|
||||
type: boolean_true
|
||||
description: |
|
||||
Discard reads that do not contain an adapter.
|
||||
- name: --discard_casava
|
||||
type: boolean_true
|
||||
description: |
|
||||
Discard reads that did not pass CASAVA filtering (header
|
||||
has :Y:).
|
||||
|
||||
####################################################################
|
||||
- name: Output parameters
|
||||
arguments:
|
||||
- name: --report
|
||||
type: string
|
||||
choices: [full, minimal]
|
||||
description: |
|
||||
Which type of report to print: 'full' (default) or 'minimal'.
|
||||
example: full
|
||||
- name: --json
|
||||
type: boolean_true
|
||||
description: |
|
||||
Write the report in JSON format to report.json in the current working directory.
|
||||
- name: --output
|
||||
type: file
|
||||
description: |
|
||||
Glob pattern for matching the expected output files.
|
||||
Should include `$output_dir`.
|
||||
example: "fastq/*_001.fast[aq]"
|
||||
direction: output
|
||||
required: true
|
||||
must_exist: true
|
||||
multiple: true
|
||||
- name: --fasta
|
||||
type: boolean_true
|
||||
description: |
|
||||
Output FASTA to standard output even on FASTQ input.
|
||||
- name: --info_file
|
||||
type: boolean_true
|
||||
description: |
|
||||
Write information about each read and its adapter matches
|
||||
into info.txt in the current working directory.
|
||||
See the documentation for the file format.
|
||||
# - name: -Z
|
||||
# - name: --rest_file
|
||||
# - name: --wildcard_file
|
||||
# - name: --too_short_output
|
||||
# - name: --too_long_output
|
||||
# - name: --untrimmed_output
|
||||
# - name: --untrimmed_paired_output
|
||||
# - name: --too_short_paired_output
|
||||
# - name: --too_long_paired_output
|
||||
- name: Debug
|
||||
arguments:
|
||||
- type: boolean_true
|
||||
name: --debug
|
||||
description: Print debug information
|
||||
resources:
|
||||
- type: bash_script
|
||||
path: script.sh
|
||||
test_resources:
|
||||
- type: bash_script
|
||||
path: test.sh
|
||||
|
||||
engines:
|
||||
- type: docker
|
||||
image: python:3.12
|
||||
setup:
|
||||
- type: python
|
||||
pip:
|
||||
- cutadapt
|
||||
- type: docker
|
||||
run: |
|
||||
cutadapt --version | sed 's/\(.*\)/cutadapt: "\1"/' > /var/software_versions.txt
|
||||
runners:
|
||||
- type: executable
|
||||
- type: nextflow
|
||||
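The `LEN[:LEN2]` syntax described for `--minimum_length` and `--maximum_length` can be illustrated with a small bash sketch. The helper `parse_length_spec` and the variables `len_r1` and `len_r2` are hypothetical names used only for illustration; they are not part of the component, which passes the value to cutadapt unchanged.

```shell
#!/bin/bash

# Hypothetical helper (not part of the component): split a LEN[:LEN2]
# specification into separate limits for R1 and R2.
# An empty result means "no restriction" for that read.
parse_length_spec() {
  local spec=$1
  if [[ $spec == *:* ]]; then
    len_r1=${spec%%:*}   # text before the first colon
    len_r2=${spec#*:}    # text after the first colon
  else
    # Without a colon, the same limit applies to both reads
    len_r1=$spec
    len_r2=$spec
  fi
}

parse_length_spec "17:"
echo "R1=${len_r1:-none} R2=${len_r2:-none}"   # prints: R1=17 R2=none
```

With `20` as input both limits would be 20, and with `:30` only R2 would be restricted.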
218
src/cutadapt/help.txt
Normal file
@@ -0,0 +1,218 @@
|
||||
cutadapt version 4.6
|
||||
|
||||
Copyright (C) 2010 Marcel Martin <marcel.martin@scilifelab.se> and contributors
|
||||
|
||||
Cutadapt removes adapter sequences from high-throughput sequencing reads.
|
||||
|
||||
Usage:
|
||||
cutadapt -a ADAPTER [options] [-o output.fastq] input.fastq
|
||||
|
||||
For paired-end reads:
|
||||
cutadapt -a ADAPT1 -A ADAPT2 [options] -o out1.fastq -p out2.fastq in1.fastq in2.fastq
|
||||
|
||||
Replace "ADAPTER" with the actual sequence of your 3' adapter. IUPAC wildcard
|
||||
characters are supported. All reads from input.fastq will be written to
|
||||
output.fastq with the adapter sequence removed. Adapter matching is
|
||||
error-tolerant. Multiple adapter sequences can be given (use further -a
|
||||
options), but only the best-matching adapter will be removed.
|
||||
|
||||
Input may also be in FASTA format. Compressed input and output is supported and
|
||||
auto-detected from the file name (.gz, .xz, .bz2). Use the file name '-' for
|
||||
standard input/output. Without the -o option, output is sent to standard output.
|
||||
|
||||
Citation:
|
||||
|
||||
Marcel Martin. Cutadapt removes adapter sequences from high-throughput
|
||||
sequencing reads. EMBnet.Journal, 17(1):10-12, May 2011.
|
||||
http://dx.doi.org/10.14806/ej.17.1.200
|
||||
|
||||
Run "cutadapt --help" to see all command-line options.
|
||||
See https://cutadapt.readthedocs.io/ for full documentation.
|
||||
|
||||
Options:
|
||||
-h, --help Show this help message and exit
|
||||
--version Show version number and exit
|
||||
--debug Print debug log. Use twice to also print DP matrices
|
||||
-j CORES, --cores CORES
|
||||
Number of CPU cores to use. Use 0 to auto-detect. Default:
|
||||
1
|
||||
|
||||
Finding adapters:
|
||||
Parameters -a, -g, -b specify adapters to be removed from each read (or from
|
||||
R1 if data is paired-end. If specified multiple times, only the best matching
|
||||
adapter is trimmed (but see the --times option). Use notation 'file:FILE' to
|
||||
read adapter sequences from a FASTA file.
|
||||
|
||||
-a ADAPTER, --adapter ADAPTER
|
||||
Sequence of an adapter ligated to the 3' end (paired data:
|
||||
of the first read). The adapter and subsequent bases are
|
||||
trimmed. If a '$' character is appended ('anchoring'), the
|
||||
adapter is only found if it is a suffix of the read.
|
||||
-g ADAPTER, --front ADAPTER
|
||||
Sequence of an adapter ligated to the 5' end (paired data:
|
||||
of the first read). The adapter and any preceding bases
|
||||
are trimmed. Partial matches at the 5' end are allowed. If
|
||||
a '^' character is prepended ('anchoring'), the adapter is
|
||||
only found if it is a prefix of the read.
|
||||
-b ADAPTER, --anywhere ADAPTER
|
||||
Sequence of an adapter that may be ligated to the 5' or 3'
|
||||
end (paired data: of the first read). Both types of
|
||||
matches as described under -a and -g are allowed. If the
|
||||
first base of the read is part of the match, the behavior
|
||||
is as with -g, otherwise as with -a. This option is mostly
|
||||
for rescuing failed library preparations - do not use if
|
||||
you know which end your adapter was ligated to!
|
||||
-e E, --error-rate E, --errors E
|
||||
Maximum allowed error rate (if 0 <= E < 1), or absolute
|
||||
number of errors for full-length adapter match (if E is an
|
||||
integer >= 1). Error rate = no. of errors divided by
|
||||
length of matching region. Default: 0.1 (10%)
|
||||
--no-indels Allow only mismatches in alignments. Default: allow both
|
||||
mismatches and indels
|
||||
-n COUNT, --times COUNT
|
||||
Remove up to COUNT adapters from each read. Default: 1
|
||||
-O MINLENGTH, --overlap MINLENGTH
|
||||
Require MINLENGTH overlap between read and adapter for an
|
||||
adapter to be found. Default: 3
|
||||
--match-read-wildcards
|
||||
Interpret IUPAC wildcards in reads. Default: False
|
||||
-N, --no-match-adapter-wildcards
|
||||
Do not interpret IUPAC wildcards in adapters.
|
||||
--action {trim,retain,mask,lowercase,none}
|
||||
What to do if a match was found. trim: trim adapter and
|
||||
up- or downstream sequence; retain: trim, but retain
|
||||
adapter; mask: replace with 'N' characters; lowercase:
|
||||
convert to lowercase; none: leave unchanged. Default: trim
|
||||
--rc, --revcomp Check both the read and its reverse complement for adapter
|
||||
matches. If match is on reverse-complemented version,
|
||||
output that one. Default: check only read
|
||||
|
||||
Additional read modifications:
|
||||
-u LEN, --cut LEN Remove LEN bases from each read (or R1 if paired; use -U
|
||||
option for R2). If LEN is positive, remove bases from the
|
||||
beginning. If LEN is negative, remove bases from the end.
|
||||
Can be used twice if LENs have different signs. Applied
|
||||
*before* adapter trimming.
|
||||
--nextseq-trim 3'CUTOFF
|
||||
NextSeq-specific quality trimming (each read). Trims also
|
||||
dark cycles appearing as high-quality G bases.
|
||||
-q [5'CUTOFF,]3'CUTOFF, --quality-cutoff [5'CUTOFF,]3'CUTOFF
|
||||
Trim low-quality bases from 5' and/or 3' ends of each read
|
||||
before adapter removal. Applied to both reads if data is
|
||||
paired. If one value is given, only the 3' end is trimmed.
|
||||
If two comma-separated cutoffs are given, the 5' end is
|
||||
trimmed with the first cutoff, the 3' end with the second.
|
||||
--quality-base N Assume that quality values in FASTQ are encoded as
|
||||
ascii(quality + N). This needs to be set to 64 for some
|
||||
old Illumina FASTQ files. Default: 33
|
||||
--poly-a Trim poly-A tails
|
||||
--length LENGTH, -l LENGTH
|
||||
Shorten reads to LENGTH. Positive values remove bases at
|
||||
the end while negative ones remove bases at the beginning.
|
||||
This and the following modifications are applied after
|
||||
adapter trimming.
|
||||
--trim-n Trim N's on ends of reads.
|
||||
--length-tag TAG Search for TAG followed by a decimal number in the
|
||||
description field of the read. Replace the decimal number
|
||||
with the correct length of the trimmed read. For example,
|
||||
use --length-tag 'length=' to correct fields like
|
||||
'length=123'.
|
||||
--strip-suffix STRIP_SUFFIX
|
||||
Remove this suffix from read names if present. Can be
|
||||
given multiple times.
|
||||
-x PREFIX, --prefix PREFIX
|
||||
Add this prefix to read names. Use {name} to insert the
|
||||
name of the matching adapter.
|
||||
-y SUFFIX, --suffix SUFFIX
|
||||
Add this suffix to read names; can also include {name}
|
||||
--rename TEMPLATE Rename reads using TEMPLATE containing variables such as
|
||||
{id}, {adapter_name} etc. (see documentation)
|
||||
--zero-cap, -z Change negative quality values to zero.
|
||||
|
||||
Filtering of processed reads:
|
||||
Filters are applied after above read modifications. Paired-end reads are
|
||||
always discarded pairwise (see also --pair-filter).
|
||||
|
||||
-m LEN[:LEN2], --minimum-length LEN[:LEN2]
|
||||
Discard reads shorter than LEN. Default: 0
|
||||
-M LEN[:LEN2], --maximum-length LEN[:LEN2]
|
||||
Discard reads longer than LEN. Default: no limit
|
||||
--max-n COUNT Discard reads with more than COUNT 'N' bases. If COUNT is
|
||||
a number between 0 and 1, it is interpreted as a fraction
|
||||
of the read length.
|
||||
--max-expected-errors ERRORS, --max-ee ERRORS
|
||||
Discard reads whose expected number of errors (computed
|
||||
from quality values) exceeds ERRORS.
|
||||
--max-average-error-rate ERROR_RATE, --max-aer ERROR_RATE
|
||||
as --max-expected-errors (see above), but divided by
|
||||
length to account for reads of varying length.
|
||||
--discard-trimmed, --discard
|
||||
Discard reads that contain an adapter. Use also -O to
|
||||
avoid discarding too many randomly matching reads.
|
||||
--discard-untrimmed, --trimmed-only
|
||||
Discard reads that do not contain an adapter.
|
||||
--discard-casava Discard reads that did not pass CASAVA filtering (header
|
||||
has :Y:).
|
||||
|
||||
Output:
|
||||
--quiet Print only error messages.
|
||||
--report {full,minimal}
|
||||
Which type of report to print: 'full' or 'minimal'.
|
||||
Default: full
|
||||
--json FILE Dump report in JSON format to FILE
|
||||
-o FILE, --output FILE
|
||||
Write trimmed reads to FILE. FASTQ or FASTA format is
|
||||
chosen depending on input. Summary report is sent to
|
||||
standard output. Use '{name}' for demultiplexing (see
|
||||
docs). Default: write to standard output
|
||||
--fasta Output FASTA to standard output even on FASTQ input.
|
||||
-Z Use compression level 1 for gzipped output files (faster,
|
||||
but uses more space)
|
||||
--info-file FILE Write information about each read and its adapter matches
|
||||
into FILE. See the documentation for the file format.
|
||||
-r FILE, --rest-file FILE
|
||||
When the adapter matches in the middle of a read, write
|
||||
the rest (after the adapter) to FILE.
|
||||
--wildcard-file FILE When the adapter has N wildcard bases, write adapter bases
|
||||
matching wildcard positions to FILE. (Inaccurate with
|
||||
indels.)
|
||||
--too-short-output FILE
|
||||
Write reads that are too short (according to length
|
||||
specified by -m) to FILE. Default: discard reads
|
||||
--too-long-output FILE
|
||||
Write reads that are too long (according to length
|
||||
specified by -M) to FILE. Default: discard reads
|
||||
--untrimmed-output FILE
|
||||
Write reads that do not contain any adapter to FILE.
|
||||
Default: output to same file as trimmed reads
|
||||
|
||||
Paired-end options:
|
||||
The -A/-G/-B/-U/-Q options work like their lowercase counterparts, but are
|
||||
applied to R2 (second read in pair)
|
||||
|
||||
-A ADAPTER 3' adapter to be removed from R2
|
||||
-G ADAPTER 5' adapter to be removed from R2
|
||||
-B ADAPTER 5'/3 adapter to be removed from R2
|
||||
-U LENGTH Remove LENGTH bases from R2
|
||||
-Q [5'CUTOFF,]3'CUTOFF
|
||||
Quality-trimming cutoff for R2. Default: same as for R1
|
||||
-p FILE, --paired-output FILE
|
||||
Write R2 to FILE.
|
||||
--pair-adapters Treat adapters given with -a/-A etc. as pairs. Either both
|
||||
or none are removed from each read pair.
|
||||
--pair-filter {any,both,first}
|
||||
Which of the reads in a paired-end read have to match the
|
||||
filtering criterion in order for the pair to be filtered.
|
||||
Default: any
|
||||
--interleaved Read and/or write interleaved paired-end reads.
|
||||
--untrimmed-paired-output FILE
|
||||
Write second read in a pair to this FILE when no adapter
|
||||
was found. Use with --untrimmed-output. Default: output to
|
||||
same file as trimmed reads
|
||||
--too-short-paired-output FILE
|
||||
Write second read in a pair to this file if pair is too
|
||||
short.
|
||||
--too-long-paired-output FILE
|
||||
Write second read in a pair to this file if pair is too
|
||||
long.
|
||||
|
||||
258
src/cutadapt/script.sh
Normal file
@@ -0,0 +1,258 @@
|
||||
#!/bin/bash
|
||||
|
||||
## VIASH START
|
||||
par_adapter='AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC;GGATCGGAAGAGCACACGTCTGAACTCCAGTCAC'
|
||||
par_input='src/cutadapt/test_data/se/a.fastq'
|
||||
par_report='full'
|
||||
par_json='false'
|
||||
par_fasta='false'
|
||||
par_info_file='false'
|
||||
par_debug='true'
|
||||
## VIASH END
|
||||
|
||||
function debug {
|
||||
[[ "$par_debug" == "true" ]] && echo "DEBUG: $@"
|
||||
}
|
||||
|
||||
output_dir=$(dirname "$par_output")
|
||||
[[ ! -d "$output_dir" ]] && mkdir -p "$output_dir"
|
||||
|
||||
# Init
|
||||
###########################################################
|
||||
|
||||
echo ">> Paired-end data or not?"
|
||||
|
||||
mode=""
|
||||
if [[ -z $par_input_r2 ]]; then
|
||||
mode="se"
|
||||
echo " Single end"
|
||||
input="$par_input"
|
||||
else
|
||||
echo " Paired end"
|
||||
mode="pe"
|
||||
input="$par_input $par_input_r2"
|
||||
fi
|
||||
|
||||
# Adapter arguments
|
||||
# - paired and single-end
|
||||
# - string and fasta
|
||||
###########################################################
|
||||
|
||||
function add_flags {
|
||||
local arg=$1
|
||||
local flag=$2
|
||||
local prefix=$3
|
||||
[[ -z $prefix ]] && prefix=""
|
||||
|
||||
# This function should not be called if the input is empty
|
||||
# but check for it just in case
|
||||
if [[ -z $arg ]]; then
|
||||
return
|
||||
fi
|
||||
|
||||
local output=""
|
||||
IFS=';' read -r -a array <<< "$arg"
|
||||
for a in "${array[@]}"; do
|
||||
output="$output $flag $prefix$a"
|
||||
done
|
||||
echo $output
|
||||
}
|
||||
|
||||
debug ">> Parsing arguments dealing with adapters"
|
||||
adapter_args=$(echo \
|
||||
${par_adapter:+$(add_flags "$par_adapter" "--adapter")} \
|
||||
${par_adapter_fasta:+$(add_flags "$par_adapter_fasta" "--adapter" "file:")} \
|
||||
${par_front:+$(add_flags "$par_front" "--front")} \
|
||||
${par_front_fasta:+$(add_flags "$par_front_fasta" "--front" "file:")} \
|
||||
${par_anywhere:+$(add_flags "$par_anywhere" "--anywhere")} \
|
||||
${par_anywhere_fasta:+$(add_flags "$par_anywhere_fasta" "--anywhere" "file:")} \
|
||||
${par_adapter_r2:+$(add_flags "$par_adapter_r2" "-A")} \
|
||||
${par_adapter_fasta_r2:+$(add_flags "$par_adapter_fasta_r2" "-A" "file:")} \
|
||||
${par_front_r2:+$(add_flags "$par_front_r2" "-G")} \
|
||||
${par_front_fasta_r2:+$(add_flags "$par_front_fasta_r2" "-G" "file:")} \
|
||||
${par_anywhere_r2:+$(add_flags "$par_anywhere_r2" "-B")} \
|
||||
${par_anywhere_fasta_r2:+$(add_flags "$par_anywhere_fasta_r2" "-B" "file:")} \
|
||||
)
|
||||
|
||||
debug "Arguments to cutadapt:"
|
||||
debug "$adapter_args"
|
||||
debug
|
||||
|
||||
# Paired-end options
|
||||
###########################################################
|
||||
echo ">> Parsing arguments for paired-end reads"
|
||||
[[ "$par_pair_adapters" == "false" ]] && unset par_pair_adapters
|
||||
[[ "$par_interleaved" == "false" ]] && unset par_interleaved
|
||||
|
||||
paired_args=$(echo \
|
||||
${par_pair_adapters:+--pair-adapters} \
|
||||
${par_pair_filter:+--pair-filter "${par_pair_filter}"} \
|
||||
${par_interleaved:+--interleaved}
|
||||
)
|
||||
debug "Arguments to cutadapt:"
|
||||
debug $paired_args
|
||||
debug
|
||||
|
||||
# Input arguments
|
||||
###########################################################
|
||||
echo ">> Parsing input arguments"
|
||||
[[ "$par_no_indels" == "true" ]] && unset par_no_indels
|
||||
[[ "$par_match_read_wildcards" == "false" ]] && unset par_match_read_wildcards
|
||||
[[ "$par_no_match_adapter_wildcards" == "true" ]] && unset par_no_match_adapter_wildcards
|
||||
[[ "$par_revcomp" == "false" ]] && unset par_revcomp
|
||||
|
||||
input_args=$(echo \
|
||||
${par_error_rate:+--error-rate "${par_error_rate}"} \
|
||||
${par_no_indels:+--no-indels} \
|
||||
${par_times:+--times "${par_times}"} \
|
||||
${par_overlap:+--overlap "${par_overlap}"} \
|
||||
${par_match_read_wildcards:+--match-read-wildcards} \
|
||||
${par_no_match_adapter_wildcards:+--no-match-adapter-wildcards} \
|
||||
${par_action:+--action "${par_action}"} \
|
||||
${par_revcomp:+--revcomp} \
|
||||
)
|
||||
debug "Arguments to cutadapt:"
|
||||
debug $input_args
|
||||
debug
|
||||
|
||||
# Read modifications
|
||||
###########################################################
|
||||
echo ">> Parsing read modification arguments"
|
||||
[[ "$par_poly_a" == "false" ]] && unset par_poly_a
|
||||
[[ "$par_trim_n" == "false" ]] && unset par_trim_n
|
||||
[[ "$par_zero_cap" == "false" ]] && unset par_zero_cap
|
||||
|
||||
mod_args=$(echo \
|
||||
${par_cut:+--cut "${par_cut}"} \
|
||||
${par_cut_r2:+-U "${par_cut_r2}"} \
|
||||
${par_nextseq_trim:+--nextseq-trim "${par_nextseq_trim}"} \
|
||||
${par_quality_cutoff:+--quality-cutoff "${par_quality_cutoff}"} \
|
||||
${par_quality_cutoff_r2:+-Q "${par_quality_cutoff_r2}"} \
|
||||
${par_quality_base:+--quality-base "${par_quality_base}"} \
|
||||
${par_poly_a:+--poly-a} \
|
||||
${par_length:+--length "${par_length}"} \
|
||||
${par_trim_n:+--trim-n} \
|
||||
${par_length_tag:+--length-tag "${par_length_tag}"} \
|
||||
${par_strip_suffix:+--strip-suffix "${par_strip_suffix}"} \
|
||||
${par_prefix:+--prefix "${par_prefix}"} \
|
||||
${par_suffix:+--suffix "${par_suffix}"} \
|
||||
${par_rename:+--rename "${par_rename}"} \
|
||||
${par_zero_cap:+--zero-cap} \
|
||||
)
|
||||
debug "Arguments to cutadapt:"
|
||||
debug $mod_args
|
||||
debug
|
||||
|
||||
# Filtering of processed reads arguments
|
||||
###########################################################
|
||||
echo ">> Filtering of processed reads arguments"
|
||||
[[ "$par_discard_trimmed" == "false" ]] && unset par_discard_trimmed
|
||||
[[ "$par_discard_untrimmed" == "false" ]] && unset par_discard_untrimmed
|
||||
[[ "$par_discard_casava" == "false" ]] && unset par_discard_casava
|
||||
|
||||
# The minimum and maximum length arguments are passed to cutadapt unchanged
|
||||
|
||||
filter_args=$(echo \
|
||||
${par_minimum_length:+--minimum-length "${par_minimum_length}"} \
|
||||
${par_maximum_length:+--maximum-length "${par_maximum_length}"} \
|
||||
${par_max_n:+--max-n "${par_max_n}"} \
|
||||
${par_max_expected_errors:+--max-expected-errors "${par_max_expected_errors}"} \
|
||||
${par_max_average_error_rate:+--max-average-error-rate "${par_max_average_error_rate}"} \
|
||||
${par_discard_trimmed:+--discard-trimmed} \
|
||||
${par_discard_untrimmed:+--discard-untrimmed} \
|
||||
${par_discard_casava:+--discard-casava} \
|
||||
)
|
||||
debug "Arguments to cutadapt:"
|
||||
debug $filter_args
|
||||
debug
|
||||
|
||||
# Optional output arguments
|
||||
###########################################################
|
||||
echo ">> Optional arguments"
|
||||
[[ "$par_json" == "false" ]] && unset par_json
|
||||
[[ "$par_fasta" == "false" ]] && unset par_fasta
|
||||
[[ "$par_info_file" == "false" ]] && unset par_info_file
|
||||
|
||||
optional_output_args=$(echo \
|
||||
${par_report:+--report "${par_report}"} \
|
||||
${par_json:+--json "report.json"} \
|
||||
${par_fasta:+--fasta} \
|
||||
${par_info_file:+--info-file "info.txt"} \
|
||||
)
|
||||
|
||||
debug "Arguments to cutadapt:"
|
||||
debug $optional_output_args
|
||||
debug
|
||||
|
||||
# Output arguments
|
||||
# We write the output to a directory rather than
|
||||
# individual files.
|
||||
###########################################################
|
||||
|
||||
if [[ -z $par_fasta ]]; then
|
||||
ext="fastq"
|
||||
else
|
||||
ext="fasta"
|
||||
fi
|
||||
|
||||
demultiplex_mode="$par_demultiplex_mode"
|
||||
if [[ $mode == "se" ]]; then
|
||||
if [[ "$demultiplex_mode" == "unique_dual" ]] || [[ "$demultiplex_mode" == "combinatorial_dual" ]]; then
|
||||
echo "Demultiplexing dual indexes is not possible with single-end data."
|
||||
exit 1
|
||||
fi
|
||||
prefix="trimmed_"
|
||||
if [[ ! -z "$demultiplex_mode" ]]; then
|
||||
prefix="{name}_"
|
||||
fi
|
||||
output_args=$(echo \
|
||||
--output "$output_dir/${prefix}001.$ext" \
|
||||
)
|
||||
else
|
||||
demultiplex_indicator_r1='{name}_'
|
||||
demultiplex_indicator_r2=$demultiplex_indicator_r1
|
||||
if [[ "$demultiplex_mode" == "combinatorial_dual" ]]; then
|
||||
demultiplex_indicator_r1='{name1}_{name2}_'
|
||||
demultiplex_indicator_r2='{name1}_{name2}_'
|
||||
fi
|
||||
prefix_r1="trimmed_"
|
||||
prefix_r2="trimmed_"
|
||||
if [[ ! -z "$demultiplex_mode" ]]; then
|
||||
prefix_r1=$demultiplex_indicator_r1
|
||||
prefix_r2=$demultiplex_indicator_r2
|
||||
fi
|
||||
output_args=$(echo \
|
||||
--output "$output_dir/${prefix_r1}R1_001.$ext" \
|
||||
--paired-output "$output_dir/${prefix_r2}R2_001.$ext" \
|
||||
)
|
||||
fi
|
||||
|
||||
debug "Arguments to cutadapt:"
|
||||
debug $output_args
|
||||
debug
|
||||
|
||||
# Full CLI
|
||||
# Set the --cores argument to 0 unless meta_cpus is set
|
||||
###########################################################
|
||||
echo ">> Running cutadapt"
|
||||
par_cpus=0
|
||||
[[ ! -z $meta_cpus ]] && par_cpus=$meta_cpus
|
||||
|
||||
cli=$(echo \
|
||||
$input \
|
||||
$adapter_args \
|
||||
$paired_args \
|
||||
$input_args \
|
||||
$mod_args \
|
||||
$filter_args \
|
||||
$optional_output_args \
|
||||
$output_args \
|
||||
--cores $par_cpus
|
||||
)
|
||||
|
||||
debug ">> Full CLI to be run:"
|
||||
debug cutadapt $cli | sed -e 's/--/\r\n --/g'
|
||||
debug
|
||||
|
||||
cutadapt $cli
|
||||
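The script above assembles the command line as flat strings via `$(echo ...)`, which relies on word splitting and would break on values containing spaces. One possible alternative, shown here only as a sketch, is to collect arguments in a bash array. The function name `add_flags_array` and the sample values are assumptions made for illustration; the name reference (`local -n`) requires bash 4.3 or newer.

```shell
#!/bin/bash

# Hypothetical variant of add_flags: expand a ';'-separated value into
# repeated "<flag> <prefix><value>" pairs and append them to the array
# whose name is given as the first argument.
add_flags_array() {
  local -n _out=$1          # name reference to the caller's array
  local arg=$2 flag=$3 prefix=${4:-}
  [[ -z $arg ]] && return 0
  local values value
  IFS=';' read -r -a values <<< "$arg"
  for value in "${values[@]}"; do
    _out+=("$flag" "$prefix$value")
  done
}

args=()
add_flags_array args 'AAAAA;CCCCC' --adapter
add_flags_array args 'my adapters.fasta' --adapter 'file:'

# Each array element stays intact, including the file name with a space
printf '%s\n' "${args[@]}"
```

The array could then be expanded as `cutadapt "${args[@]}"`, so that every element reaches cutadapt as exactly one argument.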
261
src/cutadapt/test.sh
Normal file
@@ -0,0 +1,261 @@
|
||||
#!/bin/bash
|
||||
|
||||
set -eo pipefail
|
||||
|
||||
#############################################
|
||||
# helper functions
|
||||
assert_file_exists() {
|
||||
[ -f "$1" ] || { echo "File '$1' does not exist" && exit 1; }
|
||||
}
|
||||
assert_file_doesnt_exist() {
|
||||
[ ! -f "$1" ] || { echo "File '$1' exists but shouldn't" && exit 1; }
|
||||
}
|
||||
assert_file_empty() {
|
||||
[ ! -s "$1" ] || { echo "File '$1' is not empty but should be" && exit 1; }
|
||||
}
|
||||
assert_file_not_empty() {
|
||||
[ -s "$1" ] || { echo "File '$1' is empty but shouldn't be" && exit 1; }
|
||||
}
|
||||
assert_file_contains() {
|
||||
grep -q "$2" "$1" || { echo "File '$1' does not contain '$2'" && exit 1; }
|
||||
}
|
||||
assert_file_not_contains() {
|
||||
! grep -q "$2" "$1" || { echo "File '$1' contains '$2' but shouldn't" && exit 1; }
|
||||
}
|
||||
#############################################
|
||||
|
||||
mkdir test_multiple_output
|
||||
cd test_multiple_output
|
||||
|
||||
echo "#############################################"
|
||||
echo "> Run cutadapt with multiple outputs"
|
||||
|
||||
cat > example.fa <<'EOF'
|
||||
>read1
|
||||
MYSEQUENCEADAPTER
|
||||
>read2
|
||||
MYSEQUENCEADAP
|
||||
>read3
|
||||
MYSEQUENCEADAPTERSOMETHINGELSE
|
||||
>read4
|
||||
MYSEQUENCEADABTER
|
||||
>read5
|
||||
MYSEQUENCEADAPTR
|
||||
>read6
|
||||
MYSEQUENCEADAPPTER
|
||||
>read7
|
||||
ADAPTERMYSEQUENCE
|
||||
>read8
|
||||
PTERMYSEQUENCE
|
||||
>read9
|
||||
SOMETHINGADAPTERMYSEQUENCE
|
||||
EOF
|
||||
|
||||
"$meta_executable" \
|
||||
--report minimal \
|
||||
--output "out_test/*.fasta" \
|
||||
--adapter ADAPTER \
|
||||
--input example.fa \
|
||||
--fasta \
|
||||
--demultiplex_mode single \
|
||||
--no_match_adapter_wildcards \
|
||||
--json
|
||||
|
||||
echo ">> Checking output"
|
||||
assert_file_exists "report.json"
|
||||
assert_file_exists "out_test/1_001.fasta"
|
||||
assert_file_exists "out_test/unknown_001.fasta"
|
||||
|
||||
cd ..
|
||||
echo
|
||||
|
||||
#############################################
|
||||
mkdir test_simple_single_end
|
||||
cd test_simple_single_end
|
||||
|
||||
echo "#############################################"
|
||||
echo "> Run cutadapt on single-end data"
|
||||
|
||||
cat > example.fa <<'EOF'
|
||||
>read1
|
||||
MYSEQUENCEADAPTER
|
||||
>read2
|
||||
MYSEQUENCEADAP
|
||||
>read3
|
||||
MYSEQUENCEADAPTERSOMETHINGELSE
|
||||
>read4
|
||||
MYSEQUENCEADABTER
|
||||
>read5
|
||||
MYSEQUENCEADAPTR
|
||||
>read6
|
||||
MYSEQUENCEADAPPTER
|
||||
>read7
|
||||
ADAPTERMYSEQUENCE
|
||||
>read8
|
||||
PTERMYSEQUENCE
|
||||
>read9
|
||||
SOMETHINGADAPTERMYSEQUENCE
|
||||
EOF
|
||||
|
||||
"$meta_executable" \
|
||||
--report minimal \
|
||||
--output "out_test1/*.fasta" \
|
||||
--adapter ADAPTER \
|
||||
--input example.fa \
|
||||
--demultiplex_mode single \
|
||||
--fasta \
|
||||
--no_match_adapter_wildcards \
|
||||
--json
|
||||
|
||||
echo ">> Checking output"
|
||||
assert_file_exists "report.json"
|
||||
assert_file_exists "out_test1/1_001.fasta"
|
||||
assert_file_exists "out_test1/unknown_001.fasta"
|
||||
|
||||
echo ">> Check if output is empty"
|
||||
assert_file_not_empty "report.json"
|
||||
assert_file_not_empty "out_test1/1_001.fasta"
|
||||
assert_file_not_empty "out_test1/unknown_001.fasta"
|
||||
|
||||
echo ">> Check contents"
|
||||
for i in 1 2 3 7 9; do
|
||||
assert_file_contains "out_test1/1_001.fasta" ">read$i"
|
||||
done
|
||||
for i in 4 5 6 8; do
|
||||
assert_file_contains "out_test1/unknown_001.fasta" ">read$i"
|
||||
done
|
||||
|
||||
cd ..
|
||||
echo
|
||||
|
||||
#############################################
|
||||
mkdir test_multiple_single_end
|
||||
cd test_multiple_single_end
|
||||
|
||||
echo "#############################################"
|
||||
echo "> Run with a combination of inputs"
|
||||
|
||||
cat > example.fa <<'EOF'
|
||||
>read1
|
||||
ACGTACGTACGTAAAAA
|
||||
>read2
|
||||
ACGTACGTACGTCCCCC
|
||||
>read3
|
||||
ACGTACGTACGTGGGGG
|
||||
>read4
|
||||
ACGTACGTACGTTTTTT
|
||||
EOF
|
||||
|
||||
cat > adapters1.fasta <<'EOF'
|
||||
>adapter1
|
||||
CCCCC
|
||||
EOF
|
||||
|
||||
cat > adapters2.fasta <<'EOF'
|
||||
>adapter2
|
||||
GGGGG
|
||||
EOF
|
||||
|
||||
"$meta_executable" \
|
||||
--report minimal \
|
||||
--output "out_test2/*.fasta" \
|
||||
--adapter AAAAA \
|
||||
--adapter_fasta adapters1.fasta \
|
||||
--adapter_fasta adapters2.fasta \
|
||||
--demultiplex_mode single \
|
||||
--input example.fa \
|
||||
--fasta \
|
||||
--json
|
||||
|
||||
echo ">> Checking output"
|
||||
assert_file_exists "report.json"
|
||||
assert_file_exists "out_test2/1_001.fasta"
|
||||
assert_file_exists "out_test2/adapter1_001.fasta"
|
||||
assert_file_exists "out_test2/adapter2_001.fasta"
|
||||
assert_file_exists "out_test2/unknown_001.fasta"
|
||||
|
||||
echo ">> Check if output is empty"
|
||||
assert_file_not_empty "report.json"
|
||||
assert_file_not_empty "out_test2/1_001.fasta"
|
||||
assert_file_not_empty "out_test2/adapter1_001.fasta"
|
||||
assert_file_not_empty "out_test2/adapter2_001.fasta"
|
||||
assert_file_not_empty "out_test2/unknown_001.fasta"
|
||||
|
||||
echo ">> Check contents"
|
||||
assert_file_contains "out_test2/1_001.fasta" ">read1"
|
||||
assert_file_contains "out_test2/adapter1_001.fasta" ">read2"
|
||||
assert_file_contains "out_test2/adapter2_001.fasta" ">read3"
|
||||
assert_file_contains "out_test2/unknown_001.fasta" ">read4"
|
||||
|
||||
cd ..
|
||||
echo
|
||||
|
||||
#############################################
|
||||
mkdir test_simple_paired_end
|
||||
cd test_simple_paired_end
|
||||
|
||||
echo "#############################################"
|
||||
echo "> Run cutadapt on paired-end data"
|
||||
|
||||
cat > example_R1.fastq <<'EOF'
|
||||
@read1
|
||||
ACGTACGTACGTAAAAA
|
||||
+
|
||||
IIIIIIIIIIIIIIIII
|
||||
@read2
|
||||
ACGTACGTACGTCCCCC
|
||||
+
|
||||
IIIIIIIIIIIIIIIII
|
||||
EOF
|
||||
|
||||
cat > example_R2.fastq <<'EOF'
|
||||
@read1
|
||||
ACGTACGTACGTGGGGG
|
||||
+
|
||||
IIIIIIIIIIIIIIIII
|
||||
@read2
|
||||
ACGTACGTACGTTTTTT
|
||||
+
|
||||
IIIIIIIIIIIIIIIII
|
||||
EOF
|
||||
|
||||
"$meta_executable" \
|
||||
--report minimal \
|
||||
--output "out_test3/*.fastq" \
|
||||
--adapter AAAAA \
|
||||
--adapter_r2 GGGGG \
|
||||
--input example_R1.fastq \
|
||||
--input_r2 example_R2.fastq \
|
||||
--quality_cutoff 20 \
|
||||
--demultiplex_mode unique_dual \
|
||||
--json \
|
||||
---cpus 1
|
||||
|
||||
echo ">> Checking output"
|
||||
assert_file_exists "report.json"
|
||||
assert_file_exists "out_test3/1_R1_001.fastq"
|
||||
assert_file_exists "out_test3/1_R2_001.fastq"
|
||||
assert_file_exists "out_test3/unknown_R1_001.fastq"
|
||||
assert_file_exists "out_test3/unknown_R2_001.fastq"
|
||||
|
||||
echo ">> Check if output is empty"
|
||||
assert_file_not_empty "report.json"
|
||||
assert_file_not_empty "out_test3/1_R1_001.fastq"
|
||||
assert_file_not_empty "out_test3/1_R2_001.fastq"
|
||||
assert_file_not_empty "out_test3/unknown_R1_001.fastq"
|
||||
|
||||
echo ">> Check contents"
|
||||
assert_file_contains "out_test3/1_R1_001.fastq" "@read1"
|
||||
assert_file_contains "out_test3/1_R2_001.fastq" "@read1"
|
||||
assert_file_contains "out_test3/unknown_R1_001.fastq" "@read2"
|
||||
assert_file_contains "out_test3/unknown_R2_001.fastq" "@read2"
|
||||
|
||||
cd ..
|
||||
echo
|
||||
|
||||
#############################################
|
||||
|
||||
echo "#############################################"
|
||||
echo "> Test successful"
|
||||
|
||||
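Negative assertions need care under `set -e`: a helper written as `grep -q ... && { ...; exit 1; }` returns a non-zero status when the pattern is absent, which is the passing case, and that status aborts the calling script. The sketch below shows one way to write such a helper so that it returns 0 on success. It uses `return 1` instead of `exit 1` purely so that the failing case can be observed here; the temporary file and its contents are made up for the demonstration.

```shell
#!/bin/bash
set -eo pipefail

# Returns 0 when the pattern is absent; reports and returns 1 when present.
# The leading `!` inverts grep's status, so the passing case yields 0.
assert_file_not_contains() {
  ! grep -q "$2" "$1" || { echo "File '$1' contains '$2' but shouldn't"; return 1; }
}

tmp=$(mktemp)
printf '>read1\nACGT\n' > "$tmp"

# Passing case: the script continues because the function returns 0
assert_file_not_contains "$tmp" ">read2"

# Failing case: checked inside `if` so that errexit does not stop the demo
if assert_file_not_contains "$tmp" ">read1"; then
  result="missed"
else
  result="caught"
fi
echo "$result"
rm -f "$tmp"
```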
199
src/falco/config.vsh.yaml
Normal file
@@ -0,0 +1,199 @@
|
||||
name: falco
|
||||
description: A C++ drop-in replacement of FastQC to assess the quality of sequence read data
|
||||
keywords: [qc, fastqc, sequencing]
|
||||
links:
|
||||
documentation: https://falco.readthedocs.io/en/latest/
|
||||
repository: https://github.com/smithlabcode/falco
|
||||
references:
|
||||
doi: 10.12688/f1000research.21142.2
|
||||
license: GPL-3.0
|
||||
requirements:
|
||||
commands: [falco]
|
||||
authors:
|
||||
- __merge__: /src/_authors/toni_verbeiren.yaml
|
||||
roles: [ author, maintainer ]
|
||||
|
||||
# Notes:
|
||||
# - falco has arguments such as -subsample; we changed these to --subsample
|
||||
# - The outdir argument is not required
|
||||
# - The input argument in falco is positional but we changed this to --input
|
||||
argument_groups:
|
||||
- name: Input arguments
|
||||
arguments:
|
||||
- name: --input
|
||||
required: true
|
||||
type: file
|
||||
multiple: true
|
||||
description: input fastq files
|
||||
example: input1.fastq;input2.fastq
|
||||
|
||||
- name: Run arguments
|
||||
arguments:
|
||||
- name: --nogroup
|
||||
type: boolean_true
|
||||
description: |
|
||||
Disable grouping of bases for reads >50bp.
|
||||
All reports will show data for every base in
|
||||
the read. WARNING: When using this option,
|
||||
your plots may end up a ridiculous size. You
|
||||
have been warned!
|
||||
- name: --contaminents
|
||||
type: file
|
||||
description: |
|
||||
Specifies a non-default file which contains
|
||||
the list of contaminants to screen
|
||||
overrepresented sequences against. The file
|
||||
must contain sets of named contaminants in
|
||||
the form name[tab]sequence. Lines prefixed
|
||||
with a hash will be ignored. Default:
|
||||
https://github.com/smithlabcode/falco/blob/v1.2.2/Configuration/contaminant_list.txt
|
||||
- name: --adapters
|
||||
type: file
|
||||
description: |
|
||||
Specifies a non-default file which contains
|
||||
the list of adapter sequences which will be
|
||||
explicitly searched against the library. The
|
||||
file must contain sets of named adapters in
|
||||
the form name[tab]sequence. Lines prefixed
|
||||
with a hash will be ignored. Default:
|
||||
https://github.com/smithlabcode/falco/blob/v1.2.2/Configuration/adapter_list.txt
|
||||
- name: --limits
|
||||
type: file
|
||||
description: |
|
||||
Specifies a non-default file which contains
|
||||
a set of criteria which will be used to
|
||||
determine the warn/error limits for the
|
||||
various modules. This file can also be used
|
||||
to selectively remove some modules from the
|
||||
output all together. The format needs to
|
||||
mirror the default limits.txt file found in
|
||||
the Configuration folder. Default:
|
||||
https://github.com/smithlabcode/falco/blob/v1.2.2/Configuration/limits.txt
|
||||
- name: --subsample
|
||||
alternatives: [-s]
|
||||
type: integer
|
||||
example: 10
|
||||
description: |
|
||||
[Falco only] makes falco faster (but
|
||||
possibly less accurate) by only processing
|
||||
reads that are a multiple of this value (using
|
||||
0-based indexing to number reads).
|
||||
- name: --bisulfite
|
||||
alternatives: [-b]
|
||||
type: boolean_true
|
||||
description: |
|
||||
[Falco only] reads are whole genome
|
||||
bisulfite sequencing, and more Ts and fewer
|
||||
Cs are therefore expected and will be
|
||||
accounted for in base content.
|
||||
- name: --reverse_complement
|
||||
alternatives: [-r]
|
||||
type: boolean_true
|
||||
description: |
|
||||
[Falco only] The input is a
|
||||
reverse-complement. All modules will be
|
||||
tested by swapping A/T and C/G
|
||||
|
||||
- name: Output arguments
|
||||
arguments:
|
||||
- name: --outdir
|
||||
alternatives: [-o]
|
||||
required: true
|
||||
type: file
|
||||
direction: output
|
||||
description: |
|
||||
Create all output files in the specified
|
||||
output directory. FALCO-SPECIFIC: If the
|
||||
directory does not exist, the program will
|
||||
create it.
|
||||
example: output
|
||||
- name: --format
|
||||
type: string
|
||||
choices: [bam, sam, bam_mapped, sam_mapped, fastq, fq, fastq.gz, fq.gz]
|
||||
alternatives: ["-f"]
|
||||
description: |
|
||||
Bypasses the normal sequence file format
|
||||
detection and forces the program to use the
|
||||
specified format. Valid formats are bam, sam,
|
||||
bam_mapped, sam_mapped, fastq, fq, fastq.gz
|
||||
or fq.gz.
|
||||
- name: --data_filename
|
||||
alternatives: [-D]
|
||||
type: file
|
||||
direction: output
|
||||
description: |
|
||||
[Falco only] Specify filename for FastQC
|
||||
data output (TXT). If not specified, it will
|
||||
be called fastqc_data.txt in either the input
|
||||
file's directory or the one specified in the
|
||||
--outdir flag. Only available when running
|
||||
falco with a single input.
|
||||
- name: --report_filename
|
||||
alternatives: [-R]
|
||||
type: file
|
||||
direction: output
|
||||
description: |
|
||||
[Falco only] Specify filename for FastQC
|
||||
report output (HTML). If not specified, it
|
||||
will be called fastqc_report.html in either
|
||||
the input file's directory or the one
|
||||
specified in the --outdir flag. Only
|
||||
available when running falco with a single
|
||||
input.
|
||||
- name: --summary_filename
|
||||
alternatives: [-S]
|
||||
type: file
|
||||
direction: output
|
||||
description: |
|
||||
[Falco only] Specify filename for the short
|
||||
summary output (TXT). If not specified, it
|
||||
will be called summary.txt in either
|
||||
the input file's directory or the one
|
||||
specified in the --outdir flag. Only
|
||||
available when running falco with a single
|
||||
input.
|
||||
|
||||
# Arguments not taken into account:
|
||||
#
|
||||
# -skip-data [Falco only] Do not create FastQC data text
|
||||
# file.
|
||||
# -skip-report [Falco only] Do not create FastQC report
|
||||
# HTML file.
|
||||
# -skip-summary [Falco only] Do not create FastQC summary
|
||||
# file
|
||||
# -K, -add-call [Falco only] add the command call to
|
||||
# FastQC data output and FastQC report HTML
|
||||
# (this may break the parse of fastqc_data.txt
|
||||
# in programs that are very strict about the
|
||||
# FastQC output format).
|
||||
|
||||
resources:
|
||||
- type: bash_script
|
||||
path: script.sh
|
||||
|
||||
test_resources:
|
||||
- type: bash_script
|
||||
path: test.sh
|
||||
|
||||
engines:
|
||||
- type: docker
|
||||
image: debian:trixie-slim
|
||||
setup:
|
||||
- type: apt
|
||||
packages: [wget, build-essential, g++, zlib1g-dev, procps]
|
||||
- type: docker
|
||||
run: |
|
||||
wget https://github.com/smithlabcode/falco/releases/download/v1.2.2/falco-1.2.2.tar.gz -O /tmp/falco.tar.gz && \
|
||||
cd /tmp && \
|
||||
tar xvf falco.tar.gz && \
|
||||
cd falco-1.2.2 && \
|
||||
./configure && \
|
||||
make all && \
|
||||
make install
|
||||
- type: docker
|
||||
run: |
|
||||
echo "falco: \"$(falco -v | sed -n 's/^falco //p')\"" > /var/software_versions.txt
|
||||
|
||||
runners:
|
||||
- type: executable
|
||||
- type: nextflow
|
||||
156
src/falco/help.txt
Normal file
@@ -0,0 +1,156 @@
|
||||
Usage: falco [OPTIONS] <seqfile1> <seqfile2> ...
|
||||
|
||||
Options:
|
||||
-h, --help Print this help file and exit
|
||||
-v, --version Print the version of the program and exit
|
||||
-o, --outdir Create all output files in the specified
|
||||
output directory. FALCO-SPECIFIC: If the
|
||||
directory does not exists, the program will
|
||||
create it. If this option is not set then
|
||||
the output file for each sequence file is
|
||||
created in the same directory as the
|
||||
sequence file which was processed.
|
||||
--casava [IGNORED BY FALCO] Files come from raw
|
||||
casava output. Files in the same sample
|
||||
group (differing only by the group number)
|
||||
will be analysed as a set rather than
|
||||
individually. Sequences with the filter flag
|
||||
set in the header will be excluded from the
|
||||
analysis. Files must have the same names
|
||||
given to them by casava (including being
|
||||
gzipped and ending with .gz) otherwise they
|
||||
won't be grouped together correctly.
|
||||
--nano [IGNORED BY FALCO] Files come from nanopore
|
||||
sequences and are in fast5 format. In this
|
||||
mode you can pass in directories to process
|
||||
and the program will take in all fast5 files
|
||||
within those directories and produce a
|
||||
single output file from the sequences found
|
||||
in all files.
|
||||
--nofilter [IGNORED BY FALCO] If running with --casava
|
||||
then don't remove read flagged by casava as
|
||||
poor quality when performing the QC
|
||||
analysis.
|
||||
--extract [ALWAYS ON IN FALCO] If set then the zipped
|
||||
output file will be uncompressed in the same
|
||||
directory after it has been created. By
|
||||
default this option will be set if fastqc is
|
||||
run in non-interactive mode.
|
||||
-j, --java [IGNORED BY FALCO] Provides the full path to
|
||||
the java binary you want to use to launch
|
||||
fastqc. If not supplied then java is assumed
|
||||
to be in your path.
|
||||
--noextract [IGNORED BY FALCO] Do not uncompress the
|
||||
output file after creating it. You should
|
||||
set this option if you do not wish to
|
||||
uncompress the output when running in
|
||||
non-interactive mode.
|
||||
--nogroup Disable grouping of bases for reads >50bp.
|
||||
All reports will show data for every base in
|
||||
the read. WARNING: When using this option,
|
||||
your plots may end up a ridiculous size. You
|
||||
have been warned!
|
||||
--min_length [NOT YET IMPLEMENTED IN FALCO] Sets an
|
||||
artificial lower limit on the length of the
|
||||
sequence to be shown in the report. As long
|
||||
as you set this to a value greater or equal
|
||||
to your longest read length then this will
|
||||
be the sequence length used to create your
|
||||
read groups. This can be useful for making
|
||||
directly comaparable statistics from
|
||||
datasets with somewhat variable read
|
||||
lengths.
|
||||
-f, --format Bypasses the normal sequence file format
|
||||
detection and forces the program to use the
|
||||
specified format. Validformats are bam, sam,
|
||||
bam_mapped, sam_mapped, fastq, fq, fastq.gz
|
||||
or fq.gz.
|
||||
-t, --threads [NOT YET IMPLEMENTED IN FALCO] Specifies the
|
||||
number of files which can be processed
|
||||
simultaneously. Each thread will be
|
||||
allocated 250MB of memory so you shouldn't
|
||||
run more threads than your available memory
|
||||
will cope with, and not more than 6 threads
|
||||
on a 32 bit machine [1]
|
||||
-c, --contaminants Specifies a non-default file which contains
|
||||
the list of contaminants to screen
|
||||
overrepresented sequences against. The file
|
||||
must contain sets of named contaminants in
|
||||
the form name[tab]sequence. Lines prefixed
|
||||
with a hash will be ignored. Default:
|
||||
/tmp/falco-1.2.2/Configuration/contaminant_list.txt
|
||||
-a, --adapters Specifies a non-default file which contains
|
||||
the list of adapter sequences which will be
|
||||
explicity searched against the library. The
|
||||
file must contain sets of named adapters in
|
||||
the form name[tab]sequence. Lines prefixed
|
||||
with a hash will be ignored. Default:
|
||||
/tmp/falco-1.2.2/Configuration/adapter_list.txt
|
||||
-l, --limits Specifies a non-default file which contains
|
||||
a set of criteria which will be used to
|
||||
determine the warn/error limits for the
|
||||
various modules. This file can also be used
|
||||
to selectively remove some modules from the
|
||||
output all together. The format needs to
|
||||
mirror the default limits.txt file found in
|
||||
the Configuration folder. Default:
|
||||
/tmp/falco-1.2.2/Configuration/limits.txt
|
||||
-k, --kmers [IGNORED BY FALCO AND ALWAYS SET TO 7]
|
||||
Specifies the length of Kmer to look for in
|
||||
the Kmer content module. Specified Kmer
|
||||
length must be between 2 and 10. Default
|
||||
length is 7 if not specified.
|
||||
-q, --quiet Supress all progress messages on stdout and
|
||||
only report errors.
|
||||
-d, --dir [IGNORED: FALCO DOES NOT CREATE TMP FILES]
|
||||
Selects a directory to be used for temporary
|
||||
files written when generating report images.
|
||||
Defaults to system temp directory if not
|
||||
specified.
|
||||
-s, -subsample [Falco only] makes falco faster (but
|
||||
possibly less accurate) by only processing
|
||||
reads that are multiple of this value (using
|
||||
0-based indexing to number reads). [1]
|
||||
-b, -bisulfite [Falco only] reads are whole genome
|
||||
bisulfite sequencing, and more Ts and fewer
|
||||
Cs are therefore expected and will be
|
||||
accounted for in base content.
|
||||
-r, -reverse-complement [Falco only] The input is a
|
||||
reverse-complement. All modules will be
|
||||
tested by swapping A/T and C/G
|
||||
-skip-data [Falco only] Do not create FastQC data text
|
||||
file.
|
||||
-skip-report [Falco only] Do not create FastQC report
|
||||
HTML file.
|
||||
-skip-summary [Falco only] Do not create FastQC summary
|
||||
file
|
||||
-D, -data-filename [Falco only] Specify filename for FastQC
|
||||
data output (TXT). If not specified, it will
|
||||
be called fastq_data.txt in either the input
|
||||
file's directory or the one specified in the
|
||||
--output flag. Only available when running
|
||||
falco with a single input.
|
||||
-R, -report-filename [Falco only] Specify filename for FastQC
|
||||
report output (HTML). If not specified, it
|
||||
will be called fastq_report.html in either
|
||||
the input file's directory or the one
|
||||
specified in the --output flag. Only
|
||||
available when running falco with a single
|
||||
input.
|
||||
-S, -summary-filename [Falco only] Specify filename for the short
|
||||
summary output (TXT). If not specified, it
|
||||
will be called fastq_report.html in either
|
||||
the input file's directory or the one
|
||||
specified in the --output flag. Only
|
||||
available when running falco with a single
|
||||
input.
|
||||
-K, -add-call [Falco only] add the command call call to
|
||||
FastQC data output and FastQC report HTML
|
||||
(this may break the parse of fastqc_data.txt
|
||||
in programs that are very strict about the
|
||||
FastQC output format).
|
||||
|
||||
Help options:
|
||||
-?, -help print this help message
|
||||
-about print about message
|
||||
|
||||
24
src/falco/script.sh
Normal file
@@ -0,0 +1,24 @@
|
||||
#!/bin/bash
|
||||
|
||||
set -eo pipefail
|
||||
|
||||
[[ "$par_nogroup" == "false" ]] && unset par_nogroup
|
||||
[[ "$par_bisulfite" == "false" ]] && unset par_bisulfite
|
||||
[[ "$par_reverse_complement" == "false" ]] && unset par_reverse_complement
|
||||
|
||||
IFS=";" read -ra input <<< "$par_input"
|
||||
|
||||
falco \
|
||||
${par_nogroup:+--nogroup} \
|
||||
${par_contaminants:+--contaminants "$par_contaminants"} \
|
||||
${par_adapters:+--adapters "$par_adapters"} \
|
||||
${par_limits:+--limits "$par_limits"} \
|
||||
${par_subsample:+-subsample "$par_subsample"} \
|
||||
${par_bisulfite:+-bisulfite} \
|
||||
${par_reverse_complement:+-reverse-complement} \
|
||||
${par_outdir:+--outdir "$par_outdir"} \
|
||||
${par_format:+--format "$par_format"} \
|
||||
${par_data_filename:+-data-filename "$par_data_filename"} \
|
||||
${par_report_filename:+-report-filename "$par_report_filename"} \
|
||||
${par_summary_filename:+-summary-filename "$par_summary_filename"} \
|
||||
"${input[@]}"
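# Illustration only, not part of the component: the same argument handling can
# be sketched with a bash array, which keeps paths containing spaces intact.
# The function name build_falco_args is hypothetical; the par_* variables follow
# the naming used above, and only a few of the options are shown.

```bash
#!/bin/bash
# Print the falco arguments, one per line, built from viash-style par_* variables.
build_falco_args() {
  local args=()
  [[ "$par_nogroup" == "true" ]] && args+=(--nogroup)
  [[ -n "$par_subsample" ]] && args+=(-subsample "$par_subsample")
  [[ -n "$par_outdir" ]] && args+=(--outdir "$par_outdir")

  # Split the ';'-separated input list into separate array elements.
  local inputs=()
  IFS=";" read -ra inputs <<< "$par_input"
  args+=("${inputs[@]}")

  printf '%s\n' "${args[@]}"
}

# Usage: note the spaces in the output directory and in the first input path.
par_nogroup="false"
par_subsample="10"
par_outdir="my output"
par_input="a 1.fastq;b.fastq"
build_falco_args
```

# In a real script the array would be passed on as: falco "${args[@]}"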
|
||||
79
src/falco/test.sh
Normal file
@@ -0,0 +1,79 @@
|
||||
#!/bin/bash
|
||||
|
||||
set -e
|
||||
|
||||
echo "> Prepare test data"
|
||||
|
||||
# We use data from this repo: https://github.com/hartwigmedical/testData
|
||||
echo ">> Fetching and preparing test data"
|
||||
fastq1="https://github.com/hartwigmedical/testdata/raw/master/100k_reads_hiseq/TESTX/TESTX_H7YRLADXX_S1_L001_R1_001.fastq.gz"
|
||||
fastq2="https://github.com/hartwigmedical/testdata/raw/master/100k_reads_hiseq/TESTX/TESTX_H7YRLADXX_S1_L001_R2_001.fastq.gz"
|
||||
TMPDIR=$(mktemp -d "$meta_temp_dir/$meta_functionality_name-XXXXXX")
|
||||
function clean_up {
|
||||
[[ -d "$TMPDIR" ]] && rm -r "$TMPDIR"
|
||||
}
|
||||
trap clean_up EXIT
|
||||
|
||||
test_data_dir="$TMPDIR/test_data"
|
||||
|
||||
mkdir "$test_data_dir"
|
||||
wget -q "$fastq1" -O "$test_data_dir/R1.fastq.gz"
|
||||
wget -q "$fastq2" -O "$test_data_dir/R2.fastq.gz"
|
||||
|
||||
echo ">> Run falco on test data, output to dir"
|
||||
echo ">>> Run falco"
|
||||
$meta_executable \
|
||||
--input "$test_data_dir/R1.fastq.gz;$test_data_dir/R2.fastq.gz" \
|
||||
--outdir "$TMPDIR/output1"
|
||||
|
||||
echo ">>> Checking whether output exists"
|
||||
[ ! -d "$TMPDIR/output1" ] && echo "Output directory not created" && exit 1
|
||||
[ ! -f "$TMPDIR/output1/R1.fastq.gz_fastqc_report.html" ] && echo "Report not created" && exit 1
|
||||
[ ! -f "$TMPDIR/output1/R1.fastq.gz_summary.txt" ] && echo "Summary not created" && exit 1
|
||||
[ ! -f "$TMPDIR/output1/R1.fastq.gz_fastqc_data.txt" ] && echo "fastqc_data not created" && exit 1
|
||||
[ ! -f "$TMPDIR/output1/R2.fastq.gz_fastqc_report.html" ] && echo "Report not created" && exit 1
|
||||
[ ! -f "$TMPDIR/output1/R2.fastq.gz_summary.txt" ] && echo "Summary not created" && exit 1
|
||||
[ ! -f "$TMPDIR/output1/R2.fastq.gz_fastqc_data.txt" ] && echo "fastqc_data not created" && exit 1
|
||||
|
||||
echo ">>> cleanup"
|
||||
rm -rf "$TMPDIR/output1"
|
||||
|
||||
echo ">> Run falco on test data, output to individual files"
|
||||
echo ">>> Please note this is only possible for 1 input fastq file!"
|
||||
echo ">>> Run falco"
|
||||
$meta_executable \
|
||||
--input "$test_data_dir/R1.fastq.gz" \
|
||||
--data_filename "$TMPDIR/output2/data.txt" \
|
||||
--report_filename "$TMPDIR/output2/report.html" \
|
||||
--summary_filename "$TMPDIR/output2/summary.txt" \
|
||||
--outdir "$TMPDIR/output2/"
|
||||
|
||||
echo ">>> Checking whether output exists"
|
||||
[ ! -d "$TMPDIR/output2" ] && echo "Output directory not created" && exit 1
|
||||
[ ! -f "$TMPDIR/output2/report.html" ] && echo "Report not created" && exit 1
|
||||
[ ! -f "$TMPDIR/output2/summary.txt" ] && echo "Summary not created" && exit 1
|
||||
[ ! -f "$TMPDIR/output2/data.txt" ] && echo "fastqc_data not created" && exit 1
|
||||
|
||||
echo ">>> cleanup"
|
||||
rm -rf "$TMPDIR/output2/"
|
||||
|
||||
echo ">> Run falco on test data, subsample"
|
||||
echo ">>> Run falco"
|
||||
$meta_executable \
|
||||
--input "$test_data_dir/R1.fastq.gz" \
|
||||
--data_filename "$TMPDIR/output3/data.txt" \
|
||||
--report_filename "$TMPDIR/output3/report.html" \
|
||||
--summary_filename "$TMPDIR/output3/summary.txt" \
|
||||
--subsample 100 \
|
||||
--outdir "$TMPDIR/output3"
|
||||
|
||||
echo ">>> Checking whether output exists"
|
||||
[ ! -d "$TMPDIR/output3" ] && echo "Output directory not created" && exit 1
|
||||
[ ! -f "$TMPDIR/output3/report.html" ] && echo "Report not created" && exit 1
|
||||
[ ! -f "$TMPDIR/output3/summary.txt" ] && echo "Summary not created" && exit 1
|
||||
[ ! -f "$TMPDIR/output3/data.txt" ] && echo "fastqc_data not created" && exit 1
|
||||
|
||||
echo ">>> cleanup"
|
||||
rm -rf "$TMPDIR/output3/"
|
||||
|
||||
echo "All tests succeeded!"
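# Illustration only, not part of the test: the repeated
# `[ ! -f ... ] && echo ... && exit 1` checks above could be factored into small
# helpers. The names assert_file_exists and assert_dir_exists are hypothetical.

```bash
#!/bin/bash
# Return non-zero and print a message on stderr when the path is missing.
assert_file_exists() {
  [ -f "$1" ] || { echo "File '$1' does not exist" >&2; return 1; }
}
assert_dir_exists() {
  [ -d "$1" ] || { echo "Directory '$1' does not exist" >&2; return 1; }
}

# Usage on a throwaway directory.
demo_dir="$(mktemp -d)"
touch "$demo_dir/report.html"
assert_dir_exists "$demo_dir" && assert_file_exists "$demo_dir/report.html" && echo "ok"
rm -r "$demo_dir"
```

# Because the helpers return a non-zero status, a script running under `set -e`
# stops at the first failed assertion without an explicit `exit 1`.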
|
||||
579
src/fastp/config.vsh.yaml
Normal file
@@ -0,0 +1,579 @@
|
||||
name: fastp
|
||||
description: |
|
||||
An ultra-fast all-in-one FASTQ preprocessor (QC/adapters/trimming/filtering/splitting/merging...).
|
||||
|
||||
Features:
|
||||
|
||||
- comprehensive quality profiling of the data both before and after filtering (quality curves, base contents, KMER, Q20/Q30, GC Ratio, duplication, adapter contents...)
|
||||
- filter out bad reads (too low quality, too short, or too many N...)
|
||||
- cut low quality bases from each read at its 5' and 3' ends by evaluating the mean quality in a sliding window (like Trimmomatic but faster).
|
||||
- trim all reads in front and tail
|
||||
- cut adapters. Adapter sequences can be automatically detected, which means you don't have to input the adapter sequences to trim them.
|
||||
- correct mismatched base pairs in overlapped regions of paired-end reads, if one base has high quality while the other has very low quality
|
||||
- trim polyG in 3' ends, which is commonly seen in NovaSeq/NextSeq data. Trim polyX in 3' ends to remove unwanted polyX tailing (e.g. polyA tailing for mRNA-Seq data)
|
||||
- preprocess unique molecular identifier (UMI) enabled data, shift UMI to sequence name.
|
||||
- report JSON format result for further interpreting.
|
||||
- visualize quality control and filtering results on a single HTML page (like FASTQC but faster and more informative).
|
||||
- split the output into multiple files (0001.R1.gz, 0002.R1.gz...) to support parallel processing. Two modes can be used: limiting the total number of split files, or limiting the number of lines in each split file.
|
||||
- support long reads (data from PacBio / Nanopore devices).
|
||||
- support reading from STDIN and writing to STDOUT
|
||||
- support interleaved input
|
||||
- support ultra-fast FASTQ-level deduplication
|
||||
keywords: [RNA-Seq, Trimming, Quality control]
|
||||
links:
|
||||
repository: https://github.com/OpenGene/fastp
|
||||
documentation: https://github.com/OpenGene/fastp/blob/master/README.md
|
||||
references:
|
||||
doi: "10.1093/bioinformatics/bty560"
|
||||
license: MIT
|
||||
authors:
|
||||
- __merge__: /src/_authors/robrecht_cannoodt.yaml
|
||||
roles: [ author, maintainer ]
|
||||
argument_groups:
|
||||
- name: Inputs
|
||||
description: |
|
||||
`fastp` supports both single-end (SE) and paired-end (PE) input.
|
||||
|
||||
- for SE data, you only have to specify read1 input by `-i` or `--in1`.
|
||||
- for PE data, you should also specify read2 input by `-I` or `--in2`.
|
||||
arguments:
|
||||
- name: --in1
|
||||
alternatives: [-i]
|
||||
type: file
|
||||
description: Input FastQ file. Must be single-end or paired-end R1. Can be gzipped.
|
||||
required: true
|
||||
example: in.R1.fq.gz
|
||||
- name: --in2
|
||||
alternatives: [-I]
|
||||
type: file
|
||||
description: Input FastQ file. Must be paired-end R2. Can be gzipped.
|
||||
required: false
|
||||
example: in.R2.fq.gz
|
||||
- name: Outputs
|
||||
description: |
|
||||
|
||||
- for SE data, you only have to specify read1 output by `-o` or `--out1`.
|
||||
- for PE data, you should also specify read2 output by `-O` or `--out2`.
|
||||
- if you don't specify the output file names, no output files will be written, but the QC will still be done for both data before and after filtering.
|
||||
- the output will be gzip-compressed if its file name ends with `.gz`
|
||||
arguments:
|
||||
- name: --out1
|
||||
alternatives: [-o]
|
||||
type: file
|
||||
description: The single-end or paired-end R1 reads that pass QC. Will be gzipped if its file name ends with `.gz`.
|
||||
required: true
|
||||
example: out.R1.fq.gz
|
||||
direction: output
|
||||
- name: --out2
|
||||
alternatives: [-O]
|
||||
type: file
|
||||
description: The paired-end R2 reads that pass QC. Will be gzipped if its file name ends with `.gz`.
|
||||
required: false
|
||||
example: out.R2.fq.gz
|
||||
direction: output
|
||||
- name: --unpaired1
|
||||
type: file
|
||||
description: Store the reads for which `read1` passes the filters but its paired `read2` doesn't.
|
||||
required: false
|
||||
example: unpaired.R1.fq.gz
|
||||
direction: output
|
||||
- name: --unpaired2
|
||||
type: file
|
||||
description: Store the reads for which `read2` passes the filters but its paired `read1` doesn't.
|
||||
required: false
|
||||
example: unpaired.R2.fq.gz
|
||||
direction: output
|
||||
- name: --failed_out
|
||||
type: file
|
||||
description: |
|
||||
Store the reads that fail filters.
|
||||
|
||||
If one read failed and is written to --failed_out, its failure reason will be appended to its read name. For example, failed_quality_filter, failed_too_short etc.
|
||||
For PE data, if unpaired reads are not stored (by giving --unpaired1 or --unpaired2), the failed pair of reads will be put together. If one read passes the filters but its pair doesn't, the failure reason will be paired_read_is_failing.
|
||||
required: false
|
||||
example: failed.fq.gz
|
||||
direction: output
|
||||
- name: --overlapped_out
|
||||
type: file
|
||||
description: |
|
||||
For each read pair, output the overlapped region if it has no mismatched bases.
|
||||
direction: output
|
||||
- name: Report output arguments
|
||||
arguments:
|
||||
- name: --json
|
||||
alternatives: [-j]
|
||||
type: file
|
||||
description: |
|
||||
The json format report file name
|
||||
example: out.json
|
||||
direction: output
|
||||
- name: --html
|
||||
type: file
|
||||
description: |
|
||||
The html format report file name
|
||||
example: out.html
|
||||
direction: output
|
||||
- name: --report_title
|
||||
type: string
|
||||
description: |
|
||||
The title of the html report, default is "fastp report".
|
||||
example: fastp report
|
||||
- name: Adapter trimming
|
||||
description: |
|
||||
Adapter trimming is enabled by default, but you can disable it by `-A` or `--disable_adapter_trimming`. Adapter sequences can be automatically detected for both PE/SE data.
|
||||
|
||||
- For SE data, the adapters are evaluated by analyzing the tails of the first ~1M reads. This evaluation may be inaccurate, so you can specify the adapter sequence with the `-a` or `--adapter_sequence` option. If an adapter sequence is specified, auto-detection for SE data is disabled.
|
||||
- For PE data, the adapters can be detected by per-read overlap analysis, which looks for the overlap of each pair of reads. This method is robust and fast, so normally you don't have to input the adapter sequence even if you know it. But you can still specify the adapter sequences for read1 by `--adapter_sequence`, and for read2 by `--adapter_sequence_r2`. If `fastp` fails to find an overlap (e.g. due to low quality bases), it will use these sequences to trim adapters for read1 and read2 respectively.
|
||||
- For PE data, the adapter sequence auto-detection is disabled by default since the adapters can be trimmed by overlap analysis. However, you can specify `--detect_adapter_for_pe` to enable it.
|
||||
- For PE data, `fastp` will run a little slower if you specify the sequence adapters or enable adapter auto-detection, but usually result in a slightly cleaner output, since the overlap analysis may fail due to sequencing errors or adapter dimers.
|
||||
- The most widely used adapters are the Illumina TruSeq adapters. If your data is from a TruSeq library, you can add `--adapter_sequence=AGATCGGAAGAGCACACGTCTGAACTCCAGTCA --adapter_sequence_r2=AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT` to your command line, or enable auto-detection for PE data by specifying `--detect_adapter_for_pe`.
|
||||
- `fastp` contains some built-in known adapter sequences for better auto-detection. If you want to make some adapters to be a part of the built-in adapters, please file an issue.
|
||||
|
||||
You can also specify --adapter_fasta to give a FASTA file to tell fastp to trim multiple adapters in this FASTA file. Here is a sample of such adapter FASTA file:
|
||||
|
||||
```
|
||||
>Illumina TruSeq Adapter Read 1
|
||||
AGATCGGAAGAGCACACGTCTGAACTCCAGTCA
|
||||
>Illumina TruSeq Adapter Read 2
|
||||
AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT
|
||||
>polyA
|
||||
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
|
||||
```
|
||||
|
||||
Each adapter sequence in this file should be at least 6bp long, otherwise it will be skipped. You can list anything you want to trim, not just regular sequencing adapters (e.g. polyA).
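
As a rough illustration of the 6bp rule, the sketch below pre-filters an adapter FASTA so that only sufficiently long entries remain. It is not part of `fastp`; the helper name `filter_short_adapters` is hypothetical, and it assumes one sequence line per record, as in the sample above.

```bash
#!/bin/bash
# Keep only FASTA records whose (single-line) sequence is at least 6 bp long.
filter_short_adapters() {
  awk '
    /^>/ { header = $0; next }                   # remember the record header
    length($0) >= 6 { print header; print $0 }   # emit records that are long enough
  '
}

# Usage: the 4 bp "tiny" entry is dropped, the TruSeq adapter is kept.
printf '>TruSeq_R1\nAGATCGGAAGAGCACACGTCTGAACTCCAGTCA\n>tiny\nACGT\n' | filter_short_adapters
```

Running such a check beforehand only makes the skipped entries visible; `fastp` applies its own length check regardless.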
|
||||
|
||||
`fastp` first trims the auto-detected adapter or the adapter sequences given by `--adapter_sequence | --adapter_sequence_r2`, then trims the adapters given by `--adapter_fasta` one by one.
|
||||
|
||||
The sequence distribution of trimmed adapters can be found at the HTML/JSON reports.
|
||||
arguments:
|
||||
- name: --disable_adapter_trimming
|
||||
alternatives: [-A]
|
||||
type: boolean_true
|
||||
description: |
|
||||
Disable adapter trimming.
|
||||
- name: --detect_adapter_for_pe
|
||||
type: boolean_true
|
||||
description: |
|
||||
By default, the auto-detection for adapter is for SE data input only, turn on this option to enable it for PE data.
|
||||
- name: --adapter_sequence
|
||||
alternatives: [-a]
|
||||
type: string
|
||||
description: |
|
||||
The adapter sequence to be trimmed. For SE data, if not specified, the adapter will be auto-detected. For PE data, this is used if R1/R2 are found not to overlap.
|
||||
- name: --adapter_sequence_r2
|
||||
type: string
|
||||
description: |
|
||||
The adapter sequence to be trimmed for R2. This is used for PE data if R1/R2 are found not to overlap.
|
||||
- name: --adapter_fasta
|
||||
type: file
|
||||
description: |
|
||||
A FASTA file containing all the adapter sequences to be trimmed. For SE data, if not specified, the adapters will be auto-detected. For PE data, this is used if R1/R2 are found not overlapped.
|
||||
- name: Base trimming
|
||||
arguments:
|
||||
- name: --trim_front1
|
||||
alternatives: [-f]
|
||||
type: integer
|
||||
description: |
|
||||
Trimming how many bases in front for read1, default is 0.
|
||||
example: 0
|
||||
- name: --trim_tail1
|
||||
alternatives: [-t]
|
||||
type: integer
|
||||
description: |
|
||||
Trimming how many bases in tail for read1, default is 0.
|
||||
example: 0
|
||||
- name: --max_len1
|
||||
alternatives: [-b]
|
||||
type: integer
|
||||
min: 0
|
||||
description: |
|
||||
If read1 is longer than max_len1, then trim read1 at its tail to make it as long as max_len1. Default 0 means no limitation.
|
||||
- name: --trim_front2
|
||||
alternatives: [-F]
|
||||
type: integer
|
||||
description: |
|
||||
Trimming how many bases in front for read2, default is 0.
|
||||
example: 0
|
||||
- name: --trim_tail2
|
||||
alternatives: [-T]
|
||||
type: integer
|
||||
description: |
|
||||
Trimming how many bases in tail for read2, default is 0.
|
||||
example: 0
|
||||
- name: --max_len2
|
||||
alternatives: [-B]
|
||||
type: integer
|
||||
min: 0
|
||||
description: |
|
||||
If read2 is longer than max_len2, then trim read2 at its tail to make it as long as max_len2. Default 0 means no limitation.
|
||||
- name: Merging mode
|
||||
description: Allows merging paired-end reads into a single longer read if they are overlapping.
|
||||
arguments:
|
||||
- name: --merge
|
||||
alternatives: [-m]
|
||||
type: boolean_true
|
||||
description: |
|
||||
For paired-end input, merge each pair of reads into a single read if they are overlapped. The merged reads will be written to the file given by --merged_out, the unmerged reads will be written to the files specified by --out1 and --out2. The merging mode is disabled by default.
|
||||
- name: --merged_out
|
||||
type: file
|
||||
description: |
|
||||
In the merging mode, specify the file name to store merged output, or specify --stdout to stream the merged output.
|
||||
direction: output
|
||||
example: merged.fq.gz
|
||||
- name: --include_unmerged
|
||||
type: boolean_true
|
||||
description: |
|
||||
In the merging mode, write the unmerged or unpaired reads to the file specified by --merged_out. Disabled by default.
|
||||
- name: Additional input arguments
|
||||
description: Affects how the input is read.
|
||||
arguments:
|
||||
- name: --interleaved_in
|
||||
type: boolean_true
|
||||
description: |
|
||||
Indicate that <in1> is an interleaved FASTQ which contains both read1 and read2. Disabled by default.
|
||||
- name: --fix_mgi_id
|
||||
type: boolean_true
|
||||
description: |
|
||||
The MGI FASTQ ID format is not compatible with many BAM operation tools, enable this option to fix it.
|
||||
- name: --phred64
|
||||
alternatives: ["-6"]
|
||||
type: boolean_true
|
||||
description: |
|
||||
Indicate the input is using phred64 scoring (it'll be converted to phred33, so the output will still be phred33)
|
||||
- name: Additional output arguments
|
||||
description: Affects how the output is written.
|
||||
arguments:
|
||||
- name: --compression
|
||||
alternatives: ["-z"]
|
||||
type: integer
|
||||
description: |
|
||||
Compression level for gzip output (1 ~ 9). 1 is fastest, 9 is smallest, default is 4.
|
||||
example: 4
|
||||
min: 1
|
||||
max: 9
|
||||
- name: --dont_overwrite
|
||||
type: boolean_true
|
||||
description: |
|
||||
Don't overwrite existing files. Overwriting is allowed by default.
|
||||
- name: Logging arguments
|
||||
arguments:
|
||||
- name: --verbose
|
||||
alternatives: [-V]
|
||||
type: boolean_true
|
||||
description: Output verbose log information (e.g. each time 1M reads have been processed).
|
||||
- name: Processing arguments
|
||||
arguments:
|
||||
- name: --reads_to_process
|
||||
type: long
|
||||
description: |
|
||||
Specify how many reads/pairs to be processed. Default 0 means process all reads.
|
||||
example: 1000000
|
||||
min: 0
|
||||
- name: Deduplication arguments
|
||||
arguments:
|
||||
- name: --dedup
|
||||
type: boolean_true
|
||||
description: |
|
||||
Enable deduplication to drop the duplicated reads/pairs
|
||||
- name: --dup_calc_accuracy
|
||||
type: integer
|
||||
description: |
|
||||
Accuracy level to calculate duplication (1~6). Higher level uses more memory (1G, 2G, 4G, 8G, 16G, 24G). Default 1 for no-dedup mode, and 3 for dedup mode.
|
||||
example: 3
|
||||
min: 1
|
||||
max: 6
|
||||
- name: --dont_eval_duplication
|
||||
type: boolean_true
|
||||
description: |
|
||||
Don't evaluate duplication rate to save time and use less memory.
|
||||
- name: PolyG tail trimming arguments
|
||||
arguments:
|
||||
- name: --trim_poly_g
|
||||
alternatives: [-g]
|
||||
type: boolean_true
|
||||
description: |
|
||||
Force polyG tail trimming, by default trimming is automatically enabled for Illumina NextSeq/NovaSeq data
|
||||
- name: --poly_g_min_len
|
||||
type: integer
|
||||
description: |
|
||||
The minimum length to detect polyG in the read tail. 10 by default.
|
||||
example: 10
|
||||
min: 1
|
||||
- name: --disable_trim_poly_g
|
||||
alternatives: [-G]
|
||||
type: boolean_true
|
||||
description: |
|
||||
Disable polyG tail trimming, by default trimming is automatically enabled for Illumina NextSeq/NovaSeq data
|
||||
- name: PolyX tail trimming arguments
|
||||
arguments:
|
||||
- name: --trim_poly_x
|
||||
alternatives: [-x]
|
||||
type: boolean_true
|
||||
description: |
|
||||
Enable polyX trimming in 3' ends.
|
||||
- name: --poly_x_min_len
|
||||
type: integer
|
||||
description: |
|
||||
The minimum length to detect polyX in the read tail. 10 by default.
|
||||
example: 10
|
||||
min: 1
|
||||
- name: Cut arguments
|
||||
arguments:
|
||||
- name: --cut_front
|
||||
alternatives: ["-5"]
|
||||
type: integer
|
||||
description: |
|
||||
Move a sliding window from front (5') to tail, drop the bases in the window if its mean quality < threshold, stop otherwise.
|
||||
- name: --cut_tail
|
||||
alternatives: ["-3"]
|
||||
type: integer
|
||||
description: |
|
||||
Move a sliding window from tail (3') to front, drop the bases in the window if its mean quality < threshold, stop otherwise.
|
||||
- name: --cut_right
|
||||
alternatives: ["-r"]
|
||||
type: integer
|
||||
description: |
|
||||
Move a sliding window from front to tail. If a window with mean quality < threshold is found, drop the bases in that window and everything to its right, then stop.
|
||||
- name: --cut_window_size
|
||||
alternatives: ["-W"]
|
||||
type: integer
|
||||
description: |
|
||||
The window size option shared by cut_front, cut_tail and cut_right. Range: 1~1000, default: 4.
|
||||
example: 4
|
||||
min: 1
|
||||
- name: --cut_mean_quality
|
||||
alternatives: ["-M"]
|
||||
type: integer
|
||||
description: |
|
||||
The mean quality requirement option shared by cut_front, cut_tail and cut_right. Range: 1~36, default: 20 (Q20).
|
||||
example: 20
|
||||
min: 0
|
||||
- name: --cut_front_window_size
|
||||
type: integer
|
||||
description: |
|
||||
The window size option of cut_front, default to cut_window_size if not specified.
|
||||
example: 4
|
||||
min: 1
|
||||
- name: --cut_front_mean_quality
|
||||
type: integer
|
||||
description: |
|
||||
The mean quality requirement option of cut_front, default to cut_mean_quality if not specified.
|
||||
example: 20
|
||||
min: 0
|
||||
- name: --cut_tail_window_size
|
||||
type: integer
|
||||
description: |
|
||||
The window size option of cut_tail, default to cut_window_size if not specified.
|
||||
example: 4
|
||||
min: 1
|
||||
- name: --cut_tail_mean_quality
|
||||
type: integer
|
||||
description: |
|
||||
The mean quality requirement option of cut_tail, default to cut_mean_quality if not specified.
|
||||
example: 20
|
||||
min: 0
|
||||
- name: --cut_right_window_size
|
||||
type: integer
|
||||
description: |
|
||||
The window size option of cut_right, default to cut_window_size if not specified.
|
||||
example: 4
|
||||
min: 1
|
||||
- name: --cut_right_mean_quality
|
||||
type: integer
|
||||
description: |
|
||||
The mean quality requirement option of cut_right, default to cut_mean_quality if not specified.
|
||||
example: 20
|
||||
min: 0
|
||||
- name: Quality filtering arguments
|
||||
arguments:
|
||||
- name: --disable_quality_filtering
|
||||
alternatives: [-Q]
|
||||
type: boolean_true
|
||||
description: |
|
||||
Quality filtering is enabled by default. If this option is specified, quality filtering is disabled.
|
||||
- name: --qualified_quality_phred
|
||||
alternatives: [-q]
|
||||
type: integer
|
||||
description: |
|
||||
The quality value at which a base counts as qualified. Default 15 means phred quality >=Q15 is qualified.
|
||||
example: 15
|
||||
min: 0
|
||||
- name: --unqualified_percent_limit
|
||||
alternatives: [-u]
|
||||
type: integer
|
||||
description: |
|
||||
Percentage of bases allowed to be unqualified (0~100). Default 40 means 40%.
|
||||
example: 40
|
||||
min: 0
|
||||
max: 100
|
||||
- name: --n_base_limit
|
||||
alternatives: [-n]
|
||||
type: integer
|
||||
description: |
|
||||
If the number of N bases in a read is >n_base_limit, then this read/pair is discarded. Default is 5.
|
||||
example: 5
|
||||
min: 0
|
||||
- name: --average_qual
|
||||
alternatives: [-e]
|
||||
type: integer
|
||||
description: |
|
||||
If a read's average quality score is <average_qual, then this read/pair is discarded. Default 0 means no requirement.
|
||||
example: 0
|
||||
min: 0
|
||||
- name: Length filtering arguments
|
||||
arguments:
|
||||
- name: --disable_length_filtering
|
||||
alternatives: [-L]
|
||||
type: boolean_true
|
||||
description: |
|
||||
Length filtering is enabled by default. If this option is specified, length filtering is disabled.
|
||||
- name: --length_required
|
||||
alternatives: [-l]
|
||||
type: integer
|
||||
description: |
|
||||
Reads shorter than length_required will be discarded, default is 15.
|
||||
example: 15
|
||||
min: 0
|
||||
- name: --length_limit
|
||||
type: integer
|
||||
description: |
|
||||
Reads longer than length_limit will be discarded, default 0 means no limitation.
|
||||
example: 0
|
||||
min: 0
|
||||
- name: Low complexity filtering arguments
|
||||
arguments:
|
||||
- name: --low_complexity_filter
|
||||
alternatives: [-y]
|
||||
type: boolean_true
|
||||
description: |
|
||||
Enable low complexity filter. The complexity is defined as the percentage of base that is different from its next base (base[i] != base[i+1]).
|
||||
- name: --complexity_threshold
|
||||
alternatives: [-Y]
|
||||
type: integer
|
||||
description: |
|
||||
The threshold for low complexity filter (0~100). Default is 30, which means 30% complexity is required.
|
||||
example: 30
|
||||
min: 0
|
||||
- name: Index filtering arguments
|
||||
arguments:
|
||||
- name: --filter_by_index1
|
||||
type: file
|
||||
description: |
|
||||
Specify a file containing a list of barcodes of index1 to be filtered out, one barcode per line.
|
||||
- name: --filter_by_index2
|
||||
type: file
|
||||
description: |
|
||||
Specify a file containing a list of barcodes of index2 to be filtered out, one barcode per line.
|
||||
- name: --filter_by_index_threshold
|
||||
type: integer
|
||||
description: |
|
||||
The allowed difference of index barcode for index filtering, default 0 means completely identical.
|
||||
example: 0
|
||||
min: 0
|
||||
- name: Overlapped region correction
|
||||
arguments:
|
||||
- type: boolean_true
|
||||
name: --correction
|
||||
alternatives: [-c]
|
||||
description: |
|
||||
Enable base correction in overlapped regions (only for PE data), default is disabled.
|
||||
- name: --overlap_len_require
|
||||
type: integer
|
||||
description: |
|
||||
The minimum length to detect overlapped region of PE reads. This will affect overlap analysis based PE merge, adapter trimming and correction. 30 by default.
|
||||
example: 30
|
||||
min: 0
|
||||
- name: --overlap_diff_limit
|
||||
type: integer
|
||||
description: |
|
||||
The maximum number of mismatched bases to detect overlapped region of PE reads. This will affect overlap analysis based PE merge, adapter trimming and correction. 5 by default.
|
||||
example: 5
|
||||
min: 0
|
||||
- name: --overlap_diff_percent_limit
|
||||
type: integer
|
||||
description: |
|
||||
The maximum percentage of mismatched bases to detect overlapped region of PE reads. This will affect overlap analysis based PE merge, adapter trimming and correction. Default 20 means 20%.
|
||||
example: 20
|
||||
min: 0
|
||||
max: 100
|
||||
- name: UMI arguments
|
||||
arguments:
|
||||
- name: --umi
|
||||
alternatives: [-U]
|
||||
type: boolean_true
|
||||
description: |
|
||||
Enable unique molecular identifier (UMI) preprocessing.
|
||||
- name: --umi_loc
|
||||
type: string
|
||||
description: |
|
||||
Specify the location of the UMI; can be one of index1/index2/read1/read2/per_index/per_read. Default is none.
|
||||
choices: [index1, index2, read1, read2, per_index, per_read]
|
||||
- name: --umi_len
|
||||
type: integer
|
||||
description: |
|
||||
If the UMI is in read1/read2, its length should be provided.
|
||||
example: 0
|
||||
min: 0
|
||||
- name: --umi_prefix
|
||||
type: string
|
||||
description: |
|
||||
If specified, an underscore will be used to connect the prefix and the UMI (e.g. prefix=UMI, UMI=AATTCG, final=UMI_AATTCG). No prefix by default.
|
||||
- name: --umi_skip
|
||||
type: integer
|
||||
description: |
|
||||
If the UMI is in read1/read2, fastp can skip several bases following UMI, default is 0.
|
||||
example: 0
|
||||
min: 0
|
||||
- name: --umi_delim
|
||||
type: string
|
||||
description: |
|
||||
Delimiter to use between the read name and the UMI, default is `:`.
|
||||
- name: Overrepresentation analysis arguments
|
||||
arguments:
|
||||
- name: --overrepresentation_analysis
|
||||
alternatives: [-p]
|
||||
type: boolean_true
|
||||
description: |
|
||||
Enable overrepresentation analysis.
|
||||
- name: --overrepresentation_sampling
|
||||
type: integer
|
||||
description: |
|
||||
One in (--overrepresentation_sampling) reads will be computed for overrepresentation analysis (1~10000), smaller is slower, default is 20.
|
||||
example: 20
|
||||
min: 1
|
||||
# # would need to set all outputs to multiple: true
|
||||
# - name: Split arguments
|
||||
# arguments:
|
||||
# - name: --split
|
||||
# alternatives: [-s]
|
||||
# type: boolean_true
|
||||
# description: |
|
||||
# Split output by limiting total split file number with this option (2~999), a sequential number prefix will be added to output name ( 0001.out.fq, 0002.out.fq...), disabled by default.
|
||||
# - name: --split_by_lines
|
||||
# alternatives: [-S]
|
||||
# type: long
|
||||
# description: |
|
||||
# Split output by limiting lines of each file with this option(>=1000), a sequential number prefix will be added to output name ( 0001.out.fq, 0002.out.fq...), disabled by default.
|
||||
# - name: --split_prefix_digits
|
||||
# type: integer
|
||||
# description: |
|
||||
# The digits for the sequential number padding (1~10), default is 4, so the filename will be padded as 0001.xxx, 0 to disable padding.
|
||||
# example: 4
|
||||
resources:
|
||||
- type: bash_script
|
||||
path: script.sh
|
||||
test_resources:
|
||||
- type: bash_script
|
||||
path: test.sh
|
||||
- type: file
|
||||
path: test_data
|
||||
engines:
|
||||
- type: docker
|
||||
image: quay.io/biocontainers/fastp:0.23.4--hadf994f_2
|
||||
setup:
|
||||
- type: docker
|
||||
run: |
|
||||
fastp --version 2>&1 | sed 's# #: "#;s#$#"#' > /var/software_versions.txt
|
||||
runners:
|
||||
- type: executable
|
||||
- type: nextflow
|
||||
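The `--low_complexity_filter` description in the config above defines complexity as the percentage of bases that differ from the next base. A minimal bash sketch of that definition follows. `sequence_complexity` is a hypothetical helper written for illustration, not fastp's implementation, and it assumes the denominator is the number of adjacent base pairs (length - 1) with the result rounded down to an integer.

```bash
#!/bin/bash

# Hypothetical helper (not part of fastp or this component): prints the
# integer percentage of bases that differ from the base that follows them.
sequence_complexity() {
  local seq="$1"
  local len=${#seq}
  local diff=0
  local i

  # A read with fewer than two bases has no adjacent pairs to compare.
  if (( len < 2 )); then
    echo 0
    return 0
  fi

  for (( i = 0; i < len - 1; i++ )); do
    if [[ "${seq:i:1}" != "${seq:i+1:1}" ]]; then
      diff=$(( diff + 1 ))
    fi
  done

  # Assumption: denominator is the number of adjacent pairs (len - 1).
  echo $(( diff * 100 / (len - 1) ))
}
```

Under these assumptions, a homopolymer such as `AAAA` scores 0 and would fall below the default `--complexity_threshold` of 30, while `ACGT` scores 100.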
93
src/fastp/help.txt
Normal file
@@ -0,0 +1,93 @@
|
||||
```bash
|
||||
fastp --help
|
||||
```
|
||||
|
||||
usage: fastp [options] ...
|
||||
options:
|
||||
-i, --in1 read1 input file name (string [=])
|
||||
-o, --out1 read1 output file name (string [=])
|
||||
-I, --in2 read2 input file name (string [=])
|
||||
-O, --out2 read2 output file name (string [=])
|
||||
--unpaired1 for PE input, if read1 passed QC but read2 not, it will be written to unpaired1. Default is to discard it. (string [=])
|
||||
--unpaired2 for PE input, if read2 passed QC but read1 not, it will be written to unpaired2. If --unpaired2 is same as --unpaired1 (default mode), both unpaired reads will be written to this same file. (string [=])
|
||||
--overlapped_out for each read pair, output the overlapped region if it has no any mismatched base. (string [=])
|
||||
--failed_out specify the file to store reads that cannot pass the filters. (string [=])
|
||||
-m, --merge for paired-end input, merge each pair of reads into a single read if they are overlapped. The merged reads will be written to the file given by --merged_out, the unmerged reads will be written to the files specified by --out1 and --out2. The merging mode is disabled by default.
|
||||
--merged_out in the merging mode, specify the file name to store merged output, or specify --stdout to stream the merged output (string [=])
|
||||
--include_unmerged in the merging mode, write the unmerged or unpaired reads to the file specified by --merge. Disabled by default.
|
||||
-6, --phred64 indicate the input is using phred64 scoring (it'll be converted to phred33, so the output will still be phred33)
|
||||
-z, --compression compression level for gzip output (1 ~ 9). 1 is fastest, 9 is smallest, default is 4. (int [=4])
|
||||
--stdin input from STDIN. If the STDIN is interleaved paired-end FASTQ, please also add --interleaved_in.
|
||||
--stdout stream passing-filters reads to STDOUT. This option will result in interleaved FASTQ output for paired-end output. Disabled by default.
|
||||
--interleaved_in indicate that <in1> is an interleaved FASTQ which contains both read1 and read2. Disabled by default.
|
||||
--reads_to_process specify how many reads/pairs to be processed. Default 0 means process all reads. (int [=0])
|
||||
--dont_overwrite don't overwrite existing files. Overwritting is allowed by default.
|
||||
--fix_mgi_id the MGI FASTQ ID format is not compatible with many BAM operation tools, enable this option to fix it.
|
||||
-V, --verbose output verbose log information (i.e. when every 1M reads are processed).
|
||||
-A, --disable_adapter_trimming adapter trimming is enabled by default. If this option is specified, adapter trimming is disabled
|
||||
-a, --adapter_sequence the adapter for read1. For SE data, if not specified, the adapter will be auto-detected. For PE data, this is used if R1/R2 are found not overlapped. (string [=auto])
|
||||
--adapter_sequence_r2 the adapter for read2 (PE data only). This is used if R1/R2 are found not overlapped. If not specified, it will be the same as <adapter_sequence> (string [=auto])
|
||||
--adapter_fasta specify a FASTA file to trim both read1 and read2 (if PE) by all the sequences in this FASTA file (string [=])
|
||||
--detect_adapter_for_pe by default, the auto-detection for adapter is for SE data input only, turn on this option to enable it for PE data.
|
||||
-f, --trim_front1 trimming how many bases in front for read1, default is 0 (int [=0])
|
||||
-t, --trim_tail1 trimming how many bases in tail for read1, default is 0 (int [=0])
|
||||
-b, --max_len1 if read1 is longer than max_len1, then trim read1 at its tail to make it as long as max_len1. Default 0 means no limitation (int [=0])
|
||||
-F, --trim_front2 trimming how many bases in front for read2. If it's not specified, it will follow read1's settings (int [=0])
|
||||
-T, --trim_tail2 trimming how many bases in tail for read2. If it's not specified, it will follow read1's settings (int [=0])
|
||||
-B, --max_len2 if read2 is longer than max_len2, then trim read2 at its tail to make it as long as max_len2. Default 0 means no limitation. If it's not specified, it will follow read1's settings (int [=0])
|
||||
-D, --dedup enable deduplication to drop the duplicated reads/pairs
|
||||
--dup_calc_accuracy accuracy level to calculate duplication (1~6), higher level uses more memory (1G, 2G, 4G, 8G, 16G, 24G). Default 1 for no-dedup mode, and 3 for dedup mode. (int [=0])
|
||||
--dont_eval_duplication don't evaluate duplication rate to save time and use less memory.
|
||||
-g, --trim_poly_g force polyG tail trimming, by default trimming is automatically enabled for Illumina NextSeq/NovaSeq data
|
||||
--poly_g_min_len the minimum length to detect polyG in the read tail. 10 by default. (int [=10])
|
||||
-G, --disable_trim_poly_g disable polyG tail trimming, by default trimming is automatically enabled for Illumina NextSeq/NovaSeq data
|
||||
-x, --trim_poly_x enable polyX trimming in 3' ends.
|
||||
--poly_x_min_len the minimum length to detect polyX in the read tail. 10 by default. (int [=10])
|
||||
-5, --cut_front move a sliding window from front (5') to tail, drop the bases in the window if its mean quality < threshold, stop otherwise.
|
||||
-3, --cut_tail move a sliding window from tail (3') to front, drop the bases in the window if its mean quality < threshold, stop otherwise.
|
||||
-r, --cut_right move a sliding window from front to tail, if meet one window with mean quality < threshold, drop the bases in the window and the right part, and then stop.
|
||||
-W, --cut_window_size the window size option shared by cut_front, cut_tail or cut_sliding. Range: 1~1000, default: 4 (int [=4])
|
||||
-M, --cut_mean_quality the mean quality requirement option shared by cut_front, cut_tail or cut_sliding. Range: 1~36 default: 20 (Q20) (int [=20])
|
||||
--cut_front_window_size the window size option of cut_front, default to cut_window_size if not specified (int [=4])
|
||||
--cut_front_mean_quality the mean quality requirement option for cut_front, default to cut_mean_quality if not specified (int [=20])
|
||||
--cut_tail_window_size the window size option of cut_tail, default to cut_window_size if not specified (int [=4])
|
||||
--cut_tail_mean_quality the mean quality requirement option for cut_tail, default to cut_mean_quality if not specified (int [=20])
|
||||
--cut_right_window_size the window size option of cut_right, default to cut_window_size if not specified (int [=4])
|
||||
--cut_right_mean_quality the mean quality requirement option for cut_right, default to cut_mean_quality if not specified (int [=20])
|
||||
-Q, --disable_quality_filtering quality filtering is enabled by default. If this option is specified, quality filtering is disabled
|
||||
-q, --qualified_quality_phred the quality value that a base is qualified. Default 15 means phred quality >=Q15 is qualified. (int [=15])
|
||||
-u, --unqualified_percent_limit how many percents of bases are allowed to be unqualified (0~100). Default 40 means 40% (int [=40])
|
||||
-n, --n_base_limit if one read's number of N base is >n_base_limit, then this read/pair is discarded. Default is 5 (int [=5])
|
||||
-e, --average_qual if one read's average quality score <avg_qual, then this read/pair is discarded. Default 0 means no requirement (int [=0])
|
||||
-L, --disable_length_filtering length filtering is enabled by default. If this option is specified, length filtering is disabled
|
||||
-l, --length_required reads shorter than length_required will be discarded, default is 15. (int [=15])
|
||||
--length_limit reads longer than length_limit will be discarded, default 0 means no limitation. (int [=0])
|
||||
-y, --low_complexity_filter enable low complexity filter. The complexity is defined as the percentage of base that is different from its next base (base[i] != base[i+1]).
|
||||
-Y, --complexity_threshold the threshold for low complexity filter (0~100). Default is 30, which means 30% complexity is required. (int [=30])
|
||||
--filter_by_index1 specify a file contains a list of barcodes of index1 to be filtered out, one barcode per line (string [=])
|
||||
--filter_by_index2 specify a file contains a list of barcodes of index2 to be filtered out, one barcode per line (string [=])
|
||||
--filter_by_index_threshold the allowed difference of index barcode for index filtering, default 0 means completely identical. (int [=0])
|
||||
-c, --correction enable base correction in overlapped regions (only for PE data), default is disabled
|
||||
--overlap_len_require the minimum length to detect overlapped region of PE reads. This will affect overlap analysis based PE merge, adapter trimming and correction. 30 by default. (int [=30])
|
||||
--overlap_diff_limit the maximum number of mismatched bases to detect overlapped region of PE reads. This will affect overlap analysis based PE merge, adapter trimming and correction. 5 by default. (int [=5])
|
||||
--overlap_diff_percent_limit the maximum percentage of mismatched bases to detect overlapped region of PE reads. This will affect overlap analysis based PE merge, adapter trimming and correction. Default 20 means 20%. (int [=20])
|
||||
-U, --umi enable unique molecular identifier (UMI) preprocessing
|
||||
--umi_loc specify the location of UMI, can be (index1/index2/read1/read2/per_index/per_read, default is none (string [=])
|
||||
--umi_len if the UMI is in read1/read2, its length should be provided (int [=0])
|
||||
--umi_prefix if specified, an underline will be used to connect prefix and UMI (i.e. prefix=UMI, UMI=AATTCG, final=UMI_AATTCG). No prefix by default (string [=])
|
||||
--umi_skip if the UMI is in read1/read2, fastp can skip several bases following UMI, default is 0 (int [=0])
|
||||
--umi_delim delimiter to use between the read name and the UMI, default is : (string [=:])
|
||||
-p, --overrepresentation_analysis enable overrepresented sequence analysis.
|
||||
-P, --overrepresentation_sampling one in (--overrepresentation_sampling) reads will be computed for overrepresentation analysis (1~10000), smaller is slower, default is 20. (int [=20])
|
||||
-j, --json the json format report file name (string [=fastp.json])
|
||||
-h, --html the html format report file name (string [=fastp.html])
|
||||
-R, --report_title should be quoted with ' or ", default is "fastp report" (string [=fastp report])
|
||||
-w, --thread worker thread number, default is 3 (int [=3])
|
||||
-s, --split split output by limiting total split file number with this option (2~999), a sequential number prefix will be added to output name ( 0001.out.fq, 0002.out.fq...), disabled by default (int [=0])
|
||||
-S, --split_by_lines split output by limiting lines of each file with this option(>=1000), a sequential number prefix will be added to output name ( 0001.out.fq, 0002.out.fq...), disabled by default (long [=0])
|
||||
-d, --split_prefix_digits the digits for the sequential number padding (1~10), default is 4, so the filename will be padded as 0001.xxx, 0 to disable padding (int [=4])
|
||||
--cut_by_quality5 DEPRECATED, use --cut_front instead.
|
||||
--cut_by_quality3 DEPRECATED, use --cut_tail instead.
|
||||
--cut_by_quality_aggressive DEPRECATED, use --cut_right instead.
|
||||
--discard_unmerged DEPRECATED, no effect now, see the introduction for merging.
|
||||
-?, --help print this message
|
||||
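Several options in the help text above (`--cut_mean_quality`, `--average_qual`, `--qualified_quality_phred`) compare Phred quality scores against a threshold. As a rough illustration of what a mean quality means for a Phred+33 encoded quality string, here is a small bash sketch. `mean_phred33` is a hypothetical helper, not a fastp function, and it rounds the mean down to an integer for simplicity.

```bash
#!/bin/bash

# Hypothetical helper (not part of fastp): prints the mean Phred score of a
# Phred+33 encoded quality string, rounded down to an integer.
mean_phred33() {
  local qual="$1"
  local len=${#qual}
  local sum=0
  local i code

  # An empty quality string has no scores to average.
  if (( len == 0 )); then
    echo 0
    return 0
  fi

  for (( i = 0; i < len; i++ )); do
    # printf with a leading single quote yields the character's ASCII code.
    printf -v code '%d' "'${qual:i:1}"
    # Phred+33: the score is the ASCII code minus 33.
    sum=$(( sum + code - 33 ))
  done

  echo $(( sum / len ))
}
```

With this encoding, a quality string made only of `!` characters has a mean quality of 0, and `I` corresponds to Q40.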
105
src/fastp/script.sh
Normal file
@@ -0,0 +1,105 @@
|
||||
#!/bin/bash
|
||||
|
||||
## VIASH START
|
||||
## VIASH END
|
||||
|
||||
# Unset boolean flags that are "false" so the ${var:+--flag} expansions below omit them
[[ "$par_disable_adapter_trimming" == "false" ]] && unset par_disable_adapter_trimming
[[ "$par_detect_adapter_for_pe" == "false" ]] && unset par_detect_adapter_for_pe
[[ "$par_merge" == "false" ]] && unset par_merge
[[ "$par_include_unmerged" == "false" ]] && unset par_include_unmerged
[[ "$par_interleaved_in" == "false" ]] && unset par_interleaved_in
[[ "$par_fix_mgi_id" == "false" ]] && unset par_fix_mgi_id
[[ "$par_phred64" == "false" ]] && unset par_phred64
[[ "$par_dont_overwrite" == "false" ]] && unset par_dont_overwrite
[[ "$par_verbose" == "false" ]] && unset par_verbose
[[ "$par_dedup" == "false" ]] && unset par_dedup
[[ "$par_dont_eval_duplication" == "false" ]] && unset par_dont_eval_duplication
[[ "$par_trim_poly_g" == "false" ]] && unset par_trim_poly_g
[[ "$par_disable_trim_poly_g" == "false" ]] && unset par_disable_trim_poly_g
[[ "$par_trim_poly_x" == "false" ]] && unset par_trim_poly_x
[[ "$par_disable_quality_filtering" == "false" ]] && unset par_disable_quality_filtering
[[ "$par_disable_length_filtering" == "false" ]] && unset par_disable_length_filtering
[[ "$par_low_complexity_filter" == "false" ]] && unset par_low_complexity_filter
[[ "$par_correction" == "false" ]] && unset par_correction
[[ "$par_umi" == "false" ]] && unset par_umi
[[ "$par_overrepresentation_analysis" == "false" ]] && unset par_overrepresentation_analysis
|
||||
|
||||
# run command
|
||||
fastp \
|
||||
-i "$par_in1" \
|
||||
-o "$par_out1" \
|
||||
${par_in2:+--in2 "${par_in2}"} \
|
||||
${par_out2:+--out2 "${par_out2}"} \
|
||||
${par_unpaired1:+--unpaired1 "${par_unpaired1}"} \
|
||||
${par_unpaired2:+--unpaired2 "${par_unpaired2}"} \
|
||||
${par_failed_out:+--failed_out "${par_failed_out}"} \
|
||||
${par_overlapped_out:+--overlapped_out "${par_overlapped_out}"} \
|
||||
${par_json:+--json "${par_json}"} \
|
||||
${par_html:+--html "${par_html}"} \
|
||||
${par_report_title:+--report_title "${par_report_title}"} \
|
||||
${par_disable_adapter_trimming:+--disable_adapter_trimming} \
|
||||
${par_detect_adapter_for_pe:+--detect_adapter_for_pe} \
|
||||
${par_adapter_sequence:+--adapter_sequence "${par_adapter_sequence}"} \
|
||||
${par_adapter_sequence_r2:+--adapter_sequence_r2 "${par_adapter_sequence_r2}"} \
|
||||
${par_adapter_fasta:+--adapter_fasta "${par_adapter_fasta}"} \
|
||||
${par_trim_front1:+--trim_front1 "${par_trim_front1}"} \
|
||||
${par_trim_tail1:+--trim_tail1 "${par_trim_tail1}"} \
|
||||
${par_max_len1:+--max_len1 "${par_max_len1}"} \
|
||||
${par_trim_front2:+--trim_front2 "${par_trim_front2}"} \
|
||||
${par_trim_tail2:+--trim_tail2 "${par_trim_tail2}"} \
|
||||
${par_max_len2:+--max_len2 "${par_max_len2}"} \
|
||||
${par_merge:+--merge} \
|
||||
${par_merged_out:+--merged_out "${par_merged_out}"} \
|
||||
${par_include_unmerged:+--include_unmerged} \
|
||||
${par_interleaved_in:+--interleaved_in} \
|
||||
${par_fix_mgi_id:+--fix_mgi_id} \
|
||||
${par_phred64:+--phred64} \
|
||||
${par_compression:+--compression "${par_compression}"} \
|
||||
${par_dont_overwrite:+--dont_overwrite} \
|
||||
${par_verbose:+--verbose} \
|
||||
${par_reads_to_process:+--reads_to_process "${par_reads_to_process}"} \
|
||||
${par_dedup:+--dedup} \
|
||||
${par_dup_calc_accuracy:+--dup_calc_accuracy "${par_dup_calc_accuracy}"} \
|
||||
${par_dont_eval_duplication:+--dont_eval_duplication} \
|
||||
${par_trim_poly_g:+--trim_poly_g} \
|
||||
${par_poly_g_min_len:+--poly_g_min_len "${par_poly_g_min_len}"} \
|
||||
${par_disable_trim_poly_g:+--disable_trim_poly_g} \
|
||||
${par_trim_poly_x:+--trim_poly_x} \
|
||||
${par_poly_x_min_len:+--poly_x_min_len "${par_poly_x_min_len}"} \
|
||||
${par_cut_front:+--cut_front "${par_cut_front}"} \
|
||||
${par_cut_tail:+--cut_tail "${par_cut_tail}"} \
|
||||
${par_cut_right:+--cut_right "${par_cut_right}"} \
|
||||
${par_cut_window_size:+--cut_window_size "${par_cut_window_size}"} \
|
||||
${par_cut_mean_quality:+--cut_mean_quality "${par_cut_mean_quality}"} \
|
||||
${par_cut_front_window_size:+--cut_front_window_size "${par_cut_front_window_size}"} \
|
||||
${par_cut_front_mean_quality:+--cut_front_mean_quality "${par_cut_front_mean_quality}"} \
|
||||
${par_cut_tail_window_size:+--cut_tail_window_size "${par_cut_tail_window_size}"} \
|
||||
${par_cut_tail_mean_quality:+--cut_tail_mean_quality "${par_cut_tail_mean_quality}"} \
|
||||
${par_cut_right_window_size:+--cut_right_window_size "${par_cut_right_window_size}"} \
|
||||
${par_cut_right_mean_quality:+--cut_right_mean_quality "${par_cut_right_mean_quality}"} \
|
||||
${par_disable_quality_filtering:+--disable_quality_filtering} \
|
||||
${par_qualified_quality_phred:+--qualified_quality_phred "${par_qualified_quality_phred}"} \
|
||||
${par_unqualified_percent_limit:+--unqualified_percent_limit "${par_unqualified_percent_limit}"} \
|
||||
${par_n_base_limit:+--n_base_limit "${par_n_base_limit}"} \
|
||||
${par_average_qual:+--average_qual "${par_average_qual}"} \
|
||||
${par_disable_length_filtering:+--disable_length_filtering} \
|
||||
${par_length_required:+--length_required "${par_length_required}"} \
|
||||
${par_length_limit:+--length_limit "${par_length_limit}"} \
|
||||
${par_low_complexity_filter:+--low_complexity_filter} \
|
||||
${par_complexity_threshold:+--complexity_threshold "${par_complexity_threshold}"} \
|
||||
${par_filter_by_index1:+--filter_by_index1 "${par_filter_by_index1}"} \
|
||||
${par_filter_by_index2:+--filter_by_index2 "${par_filter_by_index2}"} \
|
||||
${par_filter_by_index_threshold:+--filter_by_index_threshold "${par_filter_by_index_threshold}"} \
|
||||
${par_correction:+--correction} \
|
||||
${par_overlap_len_require:+--overlap_len_require "${par_overlap_len_require}"} \
|
||||
${par_overlap_diff_limit:+--overlap_diff_limit "${par_overlap_diff_limit}"} \
|
||||
${par_overlap_diff_percent_limit:+--overlap_diff_percent_limit "${par_overlap_diff_percent_limit}"} \
|
||||
${par_umi:+--umi} \
|
||||
${par_umi_loc:+--umi_loc "${par_umi_loc}"} \
|
||||
${par_umi_len:+--umi_len "${par_umi_len}"} \
|
||||
${par_umi_prefix:+--umi_prefix "${par_umi_prefix}"} \
|
||||
${par_umi_skip:+--umi_skip "${par_umi_skip}"} \
|
||||
${par_umi_delim:+--umi_delim "${par_umi_delim}"} \
|
||||
${par_overrepresentation_analysis:+--overrepresentation_analysis} \
|
||||
${par_overrepresentation_sampling:+--overrepresentation_sampling "${par_overrepresentation_sampling}"} \
|
||||
${meta_cpus:+--thread "${meta_cpus}"}
|
||||
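The script above relies on two bash idioms: unsetting a boolean parameter when it holds the string "false", and `${var:+...}` expansion, which emits its contents only when the variable is set and non-empty. A minimal, self-contained sketch of the pattern follows. `demo_flags` is a hypothetical function, and it prints the arguments it would pass (one per line) instead of calling fastp.

```bash
#!/bin/bash

# Hypothetical demo (not part of the component): prints, one per line, the
# words that would be passed to fastp for a boolean and a valued parameter.
demo_flags() {
  local par_merge="$1"       # assumed to arrive as the string "true" or "false"
  local par_merged_out="$2"  # may be empty

  # A "false" boolean is unset so the expansion below produces nothing.
  [[ "$par_merge" == "false" ]] && unset par_merge

  # ${var:+word} expands to word only if var is set and non-empty.
  # The inner double quotes keep a value containing spaces as a single word.
  printf '%s\n' fastp \
    ${par_merge:+--merge} \
    ${par_merged_out:+--merged_out "${par_merged_out}"}
}
```

For example, `demo_flags false ""` prints only `fastp`, while `demo_flags true "m.fq"` prints `fastp`, `--merge`, `--merged_out` and `m.fq` on separate lines. Any boolean parameter that is forwarded with `${var:+--flag}` needs a matching unset line, otherwise the string "false" still counts as non-empty and the flag is passed.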
74
src/fastp/test.sh
Normal file
@@ -0,0 +1,74 @@
|
||||
#!/bin/bash
|
||||
|
||||
set -e
|
||||
|
||||
## VIASH START
|
||||
meta_executable="target/docker/fastp/fastp"
|
||||
meta_resources_dir="src/fastp"
|
||||
## VIASH END
|
||||
|
||||
#########################################################################################
|
||||
mkdir fastp_se
|
||||
cd fastp_se
|
||||
|
||||
echo "> Run fastp on SE"
|
||||
"$meta_executable" \
|
||||
--in1 "$meta_resources_dir/test_data/se/a.fastq" \
|
||||
--out1 "trimmed.fastq" \
|
||||
--failed_out "failed.fastq" \
|
||||
--json "report.json" \
|
||||
--html "report.html" \
|
||||
--adapter_sequence ACGGCTAGCTA
|
||||
|
||||
echo ">> Check if output exists"
|
||||
[ ! -f "trimmed.fastq" ] && echo ">> trimmed.fastq does not exist" && exit 1
|
||||
[ ! -f "failed.fastq" ] && echo ">> failed.fastq does not exist" && exit 1
|
||||
[ ! -f "report.json" ] && echo ">> report.json does not exist" && exit 1
|
||||
[ ! -f "report.html" ] && echo ">> report.html does not exist" && exit 1
|
||||
|
||||
#########################################################################################
|
||||
cd ..
|
||||
mkdir fastp_pe_minimal
|
||||
cd fastp_pe_minimal
|
||||
|
||||
echo "> Run fastp on PE with minimal parameters"
|
||||
"$meta_executable" \
|
||||
--in1 "$meta_resources_dir/test_data/pe/a.1.fastq" \
|
||||
--in2 "$meta_resources_dir/test_data/pe/a.2.fastq" \
|
||||
--out1 "trimmed_1.fastq" \
|
||||
--out2 "trimmed_2.fastq"
|
||||
|
||||
echo ">> Check if output exists"
|
||||
[ ! -f "trimmed_1.fastq" ] && echo ">> trimmed_1.fastq does not exist" && exit 1
|
||||
[ ! -f "trimmed_2.fastq" ] && echo ">> trimmed_2.fastq does not exist" && exit 1
|
||||
|
||||
#########################################################################################
|
||||
cd ..
|
||||
mkdir fastp_pe_many
|
||||
cd fastp_pe_many
|
||||
|
||||
echo "> Run fastp on PE with many parameters"
|
||||
"$meta_executable" \
|
||||
--in1 "$meta_resources_dir/test_data/pe/a.1.fastq" \
|
||||
--in2 "$meta_resources_dir/test_data/pe/a.2.fastq" \
|
||||
--out1 "trimmed_1.fastq" \
|
||||
--out2 "trimmed_2.fastq" \
|
||||
--failed_out "failed.fastq" \
|
||||
--json "report.json" \
|
||||
--html "report.html" \
|
||||
--adapter_sequence ACGGCTAGCTA \
|
||||
--adapter_sequence_r2 AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC \
|
||||
--merge \
|
||||
--merged_out "merged.fastq"
|
||||
|
||||
echo ">> Check if output exists"
|
||||
[ ! -f "trimmed_1.fastq" ] && echo ">> trimmed_1.fastq does not exist" && exit 1
|
||||
[ ! -f "trimmed_2.fastq" ] && echo ">> trimmed_2.fastq does not exist" && exit 1
|
||||
[ ! -f "failed.fastq" ] && echo ">> failed.fastq does not exist" && exit 1
|
||||
[ ! -f "report.json" ] && echo ">> report.json does not exist" && exit 1
|
||||
[ ! -f "report.html" ] && echo ">> report.html does not exist" && exit 1
|
||||
[ ! -f "merged.fastq" ] && echo ">> merged.fastq does not exist" && exit 1
|
||||
|
||||
#########################################################################################
|
||||
|
||||
echo "> Test successful"
|
||||
4
src/fastp/test_data/pe/a.1.fastq
Normal file
@@ -0,0 +1,4 @@
|
||||
@1
|
||||
ACGGCAT
|
||||
+
|
||||
!!!!!!!
|
||||
4
src/fastp/test_data/pe/a.2.fastq
Normal file
@@ -0,0 +1,4 @@
|
||||
@1
|
||||
ACGGCAT
|
||||
+
|
||||
!!!!!!!
|
||||
10
src/fastp/test_data/script.sh
Executable file
@@ -0,0 +1,10 @@
|
||||
# fastp test data
|
||||
|
||||
# Test data was obtained from https://github.com/snakemake/snakemake-wrappers/tree/master/bio/fastp/test
|
||||
|
||||
if [ ! -d /tmp/snakemake-wrappers ]; then
|
||||
git clone --depth 1 --single-branch --branch master https://github.com/snakemake/snakemake-wrappers /tmp/snakemake-wrappers
|
||||
fi
|
||||
|
||||
cp -r /tmp/snakemake-wrappers/bio/fastp/test/reads/* src/fastp/test_data
|
||||
|
||||
4
src/fastp/test_data/se/a.fastq
Normal file
@@ -0,0 +1,4 @@
|
||||
@1
|
||||
ACGGCAT
|
||||
+
|
||||
!!!!!!!
|
||||
338
src/featurecounts/config.vsh.yaml
Normal file
@@ -0,0 +1,338 @@
|
||||
name: featurecounts
|
||||
description: |
|
||||
featureCounts is a read summarization program for counting reads generated from either RNA or genomic DNA sequencing experiments by implementing highly efficient chromosome hashing and feature blocking techniques. It works with either single or paired-end reads and provides a wide range of options appropriate for different sequencing applications.
|
||||
keywords: ["Read counting", "Genomic features"]
|
||||
links:
|
||||
homepage: https://subread.sourceforge.net/
|
||||
documentation: https://subread.sourceforge.net/SubreadUsersGuide.pdf
|
||||
repository: https://github.com/ShiLab-Bioinformatics/subread
|
||||
references:
|
||||
doi: "10.1093/bioinformatics/btt656"
|
||||
license: GPL-3.0
|
||||
requirements:
|
||||
commands: [ featureCounts ]
|
||||
authors:
|
||||
- __merge__: /src/_authors/sai_nirmayi_yasa.yaml
|
||||
roles: [ author, maintainer ]
|
||||
argument_groups:
|
||||
- name: Inputs
|
||||
arguments:
|
||||
- name: --annotation
|
||||
alternatives: ["-a"]
|
||||
type: file
|
||||
description: |
|
||||
Name of an annotation file. GTF/GFF format by default. See '--format' option for more format information.
|
||||
required: true
|
||||
example: annotation.gtf
|
||||
- name: --input
|
||||
alternatives: ["-i"]
|
||||
type: file
|
||||
multiple: true
|
||||
description: |
|
||||
A list of SAM or BAM format files separated by semi-colon (;). They can be either name or location sorted. Location-sorted paired-end reads are automatically sorted by read names.
|
||||
required: true
|
||||
example: input_file1.bam
|
||||
|
||||
- name: Outputs
|
||||
arguments:
|
||||
- name: --counts
|
||||
alternatives: ["-o"]
|
||||
type: file
|
||||
direction: output
|
||||
description: |
|
||||
Name of output file including read counts in tab delimited format.
|
||||
required: true
|
||||
example: features.tsv
|
||||
- name: --summary
|
||||
type: file
|
||||
direction: output
|
||||
description: |
|
||||
Summary statistics of counting results in tab delimited format.
|
||||
required: false
|
||||
example: summary.tsv
|
||||
- name: --junctions
|
||||
type: file
|
||||
direction: output
|
||||
description: |
|
||||
Count number of reads supporting each exon-exon junction. Junctions were identified from those exon-spanning reads in the input (containing 'N' in CIGAR string).
|
||||
example: junctions.txt
|
||||
required: false
|
||||
|
||||
- name: Annotation
|
||||
arguments:
|
||||
- name: --format
|
||||
alternatives: ["-F"]
|
||||
type: string
|
||||
description: |
|
||||
Specify format of the provided annotation file. Acceptable formats include 'GTF' (or compatible GFF format) and 'SAF'. 'GTF' by default.
|
||||
choices: [GTF, GFF, SAF]
|
||||
example: "GTF"
|
||||
required: false
|
||||
- name: --feature_type
|
||||
alternatives: ["-t"]
|
||||
type: string
|
||||
description: |
|
||||
Specify feature type(s) in a GTF annotation. If multiple types are provided, they should be separated by ';' with no space in between. 'exon' by default. Rows in the annotation with a matched feature will be extracted and used for read mapping.
|
||||
example: "exon"
|
||||
required: false
|
||||
multiple: true
|
||||
- name: --attribute_type
|
||||
alternatives: ["-g"]
|
||||
type: string
|
||||
description: |
|
||||
Specify attribute type in GTF annotation. 'gene_id' by default. Meta-features used for read counting will be extracted from annotation using the provided value.
|
||||
example: "gene_id"
|
||||
required: false
|
||||
- name: --extra_attributes
|
||||
type: string
|
||||
description: |
|
||||
Extract extra attribute types from the provided GTF annotation and include them in the counting output. These attribute types will not be used to group features. If more than one attribute type is provided they should be separated by semicolon (;).
|
||||
required: false
|
||||
multiple: true
|
||||
- name: --chrom_alias
|
||||
alternatives: ["-A"]
|
||||
type: file
|
||||
description: |
|
||||
Provide a chromosome name alias file to match chr names in annotation with those in the reads. This should be a two-column comma-delimited text file. Its first column should include chr names in the annotation and its second column should include chr names in the reads. Chr names are case sensitive. No column header should be included in the file.
|
||||
required: false
|
||||
example: chrom_alias.csv
|
||||
|
||||
- name: Level of summarization
|
||||
arguments:
|
||||
- name: --feature_level
|
||||
alternatives: ["-f"]
|
||||
type: boolean_true
|
||||
description: |
|
||||
Perform read counting at feature level (eg. counting reads for exons rather than genes).
|
||||
|
||||
- name: Overlap between reads and features
|
||||
arguments:
|
||||
- name: --overlapping
|
||||
alternatives: ["-O"]
|
||||
type: boolean_true
|
||||
description: |
|
||||
Assign reads to all their overlapping meta-features (or features if '--feature_level' is specified).
|
||||
- name: --min_overlap
|
||||
type: integer
|
||||
description: |
|
||||
Minimum number of overlapping bases in a read that is required for read assignment. 1 by default. Number of overlapping bases is counted from both reads if paired end. If a negative value is provided, then a gap of up to specified size will be allowed between read and the feature that the read is assigned to.
|
||||
required: false
|
||||
example: 1
|
||||
- name: --frac_overlap
|
||||
type: double
|
||||
description: |
|
||||
Minimum fraction of overlapping bases in a read that is required for read assignment. Value should be within range [0,1]. 0 by default. Number of overlapping bases is counted from both reads if paired end. Both this option and '--min_overlap' option need to be satisfied for read assignment.
|
||||
required: false
|
||||
min: 0
|
||||
max: 1
|
||||
example: 0
|
||||
- name: --frac_overlap_feature
|
||||
type: double
|
||||
description: |
|
||||
Minimum fraction of overlapping bases in a feature that is required for read assignment. Value should be within range [0,1]. 0 by default.
|
||||
required: false
|
||||
min: 0
|
||||
max: 1
|
||||
example: 0
|
||||
- name: --largest_overlap
|
||||
type: boolean_true
|
||||
description: |
|
||||
Assign reads to a meta-feature/feature that has the largest number of overlapping bases.
|
||||
- name: --non_overlap
|
||||
type: integer
|
||||
description: |
|
||||
Maximum number of non-overlapping bases in a read (or a read pair) that is allowed when being assigned to a feature. No limit is set by default.
|
||||
required: false
|
||||
- name: --non_overlap_feature
|
||||
type: integer
|
||||
description: |
|
||||
Maximum number of non-overlapping bases in a feature that is allowed in read assignment. No limit is set by default.
|
||||
required: false
|
||||
- name: --read_extension5
|
||||
type: integer
|
||||
description: |
|
||||
Reads are extended upstream by <int> bases from their 5' end.
|
||||
required: false
|
||||
- name: --read_extension3
|
||||
type: integer
|
||||
description: |
|
||||
Reads are extended downstream by <int> bases from their 3' end.
|
||||
required: false
|
||||
- name: --read2pos
|
||||
type: integer
|
||||
description: |
|
||||
Reduce reads to their 5' most base or 3' most base. Read counting is then performed based on the single base the read is reduced to.
|
||||
required: false
|
||||
choices: [3, 5]
|
||||
|
||||
- name: Multi-mapping reads
|
||||
arguments:
|
||||
- name: --multi_mapping
|
||||
alternatives: ["-M"]
|
||||
type: boolean_true
|
||||
description: |
|
||||
Multi-mapping reads will also be counted. For a multi-mapping read, all its reported alignments will be counted. The 'NH' tag in BAM/SAM input is used to detect multi-mapping reads.
|
||||
|
||||
- name: Fractional counting
|
||||
arguments:
|
||||
- name: --fraction
|
||||
type: boolean_true
|
||||
description: |
|
||||
Assign fractional counts to features. This option must be used together with '--multi_mapping' or '--overlapping' or both. When '--multi_mapping' is specified, each reported alignment from a multi-mapping read (identified via 'NH' tag) will carry a fractional count of 1/x, instead of 1 (one), where x is the total number of alignments reported for the same read. When '--overlapping' is specified, each overlapping feature will receive a fractional count of 1/y, where y is the total number of features overlapping with the read. When both '--multi_mapping' and '--overlapping' are specified, each alignment will carry a fractional count of 1/(x*y).
|
||||
|
||||
- name: Read filtering
|
||||
arguments:
|
||||
- name: --min_map_quality
|
||||
alternatives: ["-Q"]
|
||||
type: integer
|
||||
description: |
|
||||
The minimum mapping quality score a read must satisfy in order to be counted. For paired-end reads, at least one end should satisfy this criterion. 0 by default.
|
||||
required: false
|
||||
example: 0
|
||||
- name: --split_only
|
||||
type: boolean_true
|
||||
description: |
|
||||
Count split alignments only (ie. alignments with CIGAR string containing 'N'). An example of split alignments is exon-spanning reads in RNA-seq data.
|
||||
- name: --non_split_only
|
||||
type: boolean_true
|
||||
description: |
|
||||
If specified, only non-split alignments (CIGAR strings do not contain letter 'N') will be counted. All the other alignments will be ignored.
|
||||
- name: --primary
|
||||
type: boolean_true
|
||||
description: |
|
||||
Count primary alignments only. Primary alignments are identified using bit 0x100 in SAM/BAM FLAG field.
|
||||
- name: --ignore_dup
|
||||
type: boolean_true
|
||||
description: |
|
||||
Ignore duplicate reads in read counting. Duplicate reads are identified using bit 0x400 in BAM/SAM FLAG field. The whole read pair is ignored if one of the reads is a duplicate read for paired end data.
|
||||
|
||||
- name: Strandedness
|
||||
arguments:
|
||||
- name: --strand
|
||||
alternatives: ["-s"]
|
||||
type: integer
|
||||
description: |
|
||||
Perform strand-specific read counting. A single integer value (applied to all input files) should be provided. Possible values include: 0 (unstranded), 1 (stranded) and 2 (reversely stranded). Default value is 0 (ie. unstranded read counting carried out for all input files).
|
||||
choices: [0, 1, 2]
|
||||
example: 0
|
||||
required: false
|
||||
|
||||
- name: Exon-exon junctions
|
||||
arguments:
|
||||
- name: --ref_fasta
|
||||
alternatives: ["-G"]
|
||||
type: file
|
||||
description: |
|
||||
Provide the name of a FASTA-format file that contains the reference sequences used in read mapping that produced the provided SAM/BAM files.
|
||||
required: false
|
||||
example: reference.fasta
|
||||
|
||||
- name: Parameters specific to paired end reads
|
||||
arguments:
|
||||
- name: --paired
|
||||
alternatives: ["-p"]
|
||||
type: boolean_true
|
||||
description: |
|
||||
Specify that input data contain paired-end reads. To perform fragment counting (ie. counting read pairs), the '--count_read_pairs' parameter should also be specified in addition to this parameter.
|
||||
- name: --count_read_pairs
|
||||
type: boolean_true
|
||||
description: |
|
||||
Count read pairs (fragments) instead of reads. This option is only applicable for paired-end reads.
|
||||
- name: --both_aligned
|
||||
alternatives: ["-B"]
|
||||
type: boolean_true
|
||||
description: |
|
||||
Only count read pairs that have both ends aligned.
|
||||
- name: --check_pe_dist
|
||||
alternatives: ["-P"]
|
||||
type: boolean_true
|
||||
description: |
|
||||
Check validity of paired-end distance when counting read pairs. Use '--min_length' and '--max_length' to set thresholds.
|
||||
- name: --min_length
|
||||
alternatives: ["-d"]
|
||||
type: integer
|
||||
description: |
|
||||
Minimum fragment/template length, 50 by default.
|
||||
required: false
|
||||
example: 50
|
||||
- name: --max_length
|
||||
alternatives: ["-D"]
|
||||
type: integer
|
||||
description: |
|
||||
Maximum fragment/template length, 600 by default.
|
||||
required: false
|
||||
example: 600
|
||||
- name: --same_strand
|
||||
alternatives: ["-C"]
|
||||
type: boolean_true
|
||||
description: |
|
||||
Do not count read pairs that have their two ends mapping to different chromosomes or mapping to same chromosome but on different strands.
|
||||
- name: --donotsort
|
||||
type: boolean_true
|
||||
description: |
|
||||
Do not sort reads in BAM/SAM input. Note that reads from the same pair are required to be located next to each other in the input.
|
||||
|
||||
- name: Read groups
|
||||
arguments:
|
||||
- name: --by_read_group
|
||||
type: boolean_true
|
||||
description: |
|
||||
Assign reads by read group. "RG" tag is required to be present in the input BAM/SAM files.
|
||||
|
||||
- name: Long reads
|
||||
arguments:
|
||||
- name: --long_reads
|
||||
type: boolean_true
|
||||
description: |
|
||||
Count long reads such as Nanopore and PacBio reads. Long read counting can only run in one thread and only reads (not read-pairs) can be counted. There is no limitation on the number of 'M' operations allowed in a CIGAR string in long read counting.
|
||||
|
||||
- name: Assignment results for each read
|
||||
arguments:
|
||||
- name: --detailed_results
|
||||
type: file
|
||||
direction: output
|
||||
description: |
|
||||
Directory to save the detailed assignment results. Use `--detailed_results_format` to determine the format of the detailed results.
|
||||
example: detailed_results/
|
||||
required: false
|
||||
- name: --detailed_results_format
|
||||
alternatives: ["-R"]
|
||||
type: string
|
||||
description: |
|
||||
Output detailed assignment results for each read or read-pair. Results are saved to a file that is in one of the following formats: CORE, SAM and BAM. See documentation for more info about these formats.
|
||||
required: false
|
||||
choices: [CORE, SAM, BAM]
|
||||
|
||||
- name: Miscellaneous
|
||||
arguments:
|
||||
- name: --max_M_op
|
||||
type: integer
|
||||
description: |
|
||||
Maximum number of 'M' operations allowed in a CIGAR string. 10 by default. Both 'X' and '=' are treated as 'M' and adjacent 'M' operations are merged in the CIGAR string.
|
||||
required: false
|
||||
example: 10
|
||||
- name: --verbose
|
||||
type: boolean_true
|
||||
description: |
|
||||
Output verbose information for debugging, such as un-matched chromosome/contig names.
|
||||
|
||||
resources:
|
||||
- type: bash_script
|
||||
path: script.sh
|
||||
|
||||
test_resources:
|
||||
- type: bash_script
|
||||
path: test.sh
|
||||
- type: file
|
||||
path: test_data
|
||||
|
||||
engines:
|
||||
- type: docker
|
||||
image: quay.io/biocontainers/subread:2.0.6--he4a0461_0
|
||||
setup:
|
||||
- type: docker
|
||||
run: |
|
||||
featureCounts -v 2>&1 | sed 's/featureCounts v\([0-9.]*\)/featureCounts: \1/' > /var/software_versions.txt
|
||||
runners:
|
||||
- type: executable
|
||||
- type: nextflow
|
||||
242
src/featurecounts/help.txt
Normal file
@@ -0,0 +1,242 @@
|
||||
```bash
|
||||
featureCounts
|
||||
```
|
||||
|
||||
Version 2.0.3
|
||||
|
||||
Usage: featureCounts [options] -a <annotation_file> -o <output_file> input_file1 [input_file2] ...
|
||||
|
||||
## Mandatory arguments:
|
||||
|
||||
-a <string> Name of an annotation file. GTF/GFF format by default. See
|
||||
-F option for more format information. Inbuilt annotations
|
||||
(SAF format) is available in 'annotation' directory of the
|
||||
package. Gzipped file is also accepted.
|
||||
|
||||
-o <string> Name of output file including read counts. A separate file
|
||||
including summary statistics of counting results is also
|
||||
included in the output ('<string>.summary'). Both files
|
||||
are in tab delimited format.
|
||||
|
||||
input_file1 [input_file2] ... A list of SAM or BAM format files. They can be
|
||||
either name or location sorted. If no files provided,
|
||||
<stdin> input is expected. Location-sorted paired-end reads
|
||||
are automatically sorted by read names.
|
||||
|
||||
## Optional arguments:
|
||||
# Annotation
|
||||
|
||||
-F <string> Specify format of the provided annotation file. Acceptable
|
||||
formats include 'GTF' (or compatible GFF format) and
|
||||
'SAF'. 'GTF' by default. For SAF format, please refer to
|
||||
Users Guide.
|
||||
|
||||
-t <string> Specify feature type(s) in a GTF annotation. If multiple
|
||||
types are provided, they should be separated by ',' with
|
||||
no space in between. 'exon' by default. Rows in the
|
||||
annotation with a matched feature will be extracted and
|
||||
used for read mapping.
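The comma-separated form that `-t` expects can be built from a list of feature types. The following is a minimal sketch; the `feature_types` array and its values are illustrative assumptions, not part of the featureCounts output above.

```bash
# Hypothetical list of feature types to count (assumption for illustration).
feature_types=(exon CDS)

# Join the array elements with commas and no spaces, as '-t' expects.
# IFS is changed only inside the subshell, so the caller's IFS is untouched.
joined=$(IFS=','; echo "${feature_types[*]}")

echo "$joined"   # prints: exon,CDS
```

The resulting string could then be passed as `-t "$joined"`.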
|
||||
|
||||
-g <string> Specify attribute type in GTF annotation. 'gene_id' by
|
||||
default. Meta-features used for read counting will be
|
||||
extracted from annotation using the provided value.
|
||||
|
||||
--extraAttributes Extract extra attribute types from the provided GTF
|
||||
annotation and include them in the counting output. These
|
||||
attribute types will not be used to group features. If
|
||||
more than one attribute type is provided they should be
|
||||
separated by comma.
|
||||
|
||||
-A <string> Provide a chromosome name alias file to match chr names in
|
||||
annotation with those in the reads. This should be a two-
|
||||
column comma-delimited text file. Its first column should
|
||||
include chr names in the annotation and its second column
|
||||
should include chr names in the reads. Chr names are case
|
||||
sensitive. No column header should be included in the
|
||||
file.
|
||||
|
||||
# Level of summarization
|
||||
|
||||
-f Perform read counting at feature level (eg. counting
|
||||
reads for exons rather than genes).
|
||||
|
||||
# Overlap between reads and features
|
||||
|
||||
-O Assign reads to all their overlapping meta-features (or
|
||||
features if -f is specified).
|
||||
|
||||
--minOverlap <int> Minimum number of overlapping bases in a read that is
|
||||
required for read assignment. 1 by default. Number of
|
||||
overlapping bases is counted from both reads if paired
|
||||
end. If a negative value is provided, then a gap of up
|
||||
to specified size will be allowed between read and the
|
||||
feature that the read is assigned to.
|
||||
|
||||
--fracOverlap <float> Minimum fraction of overlapping bases in a read that is
|
||||
required for read assignment. Value should be within range
|
||||
[0,1]. 0 by default. Number of overlapping bases is
|
||||
counted from both reads if paired end. Both this option
|
||||
and '--minOverlap' option need to be satisfied for read
|
||||
assignment.
|
||||
|
||||
--fracOverlapFeature <float> Minimum fraction of overlapping bases in a
|
||||
feature that is required for read assignment. Value
|
||||
should be within range [0,1]. 0 by default.
|
||||
|
||||
--largestOverlap Assign reads to a meta-feature/feature that has the
|
||||
largest number of overlapping bases.
|
||||
|
||||
--nonOverlap <int> Maximum number of non-overlapping bases in a read (or a
|
||||
read pair) that is allowed when being assigned to a
|
||||
feature. No limit is set by default.
|
||||
|
||||
--nonOverlapFeature <int> Maximum number of non-overlapping bases in a feature
|
||||
that is allowed in read assignment. No limit is set by
|
||||
default.
|
||||
|
||||
--readExtension5 <int> Reads are extended upstream by <int> bases from their
|
||||
5' end.
|
||||
|
||||
--readExtension3 <int> Reads are extended upstream by <int> bases from their
|
||||
3' end.
|
||||
|
||||
--read2pos <5:3> Reduce reads to their 5' most base or 3' most base. Read
|
||||
counting is then performed based on the single base the
|
||||
read is reduced to.
|
||||
|
||||
# Multi-mapping reads
|
||||
|
||||
-M Multi-mapping reads will also be counted. For a multi-
|
||||
mapping read, all its reported alignments will be
|
||||
counted. The 'NH' tag in BAM/SAM input is used to detect
|
||||
multi-mapping reads.
|
||||
|
||||
# Fractional counting
|
||||
|
||||
--fraction Assign fractional counts to features. This option must
|
||||
be used together with '-M' or '-O' or both. When '-M' is
|
||||
specified, each reported alignment from a multi-mapping
|
||||
read (identified via 'NH' tag) will carry a fractional
|
||||
count of 1/x, instead of 1 (one), where x is the total
|
||||
number of alignments reported for the same read. When '-O'
|
||||
is specified, each overlapping feature will receive a
|
||||
fractional count of 1/y, where y is the total number of
|
||||
features overlapping with the read. When both '-M' and
|
||||
'-O' are specified, each alignment will carry a fractional
|
||||
count of 1/(x*y).
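As a worked example of the arithmetic described for `--fraction`, the sketch below computes the count contributed by one alignment. The function name `fractional_count` is a hypothetical helper, not part of featureCounts; it only reproduces the 1/(x*y) formula stated above.

```bash
# Hypothetical helper: x = number of alignments reported for the read (NH tag),
# y = number of features overlapping the alignment.
fractional_count() {
  awk -v x="$1" -v y="$2" 'BEGIN { printf "%.4f\n", 1 / (x * y) }'
}

# A read with 2 reported alignments, each overlapping 3 features ('-M -O --fraction').
fractional_count 2 3   # prints: 0.1667

# With '-M' only, y is effectively 1, so each alignment carries 1/x.
fractional_count 2 1   # prints: 0.5000
```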
|
||||
|
||||
# Read filtering
|
||||
|
||||
-Q <int> The minimum mapping quality score a read must satisfy in
|
||||
order to be counted. For paired-end reads, at least one
|
||||
end should satisfy this criteria. 0 by default.
|
||||
|
||||
--splitOnly Count split alignments only (ie. alignments with CIGAR
|
||||
string containing 'N'). An example of split alignments is
|
||||
exon-spanning reads in RNA-seq data.
|
||||
|
||||
--nonSplitOnly If specified, only non-split alignments (CIGAR strings do
|
||||
not contain letter 'N') will be counted. All the other
|
||||
alignments will be ignored.
|
||||
|
||||
--primary Count primary alignments only. Primary alignments are
|
||||
identified using bit 0x100 in SAM/BAM FLAG field.
|
||||
|
||||
--ignoreDup Ignore duplicate reads in read counting. Duplicate reads
|
||||
are identified using bit Ox400 in BAM/SAM FLAG field. The
|
||||
whole read pair is ignored if one of the reads is a
|
||||
duplicate read for paired end data.
|
||||
|
||||
# Strandness
|
||||
|
||||
-s <int or string> Perform strand-specific read counting. A single integer
|
||||
value (applied to all input files) or a string of comma-
|
||||
separated values (applied to each corresponding input
|
||||
file) should be provided. Possible values include:
|
||||
0 (unstranded), 1 (stranded) and 2 (reversely stranded).
|
||||
Default value is 0 (ie. unstranded read counting carried
|
||||
out for all input files).
|
||||
|
||||
# Exon-exon junctions
|
||||
|
||||
-J Count number of reads supporting each exon-exon junction.
|
||||
Junctions were identified from those exon-spanning reads
|
||||
in the input (containing 'N' in CIGAR string). Counting
|
||||
results are saved to a file named '<output_file>.jcounts'
|
||||
|
||||
-G <string> Provide the name of a FASTA-format file that contains the
|
||||
reference sequences used in read mapping that produced the
|
||||
provided SAM/BAM files. This optional argument can be used
|
||||
with '-J' option to improve read counting for junctions.
|
||||
|
||||
# Parameters specific to paired end reads
|
||||
|
||||
-p Specify that input data contain paired-end reads. To
|
||||
perform fragment counting (ie. counting read pairs), the
|
||||
'--countReadPairs' parameter should also be specified in
|
||||
addition to this parameter.
|
||||
|
||||
--countReadPairs Count read pairs (fragments) instead of reads. This option
|
||||
is only applicable for paired-end reads.
|
||||
|
||||
-B Only count read pairs that have both ends aligned.
|
||||
|
||||
-P Check validity of paired-end distance when counting read
|
||||
pairs. Use -d and -D to set thresholds.
|
||||
|
||||
-d <int> Minimum fragment/template length, 50 by default.
|
||||
|
||||
-D <int> Maximum fragment/template length, 600 by default.
|
||||
|
||||
-C Do not count read pairs that have their two ends mapping
|
||||
to different chromosomes or mapping to same chromosome
|
||||
but on different strands.
|
||||
|
||||
--donotsort Do not sort reads in BAM/SAM input. Note that reads from
|
||||
the same pair are required to be located next to each
|
||||
other in the input.
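The paired-end options above are typically combined in one call. The following is a sketch of such a call using only flags listed in this help text; the file names are placeholder assumptions, and the command is only printed here rather than executed.

```bash
# Hypothetical file names (assumptions); replace with real paths.
annotation="annotation.gtf"
output="counts.txt"
bam="sample.bam"

# Build the command as an array so each argument stays a separate word.
# -p and --countReadPairs: count fragments; -B: both ends must be aligned;
# -P with -d/-D: check the fragment length against the given thresholds.
cmd=(featureCounts -p --countReadPairs -B -P -d 50 -D 600 -a "$annotation" -o "$output" "$bam")

# Print the command for inspection; run it with "${cmd[@]}" once the inputs exist.
printf '%s ' "${cmd[@]}"; echo
```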
|
||||
|
||||
# Number of CPU threads
|
||||
|
||||
-T <int> Number of the threads. 1 by default.
|
||||
|
||||
# Read groups
|
||||
|
||||
--byReadGroup Assign reads by read group. "RG" tag is required to be
|
||||
present in the input BAM/SAM files.
|
||||
|
||||
|
||||
# Long reads
|
||||
|
||||
-L Count long reads such as Nanopore and PacBio reads. Long
|
||||
read counting can only run in one thread and only reads
|
||||
(not read-pairs) can be counted. There is no limitation on
|
||||
the number of 'M' operations allowed in a CIGAR string in
|
||||
long read counting.
|
||||
|
||||
# Assignment results for each read
|
||||
|
||||
-R <format> Output detailed assignment results for each read or read-
|
||||
pair. Results are saved to a file that is in one of the
|
||||
following formats: CORE, SAM and BAM. See Users Guide for
|
||||
more info about these formats.
|
||||
|
||||
--Rpath <string> Specify a directory to save the detailed assignment
|
||||
results. If unspecified, the directory where counting
|
||||
results are saved is used.
|
||||
|
||||
# Miscellaneous
|
||||
|
||||
--tmpDir <string> Directory under which intermediate files are saved (later
|
||||
removed). By default, intermediate files will be saved to
|
||||
the directory specified in '-o' argument.
|
||||
|
||||
--maxMOp <int> Maximum number of 'M' operations allowed in a CIGAR
|
||||
string. 10 by default. Both 'X' and '=' are treated as 'M'
|
||||
and adjacent 'M' operations are merged in the CIGAR
|
||||
string.
|
||||
|
||||
--verbose Output verbose information for debugging, such as un-
|
||||
matched chromosome/contig names.
|
||||
|
||||
-v Output version of the program.
|
||||
94
src/featurecounts/script.sh
Normal file
@@ -0,0 +1,94 @@
|
||||
#!/bin/bash
|
||||
|
||||
set -e
|
||||
|
||||
## VIASH START
|
||||
## VIASH END
|
||||
|
||||
# create temporary directory
|
||||
tmp_dir=$(mktemp -d -p "$meta_temp_dir" "${meta_functionality_name}_XXXXXX")
|
||||
mkdir -p "$tmp_dir/temp"
|
||||
|
||||
# create detailed_results directory if variable is set and directory does not exist
|
||||
if [[ ! -z "$par_detailed_results" ]] && [[ ! -d "$par_detailed_results" ]]; then
|
||||
mkdir -p "$par_detailed_results"
|
||||
fi
|
||||
|
||||
# featureCounts expects comma-separated lists, so replace semicolons with commas
|
||||
par_feature_type=$(echo "$par_feature_type" | tr ';' ',')
|
||||
par_extra_attributes=$(echo "$par_extra_attributes" | tr ';' ',')
|
||||
|
||||
# unset flag variables
|
||||
[[ "$par_feature_level" == "false" ]] && unset par_feature_level
|
||||
[[ "$par_overlapping" == "false" ]] && unset par_overlapping
|
||||
[[ "$par_largest_overlap" == "false" ]] && unset par_largest_overlap
|
||||
[[ "$par_multi_mapping" == "false" ]] && unset par_multi_mapping
|
||||
[[ "$par_fraction" == "false" ]] && unset par_fraction
|
||||
[[ "$par_split_only" == "false" ]] && unset par_split_only
|
||||
[[ "$par_non_split_only" == "false" ]] && unset par_non_split_only
|
||||
[[ "$par_primary" == "false" ]] && unset par_primary
|
||||
[[ "$par_ignore_dup" == "false" ]] && unset par_ignore_dup
|
||||
[[ "$par_paired" == "false" ]] && unset par_paired
|
||||
[[ "$par_count_read_pairs" == "false" ]] && unset par_count_read_pairs
|
||||
[[ "$par_both_aligned" == "false" ]] && unset par_both_aligned
|
||||
[[ "$par_check_pe_dist" == "false" ]] && unset par_check_pe_dist
|
||||
[[ "$par_same_strand" == "false" ]] && unset par_same_strand
|
||||
[[ "$par_donotsort" == "false" ]] && unset par_donotsort
|
||||
[[ "$par_by_read_group" == "false" ]] && unset par_by_read_group
|
||||
[[ "$par_long_reads" == "false" ]] && unset par_long_reads
|
||||
[[ "$par_verbose" == "false" ]] && unset par_verbose
|
||||
|
||||
IFS=";" read -ra input <<< "$par_input"
|
||||
|
||||
featureCounts \
|
||||
${par_format:+-F "${par_format}"} \
|
||||
${par_feature_type:+-t "${par_feature_type}"} \
|
||||
${par_attribute_type:+-g "${par_attribute_type}"} \
|
||||
${par_extra_attributes:+--extraAttributes "${par_extra_attributes}"} \
|
||||
${par_chrom_alias:+-A "${par_chrom_alias}"} \
|
||||
${par_feature_level:+-f} \
|
||||
${par_overlapping:+-O} \
|
||||
${par_min_overlap:+--minOverlap "${par_min_overlap}"} \
|
||||
${par_frac_overlap:+--fracOverlap "${par_frac_overlap}"} \
|
||||
${par_frac_overlap_feature:+--fracOverlapFeature "${par_frac_overlap_feature}"} \
|
||||
${par_largest_overlap:+--largestOverlap} \
|
||||
${par_non_overlap:+--nonOverlap "${par_non_overlap}"} \
|
||||
${par_non_overlap_feature:+--nonOverlapFeature "${par_non_overlap_feature}"} \
|
||||
${par_read_extension5:+--readExtension5 "${par_read_extension5}"} \
|
||||
${par_read_extension3:+--readExtension3 "${par_read_extension3}"} \
|
||||
${par_read2pos:+--read2pos "${par_read2pos}"} \
|
||||
${par_multi_mapping:+-M} \
|
||||
${par_fraction:+--fraction} \
|
||||
${par_min_map_quality:+-Q "${par_min_map_quality}"} \
|
||||
${par_split_only:+--splitOnly} \
|
||||
${par_non_split_only:+--nonSplitOnly} \
|
||||
${par_primary:+--primary} \
|
||||
${par_ignore_dup:+--ignoreDup} \
|
||||
${par_strand:+-s "${par_strand}"} \
|
||||
${par_junctions:+-J} \
|
||||
${par_ref_fasta:+-G "${par_ref_fasta}"} \
|
||||
${par_paired:+-p} \
|
||||
${par_count_read_pairs:+--countReadPairs} \
|
||||
${par_both_aligned:+-B} \
|
||||
${par_check_pe_dist:+-P} \
|
||||
${par_min_length:+-d "${par_min_length}"} \
|
||||
${par_max_length:+-D "${par_max_length}"} \
|
||||
${par_same_strand:+-C} \
|
||||
${par_donotsort:+--donotsort} \
|
||||
${par_by_read_group:+--byReadGroup} \
|
||||
${par_long_reads:+-L} \
|
||||
${par_detailed_results:+--Rpath "${par_detailed_results}"} \
|
||||
${par_detailed_results_format:+-R "${par_detailed_results_format}"} \
|
||||
${par_max_M_op:+--maxMOp "${par_max_M_op}"} \
|
||||
${par_verbose:+--verbose} \
|
||||
${meta_cpus:+-T "${meta_cpus}"} \
|
||||
--tmpDir "$tmp_dir/temp" \
|
||||
-a "$par_annotation" \
|
||||
-o "$tmp_dir/output.txt" \
|
||||
"${input[@]}"
|
||||
|
||||
[[ ! -z "$par_counts" ]] && mv "$tmp_dir/output.txt" "$par_counts"
|
||||
[[ ! -z "$par_summary" ]] && mv "$tmp_dir/output.txt.summary" "$par_summary"
|
||||
if [[ ! -z "$par_junctions" ]] && [[ -e "$tmp_dir/output.txt.jcounts" ]]; then
|
||||
mv "$tmp_dir/output.txt.jcounts" "$par_junctions"
|
||||
fi
|
||||
59
src/featurecounts/test.sh
Normal file
@@ -0,0 +1,59 @@
|
||||
#!/bin/bash
|
||||
|
||||
set -e
|
||||
|
||||
dir_in="$meta_resources_dir/test_data"
|
||||
|
||||
echo "> Run featureCounts (with junctions)"
|
||||
"$meta_executable" \
|
||||
--input "$dir_in/a.bam" \
|
||||
--annotation "$dir_in/annotation.gtf" \
|
||||
--counts "features.tsv" \
|
||||
--summary "summary.tsv" \
|
||||
--junctions "junction_counts.txt" \
|
||||
--ref_fasta "$dir_in/genome.fasta" \
|
||||
--overlapping \
|
||||
--frac_overlap 0.2 \
|
||||
--paired \
|
||||
--strand 0 \
|
||||
--detailed_results detailed_results \
|
||||
--detailed_results_format SAM
|
||||
|
||||
echo ">> Checking output"
|
||||
[ ! -f "features.tsv" ] && echo "Output file features.tsv does not exist" && exit 1
|
||||
[ ! -f "summary.tsv" ] && echo "Output file summary.tsv does not exist" && exit 1
|
||||
[ ! -f "junction_counts.txt" ] && echo "Output file junction_counts.txt does not exist" && exit 1
|
||||
[ ! -d "detailed_results" ] && echo "Output directory detailed_results does not exist" && exit 1
|
||||
[ ! -f "detailed_results/a.bam.featureCounts.sam" ] && echo "Output file detailed_results/a.bam.featureCounts.sam does not exist" && exit 1
|
||||
|
||||
echo ">> Check if output is empty"
|
||||
[ ! -s "features.tsv" ] && echo "Output file features.tsv is empty" && exit 1
|
||||
[ ! -s "summary.tsv" ] && echo "Output file summary.tsv is empty" && exit 1
|
||||
[ ! -s "junction_counts.txt" ] && echo "Output file junction_counts.txt is empty" && exit 1
|
||||
[ ! -s "detailed_results/a.bam.featureCounts.sam" ] && echo "Output file detailed_results/a.bam.featureCounts.sam is empty" && exit 1
|
||||
|
||||
echo "> Run featureCounts (without junctions)"
|
||||
"$meta_executable" \
|
||||
--input "$dir_in/a.bam" \
|
||||
--annotation "$dir_in/annotation.gtf" \
|
||||
--counts "features.tsv" \
|
||||
--summary "summary.tsv" \
|
||||
--overlapping \
|
||||
--frac_overlap 0.2 \
|
||||
--paired \
|
||||
--strand 0 \
|
||||
--detailed_results detailed_results \
|
||||
--detailed_results_format SAM
|
||||
|
||||
echo ">> Checking output"
|
||||
[ ! -f "features.tsv" ] && echo "Output file features.tsv does not exist" && exit 1
|
||||
[ ! -f "summary.tsv" ] && echo "Output file summary.tsv does not exist" && exit 1
|
||||
[ ! -d "detailed_results" ] && echo "Output directory detailed_results does not exist" && exit 1
|
||||
[ ! -f "detailed_results/a.bam.featureCounts.sam" ] && echo "Output file detailed_results/a.bam.featureCounts.sam does not exist" && exit 1
|
||||
|
||||
echo ">> Check if output is empty"
|
||||
[ ! -s "features.tsv" ] && echo "Output file features.tsv is empty" && exit 1
|
||||
[ ! -s "summary.tsv" ] && echo "Output file summary.tsv is empty" && exit 1
|
||||
[ ! -s "detailed_results/a.bam.featureCounts.sam" ] && echo "Output file detailed_results/a.bam.featureCounts.sam is empty" && exit 1
|
||||
|
||||
echo "> Test successful"
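The existence and emptiness checks in this test repeat the same pattern for every output file. One possible way to factor them out is sketched below; `assert_file_not_empty` is a hypothetical helper and the temporary files are made up for the demonstration, so treat this as an illustration rather than part of the committed test.

```bash
#!/bin/bash

# Hypothetical helper: returns non-zero (with a message on stderr) when the
# given file is missing or has zero size.
assert_file_not_empty() {
  local path="$1"
  if [ ! -f "$path" ]; then
    echo "Output file $path does not exist" >&2
    return 1
  fi
  if [ ! -s "$path" ]; then
    echo "Output file $path is empty" >&2
    return 1
  fi
}

# Self-contained demonstration on throwaway files.
demo_dir=$(mktemp -d)
printf 'Geneid\tCount\n' > "$demo_dir/features.tsv"
: > "$demo_dir/empty.tsv"

assert_file_not_empty "$demo_dir/features.tsv" && ok_status="pass"
assert_file_not_empty "$demo_dir/empty.tsv" 2>/dev/null || empty_status="fail"
assert_file_not_empty "$demo_dir/missing.tsv" 2>/dev/null || missing_status="fail"

echo "$ok_status $empty_status $missing_status"
rm -rf "$demo_dir"
```

In a test script that uses `set -e`, a call such as `assert_file_not_empty "features.tsv"` on its own line would abort the run on the first failing check.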
|
||||
BIN
src/featurecounts/test_data/a.bam
Normal file
Binary file not shown.
6
src/featurecounts/test_data/annotation.gtf
Normal file
@@ -0,0 +1,6 @@
|
||||
1 havana gene 1 80 . + . gene_id "ENSG00000000000"; gene_version "5"; gene_name "A"; gene_source "havana"; gene_biotype "gene";
|
||||
1 havana transcript 1 80 . + . gene_id "ENSG00000000000"; gene_version "5"; transcript_id "ENST00000000000"; transcript_version "2"; gene_name "A"; gene_source "havana"; gene_biotype "gene"; transcript_name "A-202"; transcript_source "havana"; transcript_biotype "processed_transcript"; tag "basic"; transcript_support_level "1";
|
||||
1 havana exon 1 80 . + . gene_id "ENSG00000000000"; gene_version "5"; transcript_id "ENST00000000000"; transcript_version "2"; exon_number "1"; gene_name "A"; gene_source "havana"; gene_biotype "gene"; transcript_name "A-202"; transcript_source "havana"; transcript_biotype "processed_transcript"; exon_id "ENSE00000000000"; exon_version "1"; tag "basic"; transcript_support_level "1";
|
||||
2 havana gene 1 80 . + . gene_id "ENSG00000000001"; gene_version "5"; gene_name "B"; gene_source "havana"; gene_biotype "gene";
|
||||
2 havana transcript 1 80 . + . gene_id "ENSG00000000001"; gene_version "5"; transcript_id "ENST00000000001"; transcript_version "2"; gene_name "B"; gene_source "havana"; gene_biotype "gene"; transcript_name "B-202"; transcript_source "havana"; transcript_biotype "processed_transcript"; tag "basic"; transcript_support_level "1";
|
||||
2 havana exon 1 80 . + . gene_id "ENSG00000000001"; gene_version "5"; transcript_id "ENST00000000001"; transcript_version "2"; exon_number "1"; gene_name "B"; gene_source "havana"; gene_biotype "gene"; transcript_name "B-202"; transcript_source "havana"; transcript_biotype "processed_transcript"; exon_id "ENSE00000000001"; exon_version "1"; tag "basic"; transcript_support_level "1";
|
||||
4
src/featurecounts/test_data/genome.fasta
Normal file
@@ -0,0 +1,4 @@
|
||||
>1
|
||||
GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
|
||||
>2
|
||||
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
|
||||
9
src/featurecounts/test_data/script.sh
Normal file
@@ -0,0 +1,9 @@
|
||||
# featureCounts test data
|
||||
|
||||
# Test data was obtained from https://github.com/snakemake/snakemake-wrappers/tree/master/bio/subread/featurecounts/test
|
||||
|
||||
if [ ! -d /tmp/snakemake-wrappers ]; then
|
||||
git clone --depth 1 --single-branch --branch master https://github.com/snakemake/snakemake-wrappers /tmp/snakemake-wrappers
|
||||
fi
|
||||
|
||||
cp -r /tmp/snakemake-wrappers/bio/subread/featurecounts/test/* src/featurecounts/test_data
|
||||
397
src/gffread/config.vsh.yaml
Normal file
@@ -0,0 +1,397 @@
|
||||
name: gffread
|
||||
description: Validate, filter, convert and perform various other operations on GFF files.
|
||||
keywords: [gff, conversion, validation, filtering]
|
||||
links:
|
||||
homepage: https://ccb.jhu.edu/software/stringtie/gff.shtml#gffread
|
||||
documentation: https://ccb.jhu.edu/software/stringtie/gff.shtml#gffread
|
||||
repository: https://github.com/gpertea/gffread
|
||||
references:
|
||||
doi: 10.12688/f1000research.23297.2
|
||||
license: MIT
|
||||
authors:
|
||||
- __merge__: /src/_authors/emma_rousseau.yaml
|
||||
roles: [ author, maintainer ]
|
||||
argument_groups:
|
||||
- name: Inputs
|
||||
arguments:
|
||||
- name: --input
|
||||
type: file
|
||||
direction: input
|
||||
description: |
|
||||
A reference file in either the GFF3, GFF2 or GTF format.
|
||||
required: true
|
||||
example: annotation.gff
|
||||
- name: --chr_mapping
|
||||
alternatives: -m
|
||||
type: file
|
||||
direction: input
|
||||
description: |
|
||||
<chr_replace> is a name mapping table for converting reference sequence names,
|
||||
having this 2-column format: <original_ref_ID> <new_ref_ID>.
|
||||
- name: --seq_info
|
||||
alternatives: -s
|
||||
type: file
|
||||
direction: input
|
||||
description: |
|
||||
<seq_info.fsize> is a tab-delimited file providing this info for each of the mapped
|
||||
sequences: <seq-name> <seq-length> <seq-description> (useful for --description option with
|
||||
mRNA/EST/protein mappings).
|
||||
- name: --genome
|
||||
alternatives: -g
|
||||
type: file
|
||||
description: |
|
||||
Full path to a multi-fasta file with the genomic sequences for all input mappings,
|
||||
OR a directory with single-fasta files (one per genomic sequence, with file names
|
||||
matching sequence names).
|
||||
example: genome.fa
|
||||
- name: Outputs
|
||||
arguments:
|
||||
- name: --outfile
|
||||
alternatives: -o
|
||||
type: file
|
||||
direction: output
|
||||
required: true
|
||||
description: |
|
||||
Write the output records into <outfile>.
|
||||
example: output.gff
|
||||
- name: --force_exons
|
||||
type: boolean_true
|
||||
description: |
|
||||
Make sure that the lowest level GFF features are considered "exon" features.
|
||||
- name: --gene2exon
|
||||
type: boolean_true
|
||||
description: |
|
||||
For single-line genes not parenting any transcripts, add an exon feature spanning
|
||||
the entire gene (treat it as a transcript).
|
||||
- name: --t_adopt
|
||||
type: boolean_true
|
||||
description: |
|
||||
Try to find a parent gene overlapping/containing a transcript that does not have
|
||||
any explicit gene Parent.
|
||||
- name: --decode
|
||||
alternatives: -D
|
||||
type: boolean_true
|
||||
description: |
|
||||
Decode url encoded characters within attributes.
|
||||
- name: --merge_exons
|
||||
alternatives: -Z
|
||||
type: boolean_true
|
||||
description: |
|
||||
Merge very close exons into a single exon (when intron size<4).
|
||||
- name: --junctions
|
||||
alternatives: -j
|
||||
type: boolean_true
|
||||
description: |
|
||||
Output the junctions and the corresponding transcripts.
|
||||
- name: --spliced_exons
|
||||
alternatives: -w
|
||||
type: file
|
||||
direction: output
|
||||
must_exist: false
|
||||
description: |
|
||||
Write a fasta file with spliced exons for each transcript.
|
||||
example: exons.fa
|
||||
- name: --w_add
|
||||
type: integer
|
||||
description: |
|
||||
For the --spliced_exons option, extract additional <N> bases both upstream and
|
||||
downstream of the transcript boundaries.
|
||||
- name: --w_nocds
|
||||
type: boolean_true
|
||||
description: |
|
||||
For --spliced_exons, disable the output of CDS info in the FASTA file.
|
||||
- name: --spliced_cds
|
||||
alternatives: -x
|
||||
type: file
direction: output
|
||||
must_exist: false
|
||||
example: cds.fa
|
||||
description: |
|
||||
Write a fasta file with spliced CDS for each GFF transcript.
|
||||
- name: --tr_cds
|
||||
alternatives: -y
|
||||
type: file
direction: output
|
||||
must_exist: false
|
||||
example: tr_cds.fa
|
||||
description: |
|
||||
Write a protein fasta file with the translation of CDS for each record.
|
||||
- name: --w_coords
|
||||
alternatives: -W
|
||||
type: boolean_true
|
||||
description: |
|
||||
For --spliced_exons, --spliced_cds and --tr_cds options, write in the FASTA defline
|
||||
all the exon coordinates projected onto the spliced sequence.
|
||||
- name: --stop_dot
|
||||
alternatives: -S
|
||||
type: boolean_true
|
||||
description: |
|
||||
For --tr_cds option, use '*' instead of '.' as stop codon translation.
|
||||
- name: --id_version
|
||||
alternatives: -L
|
||||
type: boolean_true
|
||||
description: |
|
||||
Ensembl GTF to GFF3 conversion, adds version to IDs.
|
||||
- name: --trackname
|
||||
alternatives: -t
|
||||
type: string
|
||||
description: |
|
||||
Use <trackname> in the 2nd column of each GFF/GTF output line.
|
||||
- name: --gtf_output
|
||||
alternatives: -T
|
||||
type: boolean_true
|
||||
description: |
|
||||
Main output will be GTF instead of GFF3.
|
||||
- name: --bed
|
||||
type: boolean_true
|
||||
description: |
|
||||
Output records in BED format instead of default GFF3.
|
||||
- name: --tlf
|
||||
type: boolean_true
|
||||
description: |
|
||||
Output "transcript line format" which is like GFF but with exons and CDS related
|
||||
features stored as GFF attributes in the transcript feature line, like this:
|
||||
exoncount=N;exons=<exons>;CDSphase=<N>;CDS=<CDScoords>
|
||||
<exons> is a comma-delimited list of exon_start-exon_end coordinates;
|
||||
<CDScoords> is CDS_start:CDS_end coordinates or a list like <exons>.
|
||||
- name: --table
|
||||
type: string
|
||||
multiple: true
|
||||
description: |
|
||||
Output a simple tab delimited format instead of GFF, with columns having the values
|
||||
of GFF attributes given in <attrlist>; special pseudo-attributes (prefixed by @) are
|
||||
recognized:
|
||||
@id, @geneid, @chr, @start, @end, @strand, @numexons, @exons, @cds, @covlen, @cdslen
|
||||
If any of --spliced_exons/--tr_cds/--spliced_cds FASTA output files are enabled, the
|
||||
same fields (excluding @id) are appended to the definition line of corresponding FASTA
|
||||
records.
|
||||
- name: --expose_dups
|
||||
type: boolean_true
|
||||
alternatives: [-E, -v]
|
||||
description: |
|
||||
Expose (warn about) duplicate transcript IDs and other potential problems with the
|
||||
given GFF/GTF records.
|
||||
- name: Options
|
||||
arguments:
|
||||
- name: --ids
|
||||
type: file
|
||||
description: |
|
||||
Discard records/transcripts if their IDs are not listed in <IDs.lst>.
|
||||
- name: --nids
|
||||
type: file
|
||||
description: |
|
||||
Discard records/transcripts if their IDs are listed in <IDs.lst>.
|
||||
- name: --maxintron
|
||||
alternatives: -i
|
||||
type: integer
|
||||
description: |
|
||||
Discard transcripts having an intron larger than <maxintron>.
|
||||
- name: --minlen
|
||||
alternatives: -l
|
||||
type: integer
|
||||
description: |
|
||||
Discard transcripts shorter than <minlen> bases.
|
||||
- name: --range
|
||||
alternatives: -r
|
||||
type: string
|
||||
description: |
|
||||
Only show transcripts overlapping coordinate range <start>..<end> (on chromosome/contig
|
||||
<chr>, strand <strand> if provided).
|
||||
- name: --strict_range
|
||||
alternatives: -R
|
||||
type: boolean_true
|
||||
description: |
|
||||
For --range option, discard all transcripts that are not fully contained within the given
|
||||
range.
|
||||
- name: --jmatch
|
||||
type: string
|
||||
description: |
|
||||
Only output transcripts matching the given junction.
|
||||
- name: --no_single_exon
|
||||
alternatives: -U
|
||||
type: boolean_true
|
||||
description: |
|
||||
Discard single-exon transcripts.
|
||||
- name: --coding
|
||||
alternatives: -C
|
||||
type: boolean_true
|
||||
description: |
|
||||
Coding only: discard mRNAs that have no CDS features.
|
||||
- name: --nc
|
||||
type: boolean_true
|
||||
description: |
|
||||
Non-coding only: discard mRNAs that have CDS features.
|
||||
- name: --ignore_locus
|
||||
type: boolean_true
|
||||
description: |
|
||||
Discard locus features and attributes found in the input.
|
||||
- name: --description
|
||||
alternatives: -A
|
||||
type: boolean_true
|
||||
description: |
|
||||
Use the description field from <seq_info.fsize> and add it as the value for a 'descr'
|
||||
attribute to the GFF record.
|
||||
|
||||
- name: Sorting
|
||||
arguments:
|
||||
- name: --sort_alpha
|
||||
type: boolean_true
|
||||
description: |
|
||||
Chromosomes (reference sequences) are sorted alphabetically.
|
||||
- name: --sort_by
|
||||
type: file
|
||||
must_exist: true
|
||||
description: |
|
||||
Sort the reference sequences by the order in which their names are given in the
|
||||
<refseq.lst> file.
|
||||
- name: Misc options
|
||||
arguments:
|
||||
- name: --keep_attrs
|
||||
alternatives: -F
|
||||
type: boolean_true
|
||||
description: |
|
||||
Keep all GFF attributes (for non-exon features).
|
||||
- name: --keep_exon_attrs
|
||||
type: boolean_true
|
||||
description: |
|
||||
For the --keep_attrs option, do not attempt to reduce redundant exon/CDS attributes.
|
||||
- name: --no_exon_attrs
|
||||
alternatives: -G
|
||||
type: boolean_true
|
||||
description: |
|
||||
Do not keep exon attributes, move them to the transcript feature (for GFF3 output).
|
||||
- name: --attrs
|
||||
type: string
|
||||
description: |
|
||||
Only output the GTF/GFF attributes listed in <attr-list> which is a comma delimited
|
||||
list of attribute names.
|
||||
- name: --keep_genes
|
||||
type: boolean_true
|
||||
description: |
|
||||
In transcript-only mode (default), also preserve gene records.
|
||||
- name: --keep_comments
|
||||
type: boolean_true
|
||||
description: |
|
||||
For GFF3 input/output, try to preserve comments.
|
||||
- name: --process_other
|
||||
alternatives: -O
|
||||
type: boolean_true
|
||||
description: |
|
||||
Process other non-transcript GFF records (by default non-transcript records are ignored).
|
||||
- name: --rm_stop_codons
|
||||
alternatives: -V
|
||||
type: boolean_true
|
||||
description: |
|
||||
Discard any mRNAs with CDS having in-frame stop codons (requires --genome).
|
||||
- name: --adj_cds_start
|
||||
alternatives: -H
|
||||
type: boolean_true
|
||||
description: |
|
||||
For --rm_stop_codons option, check and adjust the starting CDS phase if the original phase
|
||||
leads to a translation with an in-frame stop codon.
|
||||
- name: --opposite_strand
|
||||
alternatives: -B
|
||||
type: boolean_true
|
||||
description: |
|
||||
For the --rm_stop_codons option, single-exon transcripts are also checked on the opposite strand (requires
|
||||
--genome).
|
||||
- name: --coding_status
|
||||
alternatives: -P
|
||||
type: boolean_true
|
||||
description: |
|
||||
Add transcript level GFF attributes about the coding status of each transcript, including
|
||||
partialness or in-frame stop codons (requires --genome).
|
||||
- name: --add_hasCDS
|
||||
type: boolean_true
|
||||
description: |
|
||||
Add a "hasCDS" attribute with value "true" for transcripts that have CDS features.
|
||||
- name: --adj_stop
|
||||
type: boolean_true
|
||||
description: |
|
||||
Stop codon adjustment: enables --coding_status and performs automatic adjustment of the CDS stop
|
||||
coordinate if premature or downstream.
|
||||
- name: --rm_noncanon
|
||||
alternatives: -N
|
||||
type: boolean_true
|
||||
description: |
|
||||
Discard multi-exon mRNAs that have any intron with a non-canonical splice site consensus
|
||||
(i.e. not GT-AG, GC-AG or AT-AC).
|
||||
- name: --complete_cds
|
||||
alternatives: -J
|
||||
type: boolean_true
|
||||
description: |
|
||||
Discard any mRNAs that either lack initial START codon or the terminal STOP codon, or
|
||||
have an in-frame stop codon (i.e. only print mRNAs with a complete CDS).
|
||||
- name: --no_pseudo
|
||||
type: boolean_true
|
||||
description: |
|
||||
Filter out records matching the 'pseudo' keyword.
|
||||
- name: --in_bed
|
||||
type: boolean_true
|
||||
description: |
|
||||
Input should be parsed as BED format (automatic if the input filename ends with .bed*).
|
||||
- name: --in_tlf
|
||||
type: boolean_true
|
||||
description: |
|
||||
Input GFF-like one-line-per-transcript format without exon/CDS features (see the --tlf
option); automatic if the input filename ends with .tlf.
|
||||
- name: --stream
|
||||
type: boolean_true
|
||||
description: |
|
||||
Fast processing of input GFF/BED transcripts as they are received (no sorting, exons must
|
||||
be grouped by transcript in the input data).
|
||||
|
||||
- name: Clustering
|
||||
arguments:
|
||||
- name: --merge
|
||||
alternatives: -M
|
||||
type: boolean_true
|
||||
description: |
|
||||
Cluster the input transcripts into loci, discarding "redundant" transcripts (those with
|
||||
the same exact introns and fully contained or equal boundaries).
|
||||
- name: --dupinfo
|
||||
alternatives: -d
|
||||
type: file
direction: output
|
||||
description: |
|
||||
For --merge option, write duplication info to file <dupinfo>.
|
||||
- name: --cluster_only
|
||||
type: boolean_true
|
||||
description: |
|
||||
Same as --merge but without discarding any of the "duplicate" transcripts, only create
|
||||
"locus" features.
|
||||
- name: --rm_redundant
|
||||
alternatives: -K
|
||||
type: boolean_true
|
||||
description: |
|
||||
For --merge option: also discard as redundant the shorter, fully contained transcripts (intron
|
||||
chains matching a part of the container).
|
||||
- name: --no_boundary
|
||||
alternatives: -Q
|
||||
type: boolean_true
|
||||
description: |
|
||||
For --merge option, no longer require boundary containment when assessing redundancy (can be
|
||||
combined with --rm_redundant); only introns have to match for multi-exon transcripts, and >=80%
|
||||
overlap for single-exon transcripts.
|
||||
- name: --no_overlap
|
||||
alternatives: -Y
|
||||
type: boolean_true
|
||||
description: |
|
||||
For --merge option, enforce --no_boundary but also discard overlapping single-exon transcripts,
|
||||
even on the opposite strand (can be combined with --rm_redundant).
|
||||
|
||||
resources:
|
||||
- type: bash_script
|
||||
path: script.sh
|
||||
test_resources:
|
||||
- type: bash_script
|
||||
path: test.sh
|
||||
- type: file
|
||||
path: test_data
|
||||
engines:
|
||||
- type: docker
|
||||
image: quay.io/biocontainers/gffread:0.12.7--hdcf5f25_3
|
||||
setup:
|
||||
- type: docker
|
||||
run: |
|
||||
echo "gffread: \"$(gffread --version 2>&1)\"" > /var/software_versions.txt
|
||||
runners:
|
||||
- type: executable
|
||||
- type: nextflow
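The many `boolean_true` arguments declared in this config reach the wrapper script as `par_*` variables holding the strings `true` or `false`, which is why the accompanying script unsets every variable equal to `false` before building the command line. A compact sketch of that idea using a loop and Bash indirect expansion; the variable values are illustrative and the loop is an assumed alternative, not the committed implementation:

```bash
#!/bin/bash

# Illustrative values; in a real run Viash sets these variables.
par_coding="true"
par_nc="false"
par_bed="false"

# Unset every flag whose value is "false" so that ${var:+...} drops it.
for var in par_coding par_nc par_bed; do
  if [[ "${!var}" == "false" ]]; then
    unset "$var"
  fi
done

# Only flags that are still set contribute an argument.
args=(${par_coding:+-C} ${par_nc:+--nc} ${par_bed:+--bed})
printf '%s\n' "${args[@]}"
```

Listing the variable names once in the loop keeps the unset step in one place, at the cost of having to keep that list in sync with the config.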
|
||||
140
src/gffread/help.txt
Normal file
@@ -0,0 +1,140 @@
|
||||
```sh
|
||||
gffread --help
|
||||
```
|
||||
|
||||
gffread v0.12.7. Usage:
|
||||
gffread [-g <genomic_seqs_fasta> | <dir>] [-s <seq_info.fsize>]
|
||||
[-o <outfile>] [-t <trackname>] [-r [<strand>]<chr>:<start>-<end> [-R]]
|
||||
[--jmatch <chr>:<start>-<end>] [--no-pseudo]
|
||||
[-CTVNJMKQAFPGUBHZWTOLE] [-w <exons.fa>] [-x <cds.fa>] [-y <tr_cds.fa>]
|
||||
[-j ][--ids <IDs.lst> | --nids <IDs.lst>] [--attrs <attr-list>] [-i <maxintron>]
|
||||
[--stream] [--bed | --gtf | --tlf] [--table <attrlist>] [--sort-by <ref.lst>]
|
||||
[<input_gff>]
|
||||
|
||||
Filter, convert or cluster GFF/GTF/BED records, extract the sequence of
|
||||
transcripts (exon or CDS) and more.
|
||||
By default (i.e. without -O) only transcripts are processed, discarding any
|
||||
other non-transcript features. Default output is a simplified GFF3 with only
|
||||
the basic attributes.
|
||||
|
||||
Options:
|
||||
--ids discard records/transcripts if their IDs are not listed in <IDs.lst>
|
||||
--nids discard records/transcripts if their IDs are listed in <IDs.lst>
|
||||
-i discard transcripts having an intron larger than <maxintron>
|
||||
-l discard transcripts shorter than <minlen> bases
|
||||
-r only show transcripts overlapping coordinate range <start>..<end>
|
||||
(on chromosome/contig <chr>, strand <strand> if provided)
|
||||
-R for -r option, discard all transcripts that are not fully
|
||||
contained within the given range
|
||||
--jmatch only output transcripts matching the given junction
|
||||
-U discard single-exon transcripts
|
||||
-C coding only: discard mRNAs that have no CDS features
|
||||
--nc non-coding only: discard mRNAs that have CDS features
|
||||
--ignore-locus : discard locus features and attributes found in the input
|
||||
-A use the description field from <seq_info.fsize> and add it
|
||||
as the value for a 'descr' attribute to the GFF record
|
||||
-s <seq_info.fsize> is a tab-delimited file providing this info
|
||||
for each of the mapped sequences:
|
||||
<seq-name> <seq-length> <seq-description>
|
||||
(useful for -A option with mRNA/EST/protein mappings)
|
||||
Sorting: (by default, chromosomes are kept in the order they were found)
|
||||
--sort-alpha : chromosomes (reference sequences) are sorted alphabetically
|
||||
--sort-by : sort the reference sequences by the order in which their
|
||||
names are given in the <refseq.lst> file
|
||||
Misc options:
|
||||
-F keep all GFF attributes (for non-exon features)
|
||||
--keep-exon-attrs : for -F option, do not attempt to reduce redundant
|
||||
exon/CDS attributes
|
||||
-G do not keep exon attributes, move them to the transcript feature
|
||||
(for GFF3 output)
|
||||
--attrs <attr-list> only output the GTF/GFF attributes listed in <attr-list>
|
||||
which is a comma delimited list of attribute names to
|
||||
--keep-genes : in transcript-only mode (default), also preserve gene records
|
||||
--keep-comments: for GFF3 input/output, try to preserve comments
|
||||
-O process other non-transcript GFF records (by default non-transcript
|
||||
records are ignored)
|
||||
-V discard any mRNAs with CDS having in-frame stop codons (requires -g)
|
||||
-H for -V option, check and adjust the starting CDS phase
|
||||
if the original phase leads to a translation with an
|
||||
in-frame stop codon
|
||||
-B for -V option, single-exon transcripts are also checked on the
|
||||
opposite strand (requires -g)
|
||||
-P add transcript level GFF attributes about the coding status of each
|
||||
transcript, including partialness or in-frame stop codons (requires -g)
|
||||
--add-hasCDS : add a "hasCDS" attribute with value "true" for transcripts
|
||||
that have CDS features
|
||||
--adj-stop stop codon adjustment: enables -P and performs automatic
|
||||
adjustment of the CDS stop coordinate if premature or downstream
|
||||
-N discard multi-exon mRNAs that have any intron with a non-canonical
|
||||
splice site consensus (i.e. not GT-AG, GC-AG or AT-AC)
|
||||
-J discard any mRNAs that either lack initial START codon
|
||||
or the terminal STOP codon, or have an in-frame stop codon
|
||||
(i.e. only print mRNAs with a complete CDS)
|
||||
--no-pseudo: filter out records matching the 'pseudo' keyword
|
||||
--in-bed: input should be parsed as BED format (automatic if the input
|
||||
filename ends with .bed*)
|
||||
--in-tlf: input GFF-like one-line-per-transcript format without exon/CDS
|
||||
features (see --tlf option below); automatic if the input
|
||||
filename ends with .tlf)
|
||||
--stream: fast processing of input GFF/BED transcripts as they are received
|
||||
((no sorting, exons must be grouped by transcript in the input data)
|
||||
Clustering:
|
||||
-M/--merge : cluster the input transcripts into loci, discarding
|
||||
"redundant" transcripts (those with the same exact introns
|
||||
and fully contained or equal boundaries)
|
||||
-d <dupinfo> : for -M option, write duplication info to file <dupinfo>
|
||||
--cluster-only: same as -M/--merge but without discarding any of the
|
||||
"duplicate" transcripts, only create "locus" features
|
||||
-K for -M option: also discard as redundant the shorter, fully contained
|
||||
transcripts (intron chains matching a part of the container)
|
||||
-Q for -M option, no longer require boundary containment when assessing
|
||||
redundancy (can be combined with -K); only introns have to match for
|
||||
multi-exon transcripts, and >=80% overlap for single-exon transcripts
|
||||
-Y for -M option, enforce -Q but also discard overlapping single-exon
|
||||
transcripts, even on the opposite strand (can be combined with -K)
|
||||
Output options:
|
||||
--force-exons: make sure that the lowest level GFF features are considered
|
||||
"exon" features
|
||||
--gene2exon: for single-line genes not parenting any transcripts, add an
|
||||
exon feature spanning the entire gene (treat it as a transcript)
|
||||
--t-adopt: try to find a parent gene overlapping/containing a transcript
|
||||
that does not have any explicit gene Parent
|
||||
-D decode url encoded characters within attributes
|
||||
-Z merge very close exons into a single exon (when intron size<4)
|
||||
-g full path to a multi-fasta file with the genomic sequences
|
||||
for all input mappings, OR a directory with single-fasta files
|
||||
(one per genomic sequence, with file names matching sequence names)
|
||||
-j output the junctions and the corresponding transcripts
|
||||
-w write a fasta file with spliced exons for each transcript
|
||||
--w-add <N> for the -w option, extract additional <N> bases
|
||||
both upstream and downstream of the transcript boundaries
|
||||
--w-nocds for -w, disable the output of CDS info in the FASTA file
|
||||
-x write a fasta file with spliced CDS for each GFF transcript
|
||||
-y write a protein fasta file with the translation of CDS for each record
|
||||
-W for -w, -x and -y options, write in the FASTA defline all the exon
|
||||
coordinates projected onto the spliced sequence;
|
||||
-S for -y option, use '*' instead of '.' as stop codon translation
|
||||
-L Ensembl GTF to GFF3 conversion, adds version to IDs
|
||||
-m <chr_replace> is a name mapping table for converting reference
|
||||
sequence names, having this 2-column format:
|
||||
<original_ref_ID> <new_ref_ID>
|
||||
-t use <trackname> in the 2nd column of each GFF/GTF output line
|
||||
-o write the output records into <outfile> instead of stdout
|
||||
-T main output will be GTF instead of GFF3
|
||||
--bed output records in BED format instead of default GFF3
|
||||
--tlf output "transcript line format" which is like GFF
|
||||
but with exons and CDS related features stored as GFF
|
||||
attributes in the transcript feature line, like this:
|
||||
exoncount=N;exons=<exons>;CDSphase=<N>;CDS=<CDScoords>
|
||||
<exons> is a comma-delimited list of exon_start-exon_end coordinates;
|
||||
<CDScoords> is CDS_start:CDS_end coordinates or a list like <exons>
|
||||
--table output a simple tab delimited format instead of GFF, with columns
|
||||
having the values of GFF attributes given in <attrlist>; special
|
||||
pseudo-attributes (prefixed by @) are recognized:
|
||||
@id, @geneid, @chr, @start, @end, @strand, @numexons, @exons,
|
||||
@cds, @covlen, @cdslen
|
||||
If any of -w/-y/-x FASTA output files are enabled, the same fields
|
||||
(excluding @id) are appended to the definition line of corresponding
|
||||
FASTA records
|
||||
-v,-E expose (warn about) duplicate transcript IDs and other potential
|
||||
problems with the given GFF/GTF records
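The usage line in this help text gives the `-r` argument as `[<strand>]<chr>:<start>-<end>`. The sketch below splits such a string into its parts to make the format concrete. `parse_range` is a hypothetical helper, and treating the strand as a single `+` or `-` character is an assumption; gffread performs its own parsing and may accept other forms.

```bash
#!/bin/bash

# Hypothetical parser for illustration only.
# Accepts [<strand>]<chr>:<start>-<end>, with strand assumed to be + or -.
parse_range() {
  local spec="$1"
  local re='^([+-]?)([^:]+):([0-9]+)-([0-9]+)$'
  [[ "$spec" =~ $re ]] || return 1
  range_strand="${BASH_REMATCH[1]}"
  range_chr="${BASH_REMATCH[2]}"
  range_start="${BASH_REMATCH[3]}"
  range_end="${BASH_REMATCH[4]}"
}

parse_range "+chr1:1000-2000" && first="$range_strand|$range_chr|$range_start|$range_end"
parse_range "chr2:5-80" && second="$range_strand|$range_chr|$range_start|$range_end"
parse_range "chr2:5..80" || third="rejected"

printf '%s\n' "$first" "$second" "$third"
```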
|
||||
121
src/gffread/script.sh
Normal file
@@ -0,0 +1,121 @@
|
||||
#!/bin/bash
|
||||
|
||||
## VIASH START
|
||||
## VIASH END
|
||||
|
||||
# unset flags
|
||||
[[ "$par_coding" == "false" ]] && unset par_coding
|
||||
[[ "$par_strict_range" == "false" ]] && unset par_strict_range
|
||||
[[ "$par_no_single_exon" == "false" ]] && unset par_no_single_exon
|
||||
[[ "$par_no_exon_attrs" == "false" ]] && unset par_no_exon_attrs
|
||||
[[ "$par_nc" == "false" ]] && unset par_nc
|
||||
[[ "$par_ignore_locus" == "false" ]] && unset par_ignore_locus
|
||||
[[ "$par_description" == "false" ]] && unset par_description
|
||||
[[ "$par_sort_alpha" == "false" ]] && unset par_sort_alpha
|
||||
[[ "$par_keep_genes" == "false" ]] && unset par_keep_genes
|
||||
[[ "$par_keep_attrs" == "false" ]] && unset par_keep_attrs
|
||||
[[ "$par_keep_exon_attrs" == "false" ]] && unset par_keep_exon_attrs
|
||||
[[ "$par_keep_comments" == "false" ]] && unset par_keep_comments
|
||||
[[ "$par_process_other" == "false" ]] && unset par_process_other
|
||||
[[ "$par_rm_stop_codons" == "false" ]] && unset par_rm_stop_codons
|
||||
[[ "$par_adj_cds_start" == "false" ]] && unset par_adj_cds_start
|
||||
[[ "$par_opposite_strand" == "false" ]] && unset par_opposite_strand
|
||||
[[ "$par_coding_status" == "false" ]] && unset par_coding_status
|
||||
[[ "$par_add_hasCDS" == "false" ]] && unset par_add_hasCDS
|
||||
[[ "$par_adj_stop" == "false" ]] && unset par_adj_stop
|
||||
[[ "$par_rm_noncanon" == "false" ]] && unset par_rm_noncanon
|
||||
[[ "$par_complete_cds" == "false" ]] && unset par_complete_cds
|
||||
[[ "$par_no_pseudo" == "false" ]] && unset par_no_pseudo
|
||||
[[ "$par_in_bed" == "false" ]] && unset par_in_bed
|
||||
[[ "$par_in_tlf" == "false" ]] && unset par_in_tlf
|
||||
[[ "$par_stream" == "false" ]] && unset par_stream
|
||||
[[ "$par_merge" == "false" ]] && unset par_merge
|
||||
[[ "$par_rm_redundant" == "false" ]] && unset par_rm_redundant
|
||||
[[ "$par_no_boundary" == "false" ]] && unset par_no_boundary
|
||||
[[ "$par_no_overlap" == "false" ]] && unset par_no_overlap
|
||||
[[ "$par_force_exons" == "false" ]] && unset par_force_exons
|
||||
[[ "$par_gene2exon" == "false" ]] && unset par_gene2exon
|
||||
[[ "$par_t_adopt" == "false" ]] && unset par_t_adopt
|
||||
[[ "$par_decode" == "false" ]] && unset par_decode
|
||||
[[ "$par_merge_exons" == "false" ]] && unset par_merge_exons
|
||||
[[ "$par_junctions" == "false" ]] && unset par_junctions
|
||||
[[ "$par_w_nocds" == "false" ]] && unset par_w_nocds
|
||||
[[ "$par_tr_cds" == "false" ]] && unset par_tr_cds
|
||||
[[ "$par_w_coords" == "false" ]] && unset par_w_coords
|
||||
[[ "$par_stop_dot" == "false" ]] && unset par_stop_dot
|
||||
[[ "$par_id_version" == "false" ]] && unset par_id_version
|
||||
[[ "$par_gtf_output" == "false" ]] && unset par_gtf_output
|
||||
[[ "$par_bed" == "false" ]] && unset par_bed
|
||||
[[ "$par_tlf" == "false" ]] && unset par_tlf
|
||||
[[ "$par_expose_dups" == "false" ]] && unset par_expose_dups
|
||||
[[ "$par_cluster_only" == "false" ]] && unset par_cluster_only
|
||||
|
||||
# replace ";" with "," in par_table (leaves an empty value empty)
|
||||
par_table=$(echo "$par_table" | tr ';' ',')
|
||||
|
||||
gffread \
|
||||
"$par_input" \
|
||||
${par_chr_mapping:+-m "$par_chr_mapping"} \
|
||||
${par_seq_info:+-s "$par_seq_info"} \
|
||||
-o "$par_outfile" \
|
||||
${par_force_exons:+--force-exons} \
|
||||
${par_gene2exon:+--gene2exon} \
|
||||
${par_t_adopt:+--t-adopt} \
|
||||
${par_decode:+-D} \
|
||||
${par_merge_exons:+-Z} \
|
||||
${par_genome:+-g "$par_genome"} \
|
||||
${par_junctions:+-j} \
|
||||
${par_spliced_exons:+-w "$par_spliced_exons"} \
|
||||
${par_w_add:+--w-add "$par_w_add"} \
|
||||
${par_w_nocds:+--w-nocds} \
|
||||
${par_spliced_cds:+-x "$par_spliced_cds"} \
|
||||
${par_tr_cds:+-y "$par_tr_cds"} \
|
||||
${par_w_coords:+-W} \
|
||||
${par_stop_dot:+-S} \
|
||||
${par_id_version:+-L} \
|
||||
${par_trackname:+-t "$par_trackname"} \
|
||||
${par_gtf_output:+-T} \
|
||||
${par_bed:+--bed} \
|
||||
${par_tlf:+--tlf} \
|
||||
${par_table:+--table "$par_table"} \
|
||||
${par_expose_dups:+-E} \
|
||||
${par_ids:+--ids "$par_ids"} \
|
||||
${par_nids:+--nids "$par_nids"} \
|
||||
${par_maxintron:+-i "$par_maxintron"} \
|
||||
${par_minlen:+-l "$par_minlen"} \
|
||||
${par_range:+-r "$par_range"} \
|
||||
${par_strict_range:+-R} \
|
||||
${par_jmatch:+--jmatch "$par_jmatch"} \
|
||||
${par_no_single_exon:+-U} \
|
||||
${par_coding:+-C} \
|
||||
${par_nc:+--nc} \
|
||||
${par_ignore_locus:+--ignore-locus} \
|
||||
${par_description:+-A} \
|
||||
${par_sort_alpha:+--sort-alpha} \
|
||||
${par_sort_by:+--sort-by "$par_sort_by"} \
|
||||
${par_keep_attrs:+-F} \
|
||||
${par_keep_exon_attrs:+--keep-exon-attrs} \
|
||||
${par_no_exon_attrs:+-G} \
|
||||
${par_attrs:+--attrs "$par_attrs"} \
|
||||
${par_keep_genes:+--keep-genes} \
|
||||
${par_keep_comments:+--keep-comments} \
|
||||
${par_process_other:+-O} \
|
||||
${par_rm_stop_codons:+-V} \
|
||||
${par_adj_cds_start:+-H} \
|
||||
${par_opposite_strand:+-B} \
|
||||
${par_coding_status:+-P} \
|
||||
${par_add_hasCDS:+--add-hasCDS} \
|
||||
${par_adj_stop:+--adj-stop} \
|
||||
${par_rm_noncanon:+-N} \
|
||||
${par_complete_cds:+-J} \
|
||||
${par_no_pseudo:+--no-pseudo} \
|
||||
${par_in_bed:+--in-bed} \
|
||||
${par_in_tlf:+--in-tlf} \
|
||||
${par_stream:+--stream} \
|
||||
${par_merge:+-M} \
|
||||
${par_dupinfo:+-d "$par_dupinfo"} \
|
||||
${par_cluster_only:+--cluster-only} \
|
||||
${par_rm_redundant:+-K} \
|
||||
${par_no_boundary:+-Q} \
|
||||
${par_no_overlap:+-Y}
|
||||
|
||||
111
src/gffread/test.sh
Executable file
@@ -0,0 +1,111 @@
|
||||
#!/bin/bash
|
||||
|
||||
## VIASH START
|
||||
## VIASH END
|
||||
|
||||
set -e
|
||||
|
||||
test_output_dir="${meta_resources_dir}/test_data/test_output"
|
||||
test_dir="${meta_resources_dir}/test_data"
|
||||
expected_output_dir="${meta_resources_dir}/test_data/output"
|
||||
|
||||
mkdir -p "$test_output_dir"
|
||||
|
||||
|
||||
################################################################################
|
||||
|
||||
echo "> Test 1 - Read annotation file, output GFF"
|
||||
|
||||
"$meta_executable" \
|
||||
--expose_dups \
|
||||
--outfile "$test_output_dir/ann_simple.gff" \
|
||||
--input "$test_dir/sequence.gff3"
|
||||
|
||||
|
||||
echo ">> Check if output exists"
|
||||
[ ! -f "$test_output_dir/ann_simple.gff" ] \
|
||||
&& echo "Output file test_output/ann_simple.gff does not exist" && exit 1
|
||||
|
||||
echo ">> Check if output is empty"
|
||||
[ ! -s "$test_output_dir/ann_simple.gff" ] \
|
||||
&& echo "Output file test_output/ann_simple.gff is empty" && exit 1
|
||||
|
||||
echo ">> Compare output to expected output"
|
||||
|
||||
# compare files, except for lines starting with "#"
|
||||
diff <(grep -v "^#" "$expected_output_dir/ann_simple.gff") \
|
||||
<(grep -v "^#" "$test_output_dir/ann_simple.gff") || \
|
||||
(echo "Output file ann_simple.gff does not match expected output" && exit 1)
|
||||
|
||||
################################################################################
|
||||
|
||||
echo "> Test 2 - Read annotation file, output GTF"
|
||||
|
||||
"$meta_executable" \
|
||||
--gtf_output \
|
||||
--outfile "$test_output_dir/annotation.gtf" \
|
||||
--input "$test_dir/sequence.gff3"
|
||||
|
||||
echo ">> Check if output exists"
|
||||
[ ! -f "$test_output_dir/annotation.gtf" ] \
|
||||
&& echo "Output file test_output/annotation.gtf does not exist" && exit 1
|
||||
|
||||
echo ">> Check if output is empty"
|
||||
[ ! -s "$test_output_dir/annotation.gtf" ] \
|
||||
&& echo "Output file test_output/annotation.gtf is empty" && exit 1
|
||||
|
||||
echo ">> Compare output to expected output"
|
||||
diff "$expected_output_dir/annotation.gtf" "$test_output_dir/annotation.gtf" || \
|
||||
(echo "Output file annotation.gtf does not match expected output" && exit 1)
|
||||
|
||||
################################################################################
|
||||
|
||||
echo "> Test 3 - Generate fasta file from annotation file"
|
||||
|
||||
|
||||
"$meta_executable" \
|
||||
--genome "$test_dir/sequence.fasta" \
|
||||
--spliced_exons "$test_output_dir/transcripts.fa" \
|
||||
--outfile "$test_output_dir/output.gff" \
|
||||
--input "$test_dir/sequence.gff3"
|
||||
|
||||
echo ">> Check if output exists"
|
||||
[ ! -f "$test_output_dir/transcripts.fa" ] \
|
||||
&& echo "Output file transcripts.fa does not exist" && exit 1
|
||||
|
||||
echo ">> Check if output is empty"
|
||||
[ ! -s "$test_output_dir/transcripts.fa" ] \
|
||||
&& echo "Output file transcripts.fa is empty" && exit 1
|
||||
|
||||
echo ">> Compare output to expected output"
|
||||
diff "$expected_output_dir/transcripts.fa" "$test_output_dir/transcripts.fa" || \
|
||||
(echo "Output file transcripts.fa does not match expected output" && exit 1)
|
||||
|
||||
################################################################################
|
||||
|
||||
echo "> Test 4 - Generate table from GFF annotation file"
|
||||
|
||||
"$meta_executable" \
|
||||
--table "@id;@chr;@start;@end;@strand;@exons;Name;gene;product" \
|
||||
--outfile "$test_output_dir/annotation.tbl" \
|
||||
--input "$test_dir/sequence.gff3"
|
||||
|
||||
echo ">> Check if output exists"
|
||||
[ ! -f "$test_output_dir/annotation.tbl" ] \
|
||||
&& echo "Output file test_output/annotation.tbl does not exist" && exit 1
|
||||
|
||||
echo ">> Check if output is empty"
|
||||
[ ! -s "$test_output_dir/annotation.tbl" ] \
|
||||
&& echo "Output file test_output/annotation.tbl is empty" && exit 1
|
||||
|
||||
echo ">> Compare output to expected output"
|
||||
diff "$expected_output_dir/annotation.tbl" "$test_output_dir/annotation.tbl" || \
|
||||
(echo "Output file annotation.tbl does not match expected output" && exit 1)
|
||||
|
||||
################################################################################
|
||||
|
||||
rm -r "$test_output_dir"
|
||||
|
||||
echo "> All tests successful"
|
||||
|
||||
exit 0
|
||||
38
src/gffread/test_data/README.md
Normal file
@@ -0,0 +1,38 @@
|
||||
## GffRead usage examples
|
||||
|
||||
GffRead can be used to simply read an annotation file in a GFF format, and print it in either GFF3 (default) or
|
||||
GTF2 format (with the -T option), while discarding any non-transcript features and optional attributes.
|
||||
It can also report some potential issues found in the input GFF records. The command line for such a quick GFF/GTF
|
||||
file cleanup would be:
|
||||
```
|
||||
gffread -E annotation.gff -o ann_simple.gff
|
||||
```
|
||||
|
||||
This will create a minimalist GFF3 re-formatting of the transcript records found in the input file (`annotation.gff` in this example).
|
||||
The -E option directs GffRead to "expose" (display warnings about) any potential formatting issues
|
||||
encountered while parsing the input file.
|
||||
|
||||
In order to obtain the GTF2 version of the same transcript records, the `-T` option should be added:
|
||||
```
|
||||
gffread annotation.gff -T -o annotation.gtf
|
||||
```
|
||||
|
||||
GffRead can be used to generate a FASTA file with the DNA sequences for all transcripts in a GFF file. For this operation
|
||||
a FASTA file with the genomic sequences has to be provided as well. This can be accomplished with a command line like this:
|
||||
```
|
||||
gffread -w transcripts.fa -g genome.fa annotation.gff
|
||||
```
|
||||
The file `genome.fa` in this example would be a multi-fasta file with the chromosome/contig sequences of the target genome.
|
||||
Every contig or chromosome name found in the 1st column of the input GFF file
|
||||
(`annotation.gff` in this example) must have a corresponding sequence entry in the `genome.fa` file.
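
One way to check this requirement before running GffRead is sketched below. It is only an illustration: `missing_contigs` is a hypothetical helper (not part of GffRead), and it assumes a tab-separated GFF file and FASTA headers whose sequence name is the first word after `>`.

```shell
# Hypothetical helper (not part of GffRead): print every sequence name that is
# used in column 1 of a GFF file but has no record in a FASTA file.
# Usage: missing_contigs annotation.gff genome.fa
missing_contigs() {
  awk -F'\t' '
    # First file (the FASTA): remember the first word of each ">" header line.
    FILENAME == ARGV[1] {
      if (/^>/) { name = substr($0, 2); sub(/[ \t].*/, "", name); known[name] = 1 }
      next
    }
    # Second file (the GFF): skip comments and lines without tab-separated fields.
    /^#/ || NF < 2 { next }
    # Report each unknown sequence name once.
    !($1 in known) && !reported[$1]++ { print $1 }
  ' "$2" "$1"
}
```

Calling `missing_contigs annotation.gff genome.fa` prints nothing when every name in the GFF file has a matching FASTA record, so an empty result suggests the `-w` and `-g` example above can proceed.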
|
||||
|
||||
|
||||
```
|
||||
gffread --table @id,@chr,@start,@end,@strand,@exons,Name,gene,product \
|
||||
-o annotation.tbl annotation.gff
|
||||
```
|
||||
The command above shows how the `--table` option can produce a tab-delimited table from a GFF3 input.
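
The component's test script in this commit passes the same column list separated by `;`, and the wrapper script replaces `;` with `,` before calling `gffread`. A minimal sketch of that normalization step follows; the function name `normalize_table_columns` is hypothetical and only used here for illustration.

```shell
# Hypothetical helper mirroring the wrapper's `tr` step: turn a ";"-separated
# column list into the ","-separated form used with `gffread --table`.
normalize_table_columns() {
  printf '%s' "$1" | tr ';' ','
}
```

For example, `normalize_table_columns '@id;@chr;@start'` yields `@id,@chr,@start`, which matches the format of the `--table` argument shown above.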
|
||||
|
||||
The `output` directory contains all the output files that should be generated by the above examples.
|
||||
|
||||
|
||||
Some files were not shown because too many files have changed in this diff