CLI tool for de-novo transposable element (TE) annotation and benchmarking
EDTA (Extensive de novo TE Annotator) is a comprehensive pipeline for automated, de novo transposable element (TE) annotation across whole genomes. It is designed for researchers and bioinformaticians needing to generate high-quality, non-redundant TE libraries and perform accurate genome-wide TE annotations, offering benchmarking capabilities for new TE libraries and methods.
How It Works
EDTA integrates multiple TE detection tools (e.g., RepeatModeler, LTR_FINDER, HelitronScanner) and employs a multi-step process to filter raw TE candidates, reduce redundancy, and classify TEs. It leverages curated TE libraries and optional CDS information to refine annotations, minimize false positives, and improve the accuracy of TE identification, particularly for under-annotated TE types. The pipeline can also generate masked genome files that exclude TEs from gene annotation regions to improve gene prediction quality.
Quick Start & Requirements
Install: conda/mamba (mamba install -c conda-forge -c bioconda edta) or Singularity/Docker for HPC/macOS users. A yml file installation is also available.
Usage: perl EDTA.pl --genome genome.fa [options]
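As a minimal sketch of what a fuller invocation might look like, the helper below only prints the command rather than running it, so it can be inspected before a long job is launched. The function name edta_cmd, the input file name, and the thread count are illustrative assumptions. The options --species, --step, --anno, and --threads are believed to be among those listed by the EDTA.pl help text, but should be verified against the installed version.

```shell
# Dry-run helper (hypothetical name): prints an EDTA command line instead of
# executing it. Check every option against `perl EDTA.pl -h` before use.
edta_cmd() {
  genome="$1"          # path to the genome FASTA (assumed input name)
  threads="${2:-4}"    # thread count, defaulting to 4 for this sketch
  echo "perl EDTA.pl --genome $genome --species others --step all --anno 1 --threads $threads"
}

# Show the command that would be run for genome.fa with 8 threads.
edta_cmd genome.fa 8
```

To actually launch the run, the printed line can be copied into a job script once the options have been confirmed.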
Highlighted Details
A benchmarking script (lib-test.pl) for comparing TE annotation performance.
panEDTA functionality for pan-genome TE analysis.
Maintenance & Community
The project is actively developed by the Ou lab at Ohio State University, Deng's Bioinformatics Engineering Team, and Joseph Guhlin's lab. Community support is available via GitHub Issues.
Licensing & Compatibility
The README does not explicitly state a license. Dependencies such as RepeatMasker and RepeatModeler carry their own license terms, which should be reviewed before any commercial use.
Limitations & Caveats
Sequence names must be short (<=13 characters) and simple. Docker usage has specific path limitations. The pipeline can be resource-intensive, particularly the RepeatMasker and RepeatModeler steps.
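The sequence-name constraint can be checked before a run with a small pre-flight script. The sketch below is not part of EDTA. The function name check_fasta_ids is hypothetical, and it assumes that "simple" means letters, digits, and underscores only, and that the limit applies to the ID (the header text up to the first whitespace).

```shell
# Pre-flight check (hypothetical helper, not an EDTA tool): print every FASTA
# sequence ID that is longer than 13 characters or contains characters other
# than letters, digits, and underscores (an assumed reading of "simple").
check_fasta_ids() {
  awk '/^>/ {
         id = substr($1, 2)   # drop the leading ">"; $1 stops at whitespace
         if (length(id) > 13 || id !~ /^[A-Za-z0-9_]+$/) print id
       }' "$1"
}
```

Running check_fasta_ids genome.fa prints nothing when every ID passes. Any IDs it does print would need renaming before the genome is given to the pipeline.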