Curated list of open-source data annotation/labeling tools
This repository is a curated list of open-source data annotation and labeling tools, categorized by data modality (text, images, audio, video, time series, multi-modal). It aims to help machine learning practitioners discover and evaluate tools that fit their MLOps workflows, particularly for data-centric approaches.
How It Works
The project functions as a community-driven directory. Tools are included based on three core criteria: an open-source license, active maintenance, and fitness for purpose. Entries are grouped by data modality so that users can discover and compare tools for a given annotation or labeling task.
Quick Start & Requirements
This is a curated list, not a software package. To use the tools, refer to their individual project pages.
Maintenance & Community
The list is maintained by ZenML and welcomes community contributions via Pull Requests. Users are encouraged to join the ZenML Slack for discussions and potential collaborations on MLOps integrations.
Licensing & Compatibility
The repository itself is not licensed as software. The tools listed have various licenses, including Apache-2, MIT, BSD, GPL-3, AGPL-3, ELv2, Custom, and Unknown. Compatibility for commercial use depends on the specific license of each tool.
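As a rough illustration of how a reader might triage entries by license before checking commercial-use terms, the sketch below groups tools into coarse license families. The tool names are hypothetical placeholders, and the family labels (`permissive`, `copyleft`, `source-available`, `needs-review`) are an assumption made for this example rather than categories used by the list. It is a first-pass filter, not legal guidance.

```python
from collections import defaultdict

# License labels as they appear in the list, mapped to a coarse family.
# The family names are an assumption made for this sketch.
LICENSE_FAMILY = {
    "Apache-2": "permissive",
    "MIT": "permissive",
    "BSD": "permissive",
    "GPL-3": "copyleft",
    "AGPL-3": "copyleft",
    "ELv2": "source-available",
}


def license_family(label: str) -> str:
    """Return the coarse family for a license label.

    Anything not in the table ("Custom", "Unknown", "N/A", ...) is
    flagged for manual review instead of being guessed.
    """
    return LICENSE_FAMILY.get(label.strip(), "needs-review")


def group_by_family(tools):
    """Group (tool_name, license_label) pairs by license family."""
    groups = defaultdict(list)
    for name, label in tools:
        groups[license_family(label)].append(name)
    return dict(groups)


# Hypothetical entries; these are not real tools from the list.
tools = [
    ("tool-a", "MIT"),
    ("tool-b", "AGPL-3"),
    ("tool-c", "ELv2"),
    ("tool-d", "Unknown"),
    ("tool-e", "Custom"),
]

groups = group_by_family(tools)
for family, names in sorted(groups.items()):
    print(f"{family}: {', '.join(names)}")
```

Entries that land in `needs-review` still require reading the tool's own license file, and copyleft or source-available terms should be checked against the intended deployment.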
Limitations & Caveats
The list's quality and completeness depend on community contributions. Some tools are listed with an "Unknown" or "N/A" license, and a listed tool may no longer be actively maintained even though active maintenance is an inclusion criterion. The "Description" field is brief, so users need to visit individual project pages for detailed functionality.