X-AnyLabeling  by CVHub520

Annotation tool with AI assistance for labeling visual data

created 2 years ago
6,175 stars

Top 8.5% on sourcepulse

Project Summary

X-AnyLabeling is an AI-powered data annotation tool designed for visual data engineers and researchers. It streamlines the labeling process for images and videos across a wide range of tasks, including object detection, segmentation, tracking, and OCR, by integrating advanced models like Segment Anything (SAM) and Grounding-DINO for accelerated and semi-automatic annotation.

How It Works

The tool leverages a modular architecture that supports various annotation types and integrates multiple AI models for assisted labeling. It processes both images and videos, with GPU acceleration for faster inference. Users can import and export data in common formats (COCO, YOLO, MOT, etc.) and customize or integrate their own models, enabling flexible workflows for complex annotation needs.
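To make the format interchange concrete, here is a minimal sketch of one conversion such a workflow involves: turning a COCO-style box into a YOLO label line. It relies only on the publicly documented conventions of the two formats (COCO stores `[x_min, y_min, width, height]` in pixels; YOLO stores a class index followed by a normalized center and size). The function names `coco_bbox_to_yolo` and `yolo_line` are hypothetical and are not part of X-AnyLabeling.

```python
from typing import Sequence, Tuple


def coco_bbox_to_yolo(
    bbox: Sequence[float], img_w: int, img_h: int
) -> Tuple[float, float, float, float]:
    """Convert a COCO box [x_min, y_min, width, height] (pixels) to
    YOLO (x_center, y_center, width, height), each normalized to [0, 1]."""
    if img_w <= 0 or img_h <= 0:
        raise ValueError("image dimensions must be positive")
    x_min, y_min, w, h = bbox
    x_center = (x_min + w / 2) / img_w
    y_center = (y_min + h / 2) / img_h
    return (x_center, y_center, w / img_w, h / img_h)


def yolo_line(class_id: int, bbox: Sequence[float], img_w: int, img_h: int) -> str:
    """Format one annotation as a YOLO label line: 'class xc yc w h'."""
    xc, yc, w, h = coco_bbox_to_yolo(bbox, img_w, img_h)
    return f"{class_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"


if __name__ == "__main__":
    # A 200x100 box whose top-left corner is at (100, 50) in a 400x200 image.
    print(yolo_line(0, [100, 50, 200, 100], 400, 200))
```

The reverse direction multiplies by the image size and subtracts half the width and height from the center, which is why YOLO label files are only meaningful alongside the image dimensions.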

Quick Start & Requirements

  • Installation: Typically via pip install -r requirements.txt or Docker.
  • Prerequisites: Python 3.8+, PyTorch, and potentially CUDA for GPU acceleration. Specific model requirements vary.
  • Resources: GPU recommended for AI-assisted features.
  • Docs: Installation & Quickstart
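
Before installing, it can help to confirm that the prerequisites listed above are in place. The following is a small sketch, not a script shipped with the project: `check_environment` is a hypothetical helper that reports whether the interpreter meets the Python 3.8+ requirement, whether PyTorch is importable, and whether PyTorch can see a CUDA device.

```python
import importlib.util
import sys
from typing import Dict, Tuple


def check_environment(min_python: Tuple[int, int] = (3, 8)) -> Dict[str, bool]:
    """Report whether the interpreter and PyTorch/CUDA prerequisites are met."""
    report = {
        "python_ok": sys.version_info[:2] >= min_python,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        try:
            import torch

            report["cuda_available"] = bool(torch.cuda.is_available())
        except Exception:
            # A broken install can fail at import time; treat it as no CUDA.
            report["cuda_available"] = False
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'yes' if ok else 'no'}")
```

A result with `cuda_available` set to `False` does not prevent annotation; it only suggests that AI-assisted features would fall back to slower CPU inference.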

Highlighted Details

  • Supports over 20 AI models for tasks like object detection (RF-DETR, Grounding-DINO), segmentation (SAM variants), pose estimation, and OCR.
  • Includes multimodal chatbot support for image dataset annotation.
  • Offers advanced video annotation features, including tracking by detection (HBB, OBB) and instance segmentation.
  • Supports a wide array of annotation formats and types, from bounding boxes and polygons to key information extraction.
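
To illustrate what "tracking by detection" means for horizontal bounding boxes (HBB), the sketch below links each frame's detections to existing tracks by greedy IoU matching. This is a generic textbook-style association step under simplifying assumptions (no motion model, unmatched tracks are dropped), not the tracker X-AnyLabeling actually uses; `iou` and `associate` are hypothetical names.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def associate(
    tracks: Dict[int, Box],
    detections: List[Box],
    next_id: int,
    iou_threshold: float = 0.3,
) -> Tuple[Dict[int, Box], int]:
    """Greedily match detections to tracks by descending IoU.

    Returns the updated {track_id: box} map and the next unused track ID.
    Detections without a match start new tracks; tracks without a match
    are dropped (a simplification made for this sketch).
    """
    pairs = sorted(
        ((iou(box, det), tid, di)
         for tid, box in tracks.items()
         for di, det in enumerate(detections)),
        reverse=True,
    )
    det_to_track: Dict[int, int] = {}
    used_tracks = set()
    for score, tid, di in pairs:
        if score < iou_threshold:
            break  # remaining pairs overlap even less
        if tid in used_tracks or di in det_to_track:
            continue
        det_to_track[di] = tid
        used_tracks.add(tid)

    updated: Dict[int, Box] = {}
    for di, det in enumerate(detections):
        tid = det_to_track.get(di)
        if tid is None:
            tid, next_id = next_id, next_id + 1
        updated[tid] = det
    return updated, next_id
```

Calling `associate` once per frame, feeding the returned map and ID counter into the next call, keeps IDs stable for objects that overlap between consecutive frames. Oriented boxes (OBB) would need a rotated-polygon IoU in place of `iou`.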

Maintenance & Community

The project is actively maintained by CVHub. Community interaction is primarily through GitHub issues.

Licensing & Compatibility

  • License: GPL-3.0.
  • Compatibility: GPL-3.0 is a copyleft license, so distributing a derivative work generally requires releasing it under GPL-3.0 as well. This can complicate commercial or closed-source integrations.

Limitations & Caveats

Teams planning commercial use should review the GPL-3.0 terms before building on or redistributing the tool. The breadth of model support also suggests that each model may bring its own dependencies, which can make dependency management complex.

Health Check

  • Last commit: 19 hours ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 5
  • Issues (30d): 40
  • Star History: 814 stars in the last 90 days
