VisioFirm by OschAI

AI-powered annotation tool for accelerating computer vision workflows

Created 2 months ago · 373 stars · Top 75.9% on SourcePulse

Project Summary

VisioFirm is an open-source, AI-assisted annotation tool designed to accelerate computer vision dataset labeling. It targets researchers, data scientists, and ML engineers working with large image and video datasets, claiming time savings of up to 80% through semi-automated pre-annotation and label propagation. The tool streamlines workflows with a web interface, a FastAPI backend, and support for multiple annotation types and popular model formats.

How It Works

VisioFirm leverages state-of-the-art AI models for pre-annotation, including OpenAI CLIP for classification, SAM2 for segmentation, YOLO (v5-v12) for detection, and Grounding DINO for zero-shot object grounding. For video annotation, it offers label propagation via a SAM2-powered "SmartPropagator" or various OpenCV trackers, enabling frame-to-frame consistency. The system supports cross-domain annotation, allowing detection models to seed segmentation masks and vice versa. The backend has been migrated to FastAPI for improved performance, and a Python API is provided for pipeline integration.
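
To illustrate the detection side of this pre-annotation pass, the minimal sketch below runs an Ultralytics YOLO detector over a folder of images and prints candidate boxes that would serve as draft labels. It uses the Ultralytics API directly and is not VisioFirm's own Python API, whose exact interface is not documented here; the checkpoint name, source folder, and confidence threshold are placeholder assumptions.

    # Sketch of a detection pre-annotation pass using the Ultralytics API.
    # "yolov8n.pt", "images/", and conf=0.25 are placeholder assumptions.
    from ultralytics import YOLO

    model = YOLO("yolov8n.pt")  # any supported YOLO checkpoint (v5-v12)

    # Each result carries candidate boxes that a human annotator
    # would then confirm or correct in the annotation UI.
    results = model.predict(source="images/", conf=0.25)

    for result in results:
        for box in result.boxes:
            cls_name = model.names[int(box.cls)]
            x1, y1, x2, y2 = box.xyxy[0].tolist()
            print(f"{result.path}: {cls_name} at ({x1:.0f}, {y1:.0f}, {x2:.0f}, {y2:.0f})")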

Quick Start & Requirements

  • Installation: pip install -U visiofirm
  • Development Install: Clone the repository (git clone https://github.com/OschAI/VisioFirm.git), navigate to the directory, and run pip install -e .
  • Launch: Execute visiofirm in the terminal.
  • Prerequisites: Python 3.10+. For v1, users must clear or rename existing cache directories (~/.cache/visiofirm_cache, ~/Library/Caches/visiofirm_cache, or %LOCALAPPDATA%\visiofirm_cache) before first run to avoid conflicts; a helper sketch follows this list.
  • Links: GitHub Repository (https://github.com/OschAI/VisioFirm)
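
The snippet below is a minimal sketch of the v1 cache-reset prerequisite: it renames any existing VisioFirm cache directory at the default locations listed above, preserving the old cache with a ".bak" suffix rather than deleting it. Adjust the paths if your system uses a non-default cache location.

    # Rename existing VisioFirm cache directories before the first v1 run.
    import os
    from pathlib import Path

    # Candidate cache locations, as listed in the prerequisites above.
    candidates = [
        Path.home() / ".cache" / "visiofirm_cache",              # Linux
        Path.home() / "Library" / "Caches" / "visiofirm_cache",  # macOS
    ]
    local_app_data = os.environ.get("LOCALAPPDATA")              # Windows
    if local_app_data:
        candidates.append(Path(local_app_data) / "visiofirm_cache")

    for cache_dir in candidates:
        if cache_dir.is_dir():
            backup = cache_dir.with_name(cache_dir.name + ".bak")
            cache_dir.rename(backup)
            print(f"Renamed {cache_dir} -> {backup}")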

Highlighted Details

  • AI-Driven Pre-Annotation: Utilizes YOLO, SAM2, Grounding DINO, and CLIP to automate object detection, segmentation, and classification, claiming up to 80% manual effort reduction.
  • Video Annotation & Label Propagation: Features a SAM2-based SmartPropagator and multiple OpenCV trackers for efficient video labeling; a tracker-based sketch follows this list.
  • Broad Model Support: Integrates with Ultralytics YOLO models (v5-v12) and YOLOv8-world for open-vocabulary pre-annotation.
  • Cross-Domain Annotation: Enables using detection models for segmentation pre-labeling and vice-versa.
  • WebGPU Annotation: Offers interactive, browser-based SAM2 segmentation via WebGPU, accelerating inference directly in the browser.
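
Tracker-based label propagation boils down to seeding a box on one frame and letting a tracker carry it forward frame by frame. The sketch below illustrates that general technique with OpenCV's CSRT tracker; it is not VisioFirm's internal implementation (which also offers the SAM2-based SmartPropagator), and the video path and initial box are placeholder values. It requires opencv-contrib-python.

    # Propagate one annotated box across a video with an OpenCV tracker.
    import cv2

    cap = cv2.VideoCapture("video.mp4")   # placeholder path
    ok, frame = cap.read()
    if not ok:
        raise RuntimeError("Could not read the first frame")

    init_box = (100, 100, 80, 120)        # (x, y, w, h) drawn by the annotator
    tracker = cv2.TrackerCSRT_create()
    tracker.init(frame, init_box)

    frame_idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frame_idx += 1
        found, box = tracker.update(frame)
        if found:
            x, y, w, h = map(int, box)
            print(f"frame {frame_idx}: box=({x}, {y}, {w}, {h})")  # propagated label
        else:
            print(f"frame {frame_idx}: track lost, re-annotate here")
    cap.release()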

Maintenance & Community

The project is maintained by Safouane El Ghazouali. Bug reports and feature requests can be submitted via the GitHub Issues page. A Discord community and a documentation website are planned for the future.

Licensing & Compatibility

VisioFirm itself is licensed under Apache 2.0. However, it integrates third-party models under different licenses: Ultralytics YOLO is AGPL-3.0, while SAM2 and Grounding DINO are released under Apache 2.0 and BSD 3-Clause licenses. The AGPL-3.0 license of Ultralytics YOLO may impose copyleft obligations on derivative works or linked applications, which can affect closed-source commercial use.

Limitations & Caveats

Official documentation and community support (Discord) are listed as "SOON". The v1 release requires manual cache directory management to ensure proper initialization. The AGPL-3.0 license of a key dependency (Ultralytics YOLO) may present compatibility challenges for certain commercial or closed-source applications.

Health Check

  • Last Commit: 1 month ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 8

Star History

41 stars in the last 30 days
