YOLO-Master by isLinXu

Real-time object detection accelerated by adaptive computation

Created 2 weeks ago

261 stars

Top 97.5% on SourcePulse

Summary

YOLO-Master addresses inefficiencies in real-time object detection (RTOD) by replacing static computation with instance-conditional adaptive processing. Aimed at researchers and engineers, it uses Mixture-of-Experts (MoE) to allocate compute dynamically per input, improving the accuracy-speed trade-off, especially on complex scenes.

How It Works

The core is the Efficient Sparse MoE (ES-MoE) block within a YOLO-like framework. A lightweight dynamic routing network guides expert specialization during training and selects relevant experts during inference. This "compute-on-demand" approach optimizes resource allocation per input, enhancing detection performance while minimizing overhead.
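
The routing idea can be illustrated with a minimal sketch. This is generic top-k softmax gating written in plain Python, not the project's ES-MoE block: the summary does not describe the router's architecture, so the linear gate, the toy scaling experts, and every name here (route_top_k, sparse_moe, make_scaler) are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_logits, k):
    """Pick the k highest-scoring experts and renormalize their weights."""
    probs = softmax(gate_logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:k]
    mass = sum(probs[i] for i in chosen)
    return chosen, [probs[i] / mass for i in chosen]


def sparse_moe(x, gate_weights, experts, k=2):
    """Run only the k routed experts on x and mix their outputs."""
    # Hypothetical linear gate: one dot-product score per expert.
    logits = [sum(w * v for w, v in zip(row, x)) for row in gate_weights]
    chosen, weights = route_top_k(logits, k)
    out = [0.0] * len(x)
    for idx, weight in zip(chosen, weights):
        y = experts[idx](x)  # experts that were not selected never execute
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out, chosen


def make_scaler(scale):
    """Toy expert that multiplies its input by a constant."""
    return lambda x: [scale * v for v in x]


experts = [make_scaler(s) for s in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[0.0, 0.0], [2.0, 0.0], [1.0, 0.0], [-1.0, 0.0]]
output, used = sparse_moe([1.0, 0.0], gate_weights, experts, k=2)
print(used)  # [1, 2]
```

With four experts and k=2, only half of the expert computations run for this input, which is the sense in which compute is allocated "on demand".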

Quick Start & Requirements

To install, clone the repo, set up a Python 3.11 environment, and install dependencies with pip install -r requirements.txt followed by pip install -e . (an editable install). FlashAttention, used for faster training, requires CUDA. Key resources: GitHub repo (https://github.com/isLinXu/YOLO-Master), MoE module docs, and a Gradio demo (python app.py). GPU acceleration is recommended.
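
Before installing, the two stated requirements (Python 3.11 and a CUDA-capable setup) can be checked with a small preflight script. This is a sketch using only the standard library and is not part of the repository; the function name preflight is hypothetical, and the presence of nvidia-smi on PATH is only a heuristic for an NVIDIA driver, not proof that CUDA is usable.

```python
import shutil
import sys

REQUIRED_PYTHON = (3, 11)  # version named in the quick-start text


def preflight(version=None, has_nvidia_smi=None):
    """Return a list of human-readable warnings; empty means no issues found."""
    if version is None:
        version = sys.version_info[:2]
    if has_nvidia_smi is None:
        # Assumption: an NVIDIA driver install puts nvidia-smi on PATH.
        has_nvidia_smi = shutil.which("nvidia-smi") is not None

    notes = []
    if tuple(version) != REQUIRED_PYTHON:
        notes.append(
            "Python %d.%d found; the summary specifies 3.11" % tuple(version)
        )
    if not has_nvidia_smi:
        notes.append("nvidia-smi not found; FlashAttention needs CUDA")
    return notes


if __name__ == "__main__":
    for note in preflight() or ["environment looks OK"]:
        print(note)
```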

Highlighted Details

  • Achieves 42.4% AP at 1.62 ms latency on MS COCO, outperforming YOLOv13-N by 0.8% mAP with 17.8% faster inference.
  • Shows significant gains on challenging dense scenes while maintaining efficiency.
  • Supports Object Detection, Classification, and experimental Instance Segmentation.
  • Features Sparse SAHI for accelerated small object detection and CW-NMS for mAP vs. speed tuning.
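
The headline numbers above do not state the YOLOv13-N baseline directly, but it can be back-derived with simple arithmetic. The sketch below is an inference from the listed figures, not a reported result: it assumes the 0.8% gain is in absolute AP points, and it shows both common readings of "17.8% faster" because the summary does not say which one is meant.

```python
def implied_baseline_latency(latency_ms, pct_faster):
    """Baseline latency implied by a 'pct_faster' claim, under two readings."""
    frac = pct_faster / 100.0
    return {
        # Reading 1: latency was reduced by pct_faster percent.
        "latency_reduction": latency_ms / (1.0 - frac),
        # Reading 2: throughput rose by pct_faster percent.
        "throughput_gain": latency_ms * (1.0 + frac),
    }


latency = implied_baseline_latency(1.62, 17.8)
baseline_ap = round(42.4 - 0.8, 1)  # assumes absolute AP points

print("Implied baseline AP: %.1f%%" % baseline_ap)
for reading, value in latency.items():
    print("Implied baseline latency (%s): %.2f ms" % (reading, value))
```

Under either reading the implied baseline latency lands a little under 2 ms, so the claimed saving is roughly 0.3 ms per image.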

Maintenance & Community

Developed by Tencent Youtu Lab and Singapore Management University, the project welcomes community contributions via GitHub issues/PRs. Recent updates focus on performance enhancements and new features.

Licensing & Compatibility

Licensed under the GNU Affero General Public License v3.0 (AGPL-3.0). This strong copyleft license requires that the source of modified versions be made available under AGPL-3.0, including when the software is only offered to users over a network, which can restrict closed-source commercial integration.

Limitations & Caveats

Instance Segmentation and Pose Estimation are experimental. The AGPL-3.0 license imposes significant obligations on network-accessible services and needs careful consideration before commercial adoption.

Health Check

  • Last commit: 2 days ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 16
  • Star history: 262 stars in the last 16 days
