YOLO-Master by Tencent

Real-time object detection accelerated by adaptive computation

Created 2 months ago

369 stars

Top 76.9% on SourcePulse

Project Summary

YOLO-Master tackles Real-Time Object Detection (RTOD) inefficiencies by replacing static computation with instance-conditional adaptive processing. Targeting researchers and engineers, it employs Mixture-of-Experts (MoE) to dynamically allocate resources, yielding superior accuracy-speed trade-offs, especially on complex scenes.

How It Works

The core is the Efficient Sparse MoE (ES-MoE) block within a YOLO-like framework. A lightweight dynamic routing network guides expert specialization during training and selects relevant experts during inference. This "compute-on-demand" approach optimizes resource allocation per input, enhancing detection performance while minimizing overhead.
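The "compute-on-demand" idea can be illustrated with a generic top-k routing sketch. This is not the project's ES-MoE implementation: the source does not describe its routing rule, so the softmax-plus-top-k gate below is a common stand-in, and every name here (route_top_k, sparse_moe_forward, the toy experts and router weights) is hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(features, router_weights, k=2):
    """Return [(expert_index, gate_weight), ...] for the k best-scoring experts.

    Assumption: the router is a single linear layer (one weight row per expert).
    """
    logits = [sum(w * x for w, x in zip(row, features)) for row in router_weights]
    probs = softmax(logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    # Renormalise so the gates of the selected experts sum to 1.
    return [(i, probs[i] / norm) for i in top]


def sparse_moe_forward(features, router_weights, experts, k=2):
    """Run only the selected experts and mix their outputs by gate weight."""
    selected = route_top_k(features, router_weights, k)
    out = [0.0] * len(features)
    for idx, gate in selected:
        expert_out = experts[idx](features)  # unselected experts are never called
        out = [o + gate * e for o, e in zip(out, expert_out)]
    return out, selected


# Toy setup (hypothetical): four experts that just scale their input.
experts = [lambda x, s=s: [s * v for v in x] for s in (0.5, 1.0, 2.0, 4.0)]
router_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]

output, selected = sparse_moe_forward([3.0, 1.0], router_weights, experts, k=2)
used = [idx for idx, _ in selected]
print("experts used:", used)
```

With k=2 out of four experts, only half of the expert computations run for this input, which is the source of the savings that sparse routing aims for.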

Quick Start & Requirements

Installation involves cloning the repo, setting up a Python 3.11 environment, and installing dependencies (pip install -r requirements.txt, then pip install -e .). FlashAttention, used to speed up training, requires CUDA. Key resources: GitHub repo (https://github.com/isLinXu/YOLO-Master), MoE module docs, and a Gradio demo (python app.py). GPU acceleration is recommended.

Highlighted Details

Achieves 42.4% AP with 1.62 ms latency on MS COCO, outperforming YOLOv13-N by +0.8% mAP while running inference 17.8% faster.
Shows significant gains on challenging dense scenes while maintaining efficiency.
Supports Object Detection, Classification, and experimental Instance Segmentation.
Features Sparse SAHI for accelerated small object detection and CW-NMS for mAP vs. speed tuning.
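For readers unfamiliar with sliced inference, the sketch below shows only the generic tiling step that SAHI-style methods build on: cutting a large image into overlapping windows so small objects occupy more of each detector input. How the "Sparse" variant chooses which tiles to process is not described in the source and is not modeled here; the function names, tile size, and overlap are illustrative assumptions.

```python
def tile_starts(length, tile, overlap):
    """Start offsets of tiles covering [0, length) with at least `overlap` shared pixels."""
    if tile >= length:
        return [0]
    stride = tile - overlap
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # last tile sits flush with the image edge
    return starts


def slice_boxes(width, height, tile=640, overlap=128):
    """Return (x1, y1, x2, y2) windows covering a width x height image."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")
    tile_w, tile_h = min(tile, width), min(tile, height)
    return [
        (x, y, x + tile_w, y + tile_h)
        for y in tile_starts(height, tile, overlap)
        for x in tile_starts(width, tile, overlap)
    ]


# Example: a 1280x720 frame cut into 640x640 windows.
boxes = slice_boxes(1280, 720)
for box in boxes:
    print(box)
```

Each window would be passed to the detector separately and the per-window detections merged afterwards, typically with a non-maximum suppression step.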

Maintenance & Community

Developed by Tencent Youtu Lab and Singapore Management University, the project welcomes community contributions via GitHub issues/PRs. Recent updates focus on performance enhancements and new features.

Licensing & Compatibility

Licensed under the GNU Affero General Public License v3.0 (AGPL-3.0). This strong copyleft license requires that derivative works, including modified versions offered to users over a network, make their source available under AGPL-3.0, which can restrict closed-source commercial integration.

Limitations & Caveats

Instance Segmentation and Pose Estimation are experimental. The AGPL-3.0 license imposes significant obligations on network-accessible services, so commercial adoption requires careful consideration.
