mobilenetv2-yolov3 by fsx950223

Object detection with efficient backbones

Created 6 years ago

294 stars

Top 90.1% on SourcePulse

Project Summary

This project provides TensorFlow 2 implementations of the YOLOv3 object detection model built on lightweight backbones such as MobileNetV2 and EfficientNet. It targets researchers and developers who want object detection that balances accuracy against computational cost. The lighter backbones reduce inference time and make the models better candidates for edge deployment.

How It Works

The core of the project is the YOLOv3 detection framework, with the usual Darknet-53 backbone replaced by either MobileNetV2 or EfficientNet. This substitution is the main source of the efficiency gain. The implementation uses TensorFlow 2 and its tf.data pipelines for data loading and preprocessing. Supported loss functions include MSE, GIoU, and adversarial loss, and training can use a cosine learning rate schedule and auto-augmentation.
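To make the GIoU loss mentioned above concrete, here is a minimal pure-Python sketch of generalized IoU for two axis-aligned boxes. It is an illustration of the general technique, not the project's own code: the function names `giou` and `giou_loss` and the `(x1, y1, x2, y2)` box layout are assumptions, and the repository's TensorFlow implementation may differ in detail.

```python
def giou(box_a, box_b):
    """Generalized IoU of two boxes given as (x1, y1, x2, y2).

    Hypothetical helper for illustration; assumes x1 <= x2 and y1 <= y2.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    area_a = max(0.0, ax2 - ax1) * max(0.0, ay2 - ay1)
    area_b = max(0.0, bx2 - bx1) * max(0.0, by2 - by1)

    # Intersection rectangle (zero area if the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = area_a + area_b - inter

    # Smallest axis-aligned box enclosing both inputs.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    enclose = enclose_w * enclose_h

    if union <= 0.0 or enclose <= 0.0:
        return 0.0  # degenerate boxes: fall back to a neutral value

    iou = inter / union
    # GIoU subtracts the fraction of the enclosing box not covered by the union.
    return iou - (enclose - union) / enclose


def giou_loss(box_a, box_b):
    """Loss form commonly used for box regression: 1 - GIoU."""
    return 1.0 - giou(box_a, box_b)
```

Unlike plain IoU, GIoU stays informative when the boxes do not overlap: it becomes negative and moves toward -1 as the boxes move apart, so the loss still provides a training signal.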

Quick Start & Requirements

Install the dependencies with pip install -r requirements.txt. The project requires TensorFlow 2 and training data formatted as TFRecords. To start training, run python main.py --mode=TRAIN --train_dataset_glob=<your dataset glob> --epochs=50. To run inference on images, use python main.py --mode=IMAGE --model=<your_model_path>. The project also supports exporting models for TensorFlow Serving. Pre-trained models for the VOC2007 and VOC2007+2012 datasets are available for download. A live demo using TensorFlow.js is accessible at https://fsx950223.github.io/mobilenetv2-yolov3/tfjs/.
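Because training is driven by a dataset glob, it can help to check that the pattern matches the expected TFRecord files before starting a long run. The sketch below uses only the Python standard library; the helper name `matching_tfrecords` and the example pattern are hypothetical and are not part of the project.

```python
import glob


def matching_tfrecords(pattern):
    """Return the sorted list of files matching a dataset glob pattern.

    Hypothetical pre-flight check; raises if nothing matches so that a
    mistyped pattern is caught before training starts.
    """
    files = sorted(glob.glob(pattern))
    if not files:
        raise FileNotFoundError(f"no files match {pattern!r}")
    return files


if __name__ == "__main__":
    # Example pattern only; substitute the value you pass to --train_dataset_glob.
    for path in matching_tfrecords("data/*.tfrecord"):
        print(path)
```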

Highlighted Details

Performance: The MobileNetV2-Yolov3 model (416x416 input) achieves a mAP of 0.6696 on VOC2007, with GPU inference at 19ms (GTX1080Ti) and a model size of 37M.
Performance: The EfficientNet-Yolov3 model (380x380 input) trained on VOC2007+2012 yields a mAP of 0.7689, with GPU inference at 23ms and a model size of 77M.
Features: Supports TensorFlow 2, tf.data pipelines, Multi-GPU training, TensorRT optimization, and TensorFlow Serving integration with Python and Java clients.
Training: Incorporates cosine learning rate scheduling and Auto Augment for improved training stability and performance.
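As an illustration of the cosine learning rate scheduling listed above, the following sketch computes a standard cosine decay from a base rate down to a minimum rate. The function name `cosine_lr`, its parameters, and the absence of warmup or restarts are assumptions made for this example; the project's actual schedule may be configured differently.

```python
import math


def cosine_lr(step, total_steps, base_lr, min_lr=0.0):
    """Cosine-decayed learning rate at a given training step.

    Hypothetical helper: decays from base_lr at step 0 to min_lr at
    total_steps, following half a cosine period.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp so steps outside the schedule hold the boundary values.
    step = min(max(step, 0), total_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * step / total_steps))
    return min_lr + (base_lr - min_lr) * cosine


if __name__ == "__main__":
    for s in (0, 25, 50, 75, 100):
        print(s, cosine_lr(s, total_steps=100, base_lr=1e-3))
```

The rate falls slowly at first, fastest around the midpoint, and slowly again near the end, which avoids the abrupt drops of a step schedule.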

Maintenance & Community

The README provides no details on maintainers, community channels (such as Discord or Slack), or a project roadmap.

Licensing & Compatibility

The README does not specify a software license, which may impact commercial use or integration into closed-source projects.

Limitations & Caveats

Support for TPUs is not implemented. The functionality to convert models to TensorFlow Lite format is also missing. TensorBoard external callback integration is not yet available.

Health Check

Last Commit

3 years ago

Responsiveness

Inactive

Star History

0 stars in the last 30 days