nanoowl by NVIDIA-AI-IOT

OWL-ViT optimization project for real-time object detection

created 1 year ago
352 stars

Top 80.3% on sourcepulse

Project Summary

NanoOWL optimizes the OWL-ViT model for real-time inference on NVIDIA Jetson Orin platforms using NVIDIA TensorRT. It enables zero-shot object detection and classification via text prompts, and introduces a novel "tree detection" pipeline for nested detection and classification, targeting developers and researchers working with edge AI and computer vision on NVIDIA hardware.

How It Works

NanoOWL leverages NVIDIA TensorRT to optimize OWL-ViT for efficient execution on Jetson Orin devices. This optimization involves converting the model to a TensorRT engine, which significantly accelerates inference. The "tree detection" pipeline extends OWL-ViT's capabilities by combining it with CLIP: detections can be nested hierarchically according to structured text descriptions, giving a flexible approach to open-vocabulary recognition.
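The tree-detection pipeline is driven by nested, bracketed text prompts. Below is a minimal, self-contained sketch of parsing such a prompt into a label hierarchy; the bracket syntax mirrors the project's tree-prediction examples, but the parser itself (and the `parse_tree_prompt` name) is illustrative only, not NanoOWL's actual implementation.

```python
# Illustrative parser for NanoOWL-style nested "tree" prompts, e.g.
# "[a face [an eye, a nose]]". Sketch for explanation only; not the
# library's actual code.

def parse_tree_prompt(prompt: str) -> dict:
    """Parse a bracketed prompt into a {label, children} tree."""
    root = {"label": "image", "children": []}
    stack = [root]      # open scopes, innermost last
    last_node = root    # most recently emitted label
    buf = ""

    def emit():
        nonlocal buf, last_node
        label = buf.strip()
        buf = ""
        if label:
            node = {"label": label, "children": []}
            stack[-1]["children"].append(node)
            last_node = node

    for ch in prompt:
        if ch == "[":
            emit()
            # a '[' right after a label nests new labels under it;
            # at the start of the prompt it stays under the root
            stack.append(last_node)
        elif ch == ",":
            emit()
        elif ch == "]":
            emit()
            stack.pop()
            last_node = stack[-1]
        else:
            buf += ch
    return root

tree = parse_tree_prompt("[a face [an eye, a nose]]")
# "a face" is detected in the image; "an eye" and "a nose" are then
# detected within each "a face" region.
```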

Quick Start & Requirements

  • Install dependencies: PyTorch, torch2trt, Transformers, TensorRT.
  • Build TensorRT engine: python3 -m nanoowl.build_image_encoder_engine data/owl_image_encoder_patch32.engine
  • Run example: python3 examples/owl_predict.py --prompt="[an owl, a glove]" --threshold=0.1 --image_encoder_engine=../data/owl_image_encoder_patch32.engine
  • Requires an NVIDIA Jetson Orin platform for optimal performance.
  • Official examples and setup instructions are available in the repository.
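The `--threshold=0.1` flag in the example above controls confidence-based filtering of raw detections. A minimal sketch of that post-processing step is shown below; the function name and the box/score data are hypothetical illustrations, not NanoOWL's actual code.

```python
# Sketch of confidence-threshold filtering, as applied to zero-shot
# detection outputs. Hypothetical data; illustrative only.

def filter_detections(boxes, scores, labels, threshold=0.1):
    """Keep only detections whose confidence meets the threshold."""
    keep = [i for i, s in enumerate(scores) if s >= threshold]
    return ([boxes[i] for i in keep],
            [scores[i] for i in keep],
            [labels[i] for i in keep])

# Hypothetical raw outputs for the prompts ["an owl", "a glove"]
boxes = [[10, 20, 110, 220], [5, 5, 50, 60], [0, 0, 30, 30]]
scores = [0.42, 0.08, 0.15]
labels = ["an owl", "a glove", "an owl"]

fb, fs, fl = filter_detections(boxes, scores, labels, threshold=0.1)
# the 0.08 "a glove" detection falls below the threshold and is dropped
```

A low threshold such as 0.1 trades more false positives for higher recall, which is a common starting point for open-vocabulary detectors.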

Highlighted Details

  • Real-time inference on Jetson Orin Nano and AGX Orin platforms.
  • Supports nested detection and classification via text prompts using a "tree detection" pipeline.
  • Can be combined with NanoSAM for zero-shot instance segmentation.
  • Includes examples for basic prediction, tree prediction, and live camera feed demonstration.

Maintenance & Community

  • Project is hosted on GitHub by NVIDIA-AI-IOT.
  • Related projects include NanoSAM, Jetson Introduction to Knowledge Distillation Tutorial, Jetson Generative AI Playground, and Jetson Containers.

Licensing & Compatibility

  • The repository does not explicitly state a license in the provided README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

  • Performance benchmarks (FPS) for the OWL-ViT (ViT-B/32) model on Jetson Orin Nano are listed as "TBD".
  • The project targets NVIDIA Jetson Orin platforms specifically, limiting its applicability on other hardware.
Health Check

  • Last commit: 5 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 1
  • Issues (30d): 1
  • Star history: 26 stars in the last 90 days
