video-search-and-summarization by NVIDIA-AI-Blueprints

Video analytics and Q&A powered by generative AI

Created 1 year ago
304 stars

Top 87.9% on SourcePulse

Project Summary

This NVIDIA AI Blueprint provides a framework for ingesting and analyzing massive video datasets to generate insights and summaries and to support interactive Q&A. It targets video analysts, IT engineers, and GenAI/ML engineers who want to build custom video analytics AI agents, offering a plug-and-play starting point with extensive customization options for advanced users. The blueprint builds on NVIDIA NIM microservices and generative AI models for video understanding in applications such as smart space monitoring and warehouse automation.

How It Works

The system processes video through an ingestion pipeline that decodes segments, selects frames, and generates detailed captions using a Vision-Language Model (VLM). Concurrently, it produces computer vision metadata and audio transcriptions. This enriched data is indexed into vector and graph databases. The core intelligence resides in the Context-Aware Retrieval-Augmented Generation (CA-RAG) module, which combines Vector RAG and Graph-RAG. By retrieving context from both databases, this dual-RAG approach improves temporal reasoning, anomaly detection, and multi-hop question answering, and helps manage large volumes of video data.
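
The dual retrieval idea can be illustrated with a minimal sketch. It is not the blueprint's CA-RAG implementation: the `Caption` record, the in-memory `edges` adjacency map, and every function name here are hypothetical, and a plain cosine ranking stands in for a real vector database. It only shows the general pattern of picking seed chunks by embedding similarity, expanding them through graph links, and returning the merged context in time order.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Caption:
    """Hypothetical record for one captioned video chunk."""
    chunk_id: str
    start_s: float
    end_s: float
    text: str
    embedding: list


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def vector_retrieve(query_emb, captions, k=2):
    """Vector step: return the k captions most similar to the query embedding."""
    ranked = sorted(captions, key=lambda c: cosine(query_emb, c.embedding), reverse=True)
    return ranked[:k]


def graph_retrieve(seed_ids, edges, hops=1):
    """Graph step: collect chunk ids reachable from the seeds within `hops` edges."""
    seen = set(seed_ids)
    frontier = set(seed_ids)
    for _ in range(hops):
        frontier = {n for s in frontier for n in edges.get(s, ())} - seen
        seen |= frontier
    return seen - set(seed_ids)


def retrieve_context(query_emb, captions, edges, k=2, hops=1):
    """Merge vector hits with their graph neighbours, ordered by start time."""
    by_id = {c.chunk_id: c for c in captions}
    seeds = vector_retrieve(query_emb, captions, k)
    merged = {c.chunk_id: c for c in seeds}
    for chunk_id in graph_retrieve(list(merged), edges, hops):
        if chunk_id in by_id:
            merged[chunk_id] = by_id[chunk_id]
    return sorted(merged.values(), key=lambda c: c.start_s)
```

Ordering the merged chunks by start time is one simple way to give a downstream language model a temporally coherent context. Raising `hops` pulls in chunks that are linked only indirectly to the vector hits, which is the kind of expansion multi-hop questions rely on.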

Quick Start & Requirements

  • Installation: Deployment options include Docker Compose, Helm charts (for x86 platforms), and Brev Launchable notebooks.
  • Prerequisites: Requires an NVIDIA AI Enterprise developer license for local NIM hosting, API catalog keys, specific NVIDIA drivers (e.g., 580.65.06+), CUDA (13.0+), NVIDIA Container Toolkit (1.13.5+), and Docker (27.5.1+). Helm deployments require Kubernetes v1.31.2+ and NVIDIA GPU Operator v23.9+.
  • Hardware: Minimum GPU requirements vary significantly by deployment type and model configuration. They range from a single GPU (e.g., 1x H100/A100 80GB) for reduced-compute or single-GPU deployments up to eight high-end GPUs (e.g., 8x H200/A100 80GB) for default local deployments. Remote deployments require a GPU with at least 8GB of VRAM.
  • Documentation: Detailed instructions are available at the official documentation link provided in the README.
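
The version floors listed under Prerequisites can be checked mechanically before a deployment. The sketch below is a hypothetical helper, not part of the blueprint: the `MINIMUMS` keys, `parse_version`, `meets_minimum`, and `unmet` are illustrative names. The minimum values are copied from the list above, and the installed version strings are assumed to have been collected separately.

```python
import re

# Minimum versions as listed in the prerequisites above; the key names are illustrative.
MINIMUMS = {
    "nvidia_driver": "580.65.06",
    "cuda": "13.0",
    "nvidia_container_toolkit": "1.13.5",
    "docker": "27.5.1",
    "kubernetes": "1.31.2",
    "gpu_operator": "23.9",
}


def parse_version(text):
    """Parse the leading dotted-numeric part of a version string into a tuple of ints."""
    match = re.match(r"v?(\d+(?:\.\d+)*)", text.strip())
    if match is None:
        raise ValueError(f"not a version string: {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(installed, minimum):
    """Return True if `installed` is at least `minimum`, padding short versions with zeros."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def unmet(installed):
    """Return the names in `installed` that are known and fall below their minimum."""
    return [
        name
        for name, version in installed.items()
        if name in MINIMUMS and not meets_minimum(version, MINIMUMS[name])
    ]
```

Comparing integer tuples rather than raw strings avoids the usual pitfall where "9.0" sorts after "13.0" lexicographically. Suffixes such as a build tag after the numeric part are ignored by this sketch.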

Highlighted Details

  • Powered by NVIDIA NIM microservices, utilizing models like Cosmos-Reason1-7B, Llama-3.1-70b-instruct, and Llama-3.2-nv-embedqa-1b-v2.
  • Employs Context-Aware Retrieval-Augmented Generation (CA-RAG) integrating both Vector and Graph RAG for advanced video understanding.
  • Offers flexible deployment strategies including Docker Compose for development, Helm for production, and Brev Launchable for quick starts.
  • Supports comprehensive video analysis including summarization, Q&A, and alert generation.

Maintenance & Community

The provided README does not detail specific community channels (like Discord or Slack), active maintainers, or sponsorship information.

Licensing & Compatibility

The project license is available via a LICENSE file. As an NVIDIA AI Blueprint, usage may be tied to the NVIDIA AI Enterprise license, particularly for accessing proprietary models and services.

Limitations & Caveats

The VSS Engine 2.4.0 container has known CVEs (CVE-2024-8966, CVE-2025-4565, CVE-2025-3887). The README states that these do not affect VSS because of the specific dependency versions or usage patterns involved. For CVE-2025-3887, which concerns the GStreamer H.265 codec parser, users must still either ensure that malicious streams are not added or build patched GStreamer libraries. Helm deployments are supported only on x86 platforms.

Health Check

  • Last commit: 2 weeks ago
  • Responsiveness: Inactive
  • Pull requests (30d): 1
  • Issues (30d): 10
  • Star history: 42 stars in the last 30 days
