DeepVideoDiscovery by microsoft

Agentic search for understanding extra-long videos

Created 4 months ago
286 stars

Top 91.6% on SourcePulse

Project Summary

Summary

Deep Video Discovery (DVD) is a research-focused question-answering agent designed for understanding extra-long videos. It leverages large language models (LLMs) and an agentic search approach to interpret extensive video content, achieving state-of-the-art performance on long-form video benchmarks. DVD targets researchers and power users needing efficient, in-depth analysis of lengthy video sources.

How It Works

DVD treats segmented video clips as an environment to explore, using autonomous planning and reasoning to formulate its search strategy dynamically. It iteratively extracts information with multi-granular tools and summarizes what it observes. A key innovation is its global_browse_tool, which builds a global overview from textual descriptions of video clips rather than from raw pixels, making that overview more efficient to compute.
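The loop described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the `Clip` schema, the word-overlap ranking, the `inspect_clip` tool, and the fixed "browse first, then inspect" policy are all assumptions standing in for the LLM-driven planning and the real tools. Only the idea of browsing over textual clip descriptions comes from the summary.

```python
from dataclasses import dataclass


@dataclass
class Clip:
    """One segmented clip, represented only by text (hypothetical schema)."""
    index: int
    description: str


def global_browse(clips, query, top_k=2):
    """Coarse tool: rank clips by word overlap between query and description.

    A stand-in for a text-based global overview; no pixels are read.
    """
    query_words = set(query.lower().split())
    scored = []
    for clip in clips:
        overlap = len(query_words & set(clip.description.lower().split()))
        scored.append((overlap, clip.index))
    # Highest overlap first; ties go to the earlier clip.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [index for overlap, index in scored[:top_k] if overlap > 0]


def inspect_clip(clips, index):
    """Fine-grained tool (hypothetical): return one clip's full description."""
    return next(clip.description for clip in clips if clip.index == index)


def run_agent(clips, question, max_steps=4):
    """Browse globally once, then inspect candidate clips one per step.

    A real agent would let an LLM choose the next tool; this fixed policy
    only illustrates the iterate-and-observe structure.
    """
    observations = []
    pending = []
    browsed = False
    for _ in range(max_steps):
        if not browsed:
            pending = global_browse(clips, question)
            browsed = True
            observations.append(f"browse -> {pending}")
        elif pending:
            index = pending.pop(0)
            observations.append(f"clip {index}: {inspect_clip(clips, index)}")
        else:
            break  # Nothing left to inspect.
    return observations
```

Calling `run_agent(clips, question)` returns the list of observations that a final summarization step would consume. In the real system an LLM would presumably produce both the plan and the final answer.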

Quick Start & Requirements

Install by cloning the repository and running pip install -r requirements.txt. Configuration, including API keys, lives in config.py. To run, use python local_run.py <youtube_url> "question". Installing gradio is optional and only needed for the demo.
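For scripted use, the run command above can be assembled from Python. The helper below is a hypothetical convenience wrapper, not part of the repository. It assumes local_run.py takes exactly the two positional arguments shown in the usage line and is run from the repository root.

```python
import shlex
import sys


def build_local_run_command(video_url, question, script="local_run.py"):
    """Return the argv list for: python local_run.py <youtube_url> "question".

    Hypothetical helper; passing a list avoids shell-quoting problems
    when the question contains spaces or quotes.
    """
    return [sys.executable, script, video_url, question]


def format_command(argv):
    """Render an argv list as a copy-pasteable shell command line."""
    return shlex.join(argv)
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` once config.py has been filled in.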

Highlighted Details

  • Achieves state-of-the-art performance on multiple long video benchmarks (e.g., LVBench).
  • Supports OpenAI API and Azure OpenAI API.
  • Features a lite_mode for subtitle-only analysis, useful for podcasts.
  • Accepted at NeurIPS 2025, indicating strong research validation.

Maintenance & Community

This is an initial release with TODO items for future development. No community channels or specific contributor details are provided in the README snippet.

Licensing & Compatibility

The provided README snippet does not specify a software license. Anyone considering adoption, especially for commercial use, should check the license before proceeding.

Limitations & Caveats

Outstanding TODOs include implementing an MCP server and releasing evaluation data. Its "deep-research style" designation suggests it may be experimental and not production-ready.

Health Check

Last Commit: 2 days ago
Responsiveness: Inactive
Pull Requests (30d): 0
Issues (30d): 1
Star History: 193 stars in the last 30 days
