Agentic search for understanding extra-long videos
Summary
Deep Video Discovery (DVD) is a research-focused question-answering agent for extra-long videos. It uses large language models (LLMs) and an agentic search approach to interpret long video content, and it reports state-of-the-art performance on long-form video benchmarks. DVD targets researchers and power users who need efficient, in-depth analysis of lengthy video sources.
How It Works
DVD treats segmented video clips as exploration environments, employing autonomous planning and reasoning to dynamically formulate strategies. It iteratively extracts information using multi-granular tools and summarizes observations. A key innovation is its global_browse_tool, which uses textual descriptions of video clips, rather than raw pixels, for a more efficient global overview.
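To make the idea of browsing textual clip descriptions concrete, here is a minimal sketch in Python. It is not DVD's implementation: the Clip class, the global_browse_sketch function, and the word-overlap scoring are all hypothetical stand-ins, and the real tool relies on an LLM rather than keyword matching. The sketch only illustrates why ranking short captions is cheaper than inspecting frames.

```python
from dataclasses import dataclass


@dataclass
class Clip:
    """Hypothetical record for one video segment (not DVD's actual data model)."""
    start_s: float
    end_s: float
    description: str  # textual caption of the clip, assumed to be produced offline


def global_browse_sketch(clips, query, top_k=3):
    """Rank clips by word overlap between the query and each description.

    Word overlap is a crude stand-in for LLM reasoning; it is an assumption
    made for illustration only.
    """
    query_words = set(query.lower().split())
    scored = []
    for clip in clips:
        overlap = len(query_words & set(clip.description.lower().split()))
        scored.append((overlap, clip))
    # Highest overlap first; sort is stable, so ties keep their video order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop clips that share no words with the query.
    return [clip for score, clip in scored[:top_k] if score > 0]


clips = [
    Clip(0, 60, "two players set up a chess board"),
    Clip(60, 120, "a cat sleeps on a sofa"),
    Clip(120, 180, "the older player wins the chess game"),
]
candidates = global_browse_sketch(clips, "who wins the chess game")
for clip in candidates:
    print(f"{clip.start_s:.0f}-{clip.end_s:.0f}s: {clip.description}")
```

In an agentic loop, the candidates returned here would be handed to finer-grained tools that inspect the selected segments more closely before an answer is produced.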
Quick Start & Requirements
Installation involves cloning the repository and running pip install -r requirements.txt. Configuration, including API keys, is done via config.py. Usage is demonstrated with python local_run.py <youtube_url> "question". An optional gradio installation is available for the demo.
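For scripted use, the command above can be wrapped in a few lines of Python. This is a sketch under stated assumptions: build_command and run_dvd are hypothetical helper names, the script is assumed to take the URL and the question as its two positional arguments in that order (as in the usage line), and repo_dir is assumed to point at a local clone with dependencies installed and config.py filled in.

```python
import subprocess
import sys


def build_command(video_url, question):
    """Build the argv list for local_run.py (hypothetical helper).

    Passing a list avoids shell quoting problems with questions that
    contain spaces or quotes.
    """
    return [sys.executable, "local_run.py", video_url, question]


def run_dvd(video_url, question, repo_dir="."):
    """Run local_run.py inside the cloned repository (assumed at repo_dir)."""
    return subprocess.run(build_command(video_url, question), cwd=repo_dir, check=True)


cmd = build_command("https://example.com/watch?v=VIDEO_ID", "What is the main topic?")
print(cmd[1:])
```

Calling run_dvd raises subprocess.CalledProcessError if the script exits with a non-zero status, which makes failures visible when processing several videos in a batch.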
Highlighted Details
lite_mode: subtitle-only analysis, useful for podcasts.
Maintenance & Community
This is an initial release with TODO items for future development. No community channels or specific contributor details are provided in the README snippet.
Licensing & Compatibility
The provided README snippet does not specify a software license. This omission requires further investigation before adoption, especially for commercial use.
Limitations & Caveats
Outstanding TODOs include implementing an MCP server and releasing evaluation data. Its "deep-research style" designation suggests it may be experimental and not production-ready.