ViNote by zrt-ai-lab

Video to Everything: AI knowledge extraction and processing

Created 8 months ago

437 stars

Top 67.5% on SourcePulse

Project Summary

Summary

ViNote is an open-source AI-powered tool designed for extracting and organizing knowledge from video content. It targets engineers, researchers, and power users seeking to efficiently transform video links or local files into structured notes, knowledge cards, and mind maps, with integrated AI Q&A and translation capabilities. The core benefit is enabling users to leverage video as a structured knowledge asset.

How It Works

ViNote employs a modular AI architecture centered around the "ViNoter Super Agent," which facilitates conversational interaction for all video processing tasks. It leverages Faster-Whisper for local audio transcription and integrates with OpenAI APIs for advanced natural language understanding, summarization, and Q&A. A key differentiator is its implementation of the ANP (Agent Network Protocol), enabling decentralized agent collaboration for cross-platform video search and processing. The system automates workflows from video ingestion to structured knowledge output.

Quick Start & Requirements

The recommended installation method is via Docker Compose (docker compose up -d), which bundles Python, FFmpeg, and Node.js dependencies. Alternatively, local installation requires Python 3.10+, FFmpeg, and the uv package manager. A critical prerequisite is an OpenAI API key (OPENAI_API_KEY, OPENAI_BASE_URL, OPENAI_MODEL). For downloading Bilibili content, user-specific browser cookies must be exported and configured. Setup time is minimal with Docker; local setup involves dependency installation and configuration. Official documentation and the GitHub repository provide detailed setup guides.

Highlighted Details

ViNoter Super Agent: Enables conversational control over video search, transcription, note generation, and translation across platforms like YouTube and Bilibili.
ANP Protocol Integration: Supports decentralized agent networking and collaboration for advanced search and processing capabilities.
Multi-Format Output: Generates structured Markdown notes, diverse knowledge card types (concept, point, comparison), and interactive, zoomable mind maps via Markmap.
Broad Input Support: Accepts both online video URLs (YouTube, Bilibili) and local video files (MP4, AVI, MOV, MKV, etc.).
Local Transcription: Utilizes Faster-Whisper for efficient, local audio-to-text conversion.

Maintenance & Community

The project is maintained by the "ViNote Team" with active development indicated by frequent version updates (e.g., v1.3.1 released Feb 2026). Feedback and bug reporting are managed via GitHub Issues. Contact is available via email (864410260@qq.com). A development roadmap outlines future features.

Licensing & Compatibility

ViNote is released under the permissive MIT License. This license allows for broad adoption, including commercial use and integration into closed-source projects, with minimal restrictions beyond attribution.

Limitations & Caveats

Operation is contingent on a valid OpenAI API key, introducing external service dependency and potential costs. Downloading content from Bilibili necessitates exporting and managing browser cookies, which can be a user-facing friction point and requires periodic updates. The ANP protocol demo, while powerful, adds significant setup complexity involving decentralized identity generation and multi-service orchestration. Performance may vary based on hardware, especially for AI model inference.

Health Check

Last Commit

4 months ago

Responsiveness

Inactive

Pull Requests (30d)

Issues (30d)

Star History

15 stars in the last 30 days