via007
Bilibili collections transformed into a conversational knowledge base
Top 65.8% on SourcePulse
Bilibili RAG transforms your Bilibili favorites into a searchable, conversational knowledge base. It targets users who collect interviews, lectures, or technical videos and want to organize and query that content effectively. The project addresses personal knowledge management by making archived video content easy to retrieve and question.
How It Works
The system employs a Retrieval-Augmented Generation (RAG) pipeline. It begins with Bilibili login to access user favorites, followed by audio extraction and Automatic Speech Recognition (ASR) to transcribe spoken content. Vector embeddings generated from the transcripts enable semantic search, which powers a RAG-based conversational Q&A interface. Data is persisted locally, using SQLite for metadata and ChromaDB for vector storage. When an audio URL is inaccessible, a fallback mechanism downloads the audio, transcodes it, and uploads it to DashScope for ASR.
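The retrieve-then-prompt step described above can be illustrated with a minimal sketch. This is not the project's code: the `embed` function is a toy bag-of-words stand-in for a real embedding model, a plain dictionary stands in for ChromaDB, and the names `TranscriptStore` and `build_prompt` are hypothetical. Only the overall shape (metadata in SQLite, vectors in a separate store, top-k retrieval feeding a prompt) follows the description.

```python
import math
import sqlite3
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words term count."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class TranscriptStore:
    """Hypothetical store: SQLite holds metadata, a dict holds the vectors."""

    def __init__(self) -> None:
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE videos (id INTEGER PRIMARY KEY, title TEXT, transcript TEXT)"
        )
        self.vectors: dict = {}  # stand-in for a vector database

    def add(self, title: str, transcript: str) -> int:
        cur = self.db.execute(
            "INSERT INTO videos (title, transcript) VALUES (?, ?)",
            (title, transcript),
        )
        self.vectors[cur.lastrowid] = embed(transcript)
        return cur.lastrowid

    def search(self, query: str, k: int = 2) -> list:
        """Return (title, transcript) rows for the k most similar transcripts."""
        q = embed(query)
        ranked = sorted(
            self.vectors, key=lambda vid: cosine(q, self.vectors[vid]), reverse=True
        )[:k]
        return [
            self.db.execute(
                "SELECT title, transcript FROM videos WHERE id = ?", (vid,)
            ).fetchone()
            for vid in ranked
        ]


def build_prompt(question: str, hits: list) -> str:
    """Assemble the retrieved transcripts into a grounded prompt for an LLM."""
    context = "\n".join(f"[{title}] {text}" for title, text in hits)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


store = TranscriptStore()
store.add("Rust ownership lecture", "ownership and borrowing rules prevent data races in rust")
store.add("Interview on databases", "indexes speed up queries but slow down writes")

hits = store.search("how does borrowing work in rust", k=1)
prompt = build_prompt("How does borrowing work in Rust?", hits)
```

In a real deployment the prompt would be sent to an LLM, and the toy similarity search would be replaced by queries against the vector store.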
Quick Start & Requirements
ffmpeg must be installed and accessible in the system's PATH.
Activate the conda environment: conda activate bilibili-rag.
Install dependencies: pip install -r requirements.txt.
Configure the project by copying .env.example to .env and providing a DashScope API Key.
Start the backend: python -m uvicorn app.main:app --reload (API documentation available at http://localhost:8000/docs).
Start the frontend: change into frontend/, then run npm install followed by npm run dev (UI accessible at http://localhost:3000).
Highlighted Details
Maintenance & Community
No specific details regarding maintainers, community channels (e.g., Discord, Slack), or a public roadmap are provided in the README. A "TodoList" section indicates planned future development.
Licensing & Compatibility
The project is released under the MIT license. This license is permissive and allows commercial use and integration into closed-source projects, provided the copyright and license notice are retained.
Limitations & Caveats
Direct audio URL access may fail due to Bilibili's authentication, expiration, or regional restrictions. The ASR fallback process requires ffmpeg to be correctly installed and configured. Usage of DashScope services (LLM, Embedding, ASR) incurs costs, though free tiers may suffice for initial testing. Some planned features, such as conversation storage and support for multi-part videos, are not yet implemented. Test scripts located in the test/ directory require relocation to the project root before execution.
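The fallback behavior and its ffmpeg dependency can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's implementation: `transcribe_with_fallback`, `ffmpeg_transcode_cmd`, and the injected callables are hypothetical names, and the mono 16 kHz output format is an assumed ASR-friendly choice rather than something the README specifies. The ffmpeg flags themselves (`-y`, `-i`, `-vn`, `-ac`, `-ar`) are standard.

```python
import shutil
from typing import Callable, List, Optional


def ffmpeg_transcode_cmd(src: str, dst: str) -> List[str]:
    """Build an ffmpeg command that drops video and writes mono 16 kHz audio.

    The sample rate and channel count are assumptions for illustration.
    """
    return ["ffmpeg", "-y", "-i", src, "-vn", "-ac", "1", "-ar", "16000", dst]


def transcribe_with_fallback(
    audio_url: str,
    transcribe_url: Callable[[str], str],
    transcribe_local: Callable[[str], str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> str:
    """Try ASR on the direct URL first; fall back to the local path on failure.

    `transcribe_local` is expected to download, transcode, upload, and
    transcribe. The fallback is refused early if ffmpeg is not on PATH.
    """
    try:
        return transcribe_url(audio_url)
    except Exception as direct_error:
        if which("ffmpeg") is None:
            raise RuntimeError(
                "ffmpeg not found on PATH; fallback unavailable"
            ) from direct_error
        return transcribe_local(audio_url)


# Demonstration with fake callables, so no network or ffmpeg is needed.
def failing_direct(url: str) -> str:
    raise ConnectionError("audio URL expired or forbidden")


def fake_local(url: str) -> str:
    return "transcript via fallback"


result = transcribe_with_fallback(
    "https://example.invalid/audio.m4s",
    failing_direct,
    fake_local,
    which=lambda name: "/usr/bin/ffmpeg",  # pretend ffmpeg is installed
)
```

Injecting the two transcription callables and the PATH lookup keeps the control flow testable without touching Bilibili, DashScope, or a real ffmpeg binary.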