Oqura-ai
AI-powered local document research and reporting
Summary
Deepdoc is a research tool designed for local knowledge bases, enabling users to conduct in-depth analysis of their own documents (PDF, DOCX, JPG, TXT, etc.) instead of relying on internet searches. It automates the process of extracting insights, organizing findings, and generating structured markdown reports, making it valuable for researchers and power users who need to quickly synthesize information from large local datasets.
How It Works
The system ingests local documents, extracts their text, and splits it into page-wise chunks, which are stored in a vector database for semantic search. The user supplies an instruction query, which guides the generation of the report's content structure. Research agents then build up the knowledge for each report section iteratively: they create research queries, search the local data, and refine the results with the help of reflection agents. Finally, a report writer compiles the section content into a single markdown report. The result is a systematic, agent-driven exploration of local data.
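The pipeline described above can be illustrated with a small, self-contained sketch. It is not Deepdoc's code: every name here (`Chunk`, `chunk_by_page`, `embed`, `search`, `research_section`) is hypothetical, the bag-of-words "embedding" stands in for a real embedding model, an in-memory list stands in for the Qdrant vector database, and the reflection step is reduced to a simple "did we find anything?" check rather than an LLM call.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    source: str  # file the page came from
    page: int    # 1-based page number
    text: str


def chunk_by_page(source: str, pages: list[str]) -> list[Chunk]:
    """Create one chunk per non-empty page of extracted text."""
    return [Chunk(source, i, t) for i, t in enumerate(pages, start=1) if t.strip()]


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search(index: list[tuple[Counter, Chunk]], query: str, top_k: int = 2) -> list[Chunk]:
    """Return the top_k chunks most similar to the query, skipping zero scores."""
    q = embed(query)
    scored = [(cosine(q, vec), chunk) for vec, chunk in index]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]


def research_section(index, queries: list[str], max_rounds: int = 2) -> list[Chunk]:
    """Gather evidence for one report section over a few search rounds."""
    found: list[Chunk] = []
    for _ in range(max_rounds):
        for query in queries:
            for chunk in search(index, query):
                if chunk not in found:
                    found.append(chunk)
        if found:  # stand-in for a reflection agent judging the results
            break
        # Nothing found: broaden by searching each word on its own.
        queries = [word for query in queries for word in query.split()]
    return found


pages = ["Qdrant stores the page vectors.", "", "The report writer emits markdown."]
chunks = chunk_by_page("notes.pdf", pages)
index = [(embed(c.text), c) for c in chunks]

hits = search(index, "markdown report", top_k=1)
print(hits[0].page)  # 3

evidence = research_section(index, ["page vectors", "markdown"])
print([c.page for c in evidence])  # [1, 3]
```

Keeping the page number on each chunk means any finding in the final report could be traced back to a file and page, which is one plausible reason to chunk page-wise rather than by fixed token windows.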
Quick Start & Requirements
Set up the environment with uv, then install dependencies with uv pip install -r requirements.txt.
Prerequisites: uv (for environment/dependency management), Docker and Docker Compose (for the Qdrant vector database), and API keys for Mistral, Tavily, and OpenAI.
Set the environment variables (including EMBEDDING_MODEL and QDRANT_URL) in a .env file. Customize LLM and thread configurations in configuration.py.
Start the services with docker-compose up --build, then run the application with python main.py.
uv installation: see the official uv GitHub repository.
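As a rough illustration of the .env step, the sketch below parses a .env-style string and reports which required settings are missing. It is an assumption-laden example, not the project's loader: `parse_env` and `missing_settings` are hypothetical helpers, only EMBEDDING_MODEL and QDRANT_URL are taken from the summary, and the sample values are placeholders. The names of the API key variables are not given in the summary, so they are deliberately left out.

```python
# Variable names taken from the summary; the API key names are not listed there.
REQUIRED = ("EMBEDDING_MODEL", "QDRANT_URL")


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, ignoring blanks, comments, and malformed lines."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def missing_settings(values: dict[str, str], required=REQUIRED) -> list[str]:
    """Return the required names that are absent or empty."""
    return [name for name in required if not values.get(name)]


# Placeholder values for illustration only.
sample = """
# example values only
EMBEDDING_MODEL=example-embedding-model
QDRANT_URL="http://localhost:6333"
"""

settings = parse_env(sample)
print(missing_settings(settings))  # []
```

Checking for missing settings before starting the application gives a clear error up front instead of a failure partway through a research run.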
Maintenance & Community
The project is authored by Swaraj Biswal and Swadhin Biswal. Contributions are welcomed via issues or pull requests. No specific community channels (e.g., Discord, Slack) or sponsorship details are provided in the README.
Licensing & Compatibility
Licensed under the MIT License. This license is permissive and generally compatible with commercial use and closed-source linking.
Limitations & Caveats
The tool depends on valid, user-supplied API keys for external LLM and search services (Mistral, Tavily, OpenAI). Setup requires familiarity with uv, Docker, and environment variable management. Output quality depends on the quality of the input documents and on how the research agents are configured.