LLM agent tool for web crawling, indexing, and reasoning
Top 67.6% on sourcepulse
Doctor is a system designed to equip LLM agents with the ability to discover, crawl, and index web content, enabling more up-to-date reasoning and code generation. It targets developers and researchers building AI agents that require access to current information from the web.
How It Works
Doctor orchestrates a pipeline involving web crawling (crawl4ai), text chunking (LangChain), embedding generation (OpenAI via litellm), and data storage with vector search (DuckDB). These components are managed via a unified database class and asynchronous task processing using Redis. The indexed data and search capabilities are exposed through a FastAPI web service, which also serves as an MCP server for seamless integration with LLM agents.
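The chunk, embed, store, and search stages described above can be illustrated with a small standard-library sketch. This is not Doctor's code: the names chunk_text, toy_embed, and InMemoryIndex are hypothetical, the hashed bag-of-words vector stands in for the OpenAI embeddings obtained via litellm, and a plain Python list stands in for DuckDB's vector search. Crawling and the Redis task queue are omitted.

```python
import hashlib
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    url: str
    text: str
    vector: list[float]


def chunk_text(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into overlapping character windows (stand-in for a LangChain splitter)."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the current window already reaches the end of the text
        start += size - overlap
    return chunks


def toy_embed(text: str, dim: int = 64) -> list[float]:
    """Hashed bag-of-words vector, L2-normalized (stand-in for a real embedding model)."""
    vec = [0.0] * dim
    for token in text.lower().split():
        bucket = int(hashlib.sha256(token.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


class InMemoryIndex:
    """Keeps chunks in a list and ranks them by cosine similarity (stand-in for DuckDB)."""

    def __init__(self) -> None:
        self.chunks: list[Chunk] = []

    def add_page(self, url: str, text: str) -> None:
        for piece in chunk_text(text):
            self.chunks.append(Chunk(url, piece, toy_embed(piece)))

    def search(self, query: str, k: int = 3) -> list[tuple[float, Chunk]]:
        q = toy_embed(query)
        # Vectors are unit length, so the dot product equals cosine similarity.
        scored = [(sum(a * b for a, b in zip(q, c.vector)), c) for c in self.chunks]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]
```

In this sketch, calling add_page once per crawled page and then search with a query string returns the closest chunks together with their source URLs. That is roughly the information an agent would need to cite or re-read a page.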
Quick Start & Requirements
Set the OPENAI_API_KEY environment variable and run docker compose up.
Highlighted Details
Maintenance & Community
The project is actively maintained with Python tests and code coverage reports. Links to community channels or roadmaps are not explicitly provided in the README.
Licensing & Compatibility
Licensed under the MIT License, permitting commercial use and integration with closed-source projects.
Limitations & Caveats
Embedding generation requires an OpenAI API key, and API usage may incur costs. Deployment relies on Docker Compose, and the README notes version requirements for Python (3.10+) and Docker.
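The prerequisites listed here can be checked before starting the stack. The following is a minimal sketch with a hypothetical helper, preflight_problems, which is not part of Doctor. It assumes the Python 3.10+ requirement applies to the interpreter running the check, although the summary does not say whether Python is needed on the host when everything runs under Docker Compose.

```python
import os
import shutil
import sys


def preflight_problems(env, python_version, docker_found):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    if tuple(python_version[:2]) < (3, 10):
        problems.append("Python 3.10 or newer is required")
    if not docker_found:
        problems.append("docker executable not found on PATH")
    return problems


def check_current_machine():
    """Run the checks against the real environment of this process."""
    return preflight_problems(
        os.environ,
        sys.version_info,
        shutil.which("docker") is not None,
    )
```

Calling check_current_machine() and printing each returned string gives a quick readiness report. Passing the inputs explicitly to preflight_problems keeps the logic easy to test without touching the real environment.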