CoexistAI by SPThole

Research assistant framework for automating workflows

Created 3 months ago
270 stars

Top 95.2% on SourcePulse

Project Summary

CoexistAI is a modular, developer-friendly framework for automating and enhancing research workflows. It lets users build research assistants that search, summarize, and interact with diverse data sources such as the web, Reddit, YouTube, local files, and code repositories, all orchestrated through LLMs and the Model Context Protocol (MCP). The primary benefit is a unified, programmable interface for complex information retrieval and synthesis tasks.

How It Works

CoexistAI employs a pluggable architecture that integrates with various LLMs (Google Gemini, OpenAI, Ollama) and embedding models. It uses SearxNG for privacy-focused web search aggregation and BM25 ranking to improve Reddit search relevance. Core functionality is exposed via a FastAPI server and MCP, so it can be integrated into existing agentic systems or used standalone. Its asynchronous, parallel execution design is intended to keep demanding research tasks fast and scalable.
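To make the BM25 ranking step concrete, here is a minimal, self-contained sketch of Okapi BM25 scoring in Python. It is an illustration only, not CoexistAI's implementation: the function names, the whitespace tokenizer, and the parameter defaults (k1=1.5, b=0.75) are assumptions chosen for the example.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document in `docs` against `query` with Okapi BM25."""
    # Naive whitespace tokenizer (an assumption made for this sketch).
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs if n_docs else 0.0
    # Document frequency: how many documents contain each term.
    doc_freq = Counter(term for toks in tokenized for term in set(toks))

    scores = []
    for toks in tokenized:
        term_freq = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            # Smoothed IDF variant that never goes negative.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Length normalization: longer-than-average documents are penalized.
            norm = tf + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


def rank(query, docs):
    """Return documents sorted from most to least relevant."""
    scored = zip(bm25_scores(query, docs), docs)
    return [doc for _, doc in sorted(scored, key=lambda pair: pair[0], reverse=True)]
```

Calling rank("python async", posts) on a list of post titles returns the titles ordered by score; posts that share no term with the query score 0.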

Quick Start & Requirements

  • Installation: Clone the repository (git clone https://github.com/SPThole/CoexistAI.git) and run the quick_setup.sh script on macOS/Linux (for example, bash quick_setup.sh). Windows users can use WSL or Git Bash.
  • Prerequisites: Docker (daemon running), Python 3, pip.
  • Configuration: Requires editing model_config.py for LLM/embedding settings and potentially quick_setup.sh to set API keys (e.g., GOOGLE_API_KEY). SearxNG setup involves Docker commands or using the provided Dockerfile.
  • Resources: Setup pulls Docker images, installs dependencies, and starts the servers. It is quick once the prerequisites are met, but API keys and model choices must be configured by the user first.
  • Docs: Usage examples are available in the provided notebook and API documentation (Swagger UI at /docs).
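
Before running the setup script, it can help to confirm that the listed prerequisites are on PATH. The following is a small sketch using only the Python standard library; the helper name missing_tools and the executable names passed to it are assumptions (Python and pip may be installed under other names on your system), and it does not check whether the Docker daemon is actually running.

```python
import shutil


def missing_tools(tools, which=shutil.which):
    """Return the names in `tools` that cannot be found on PATH.

    `which` is injectable so the check can be tested without touching
    the real environment.
    """
    return [tool for tool in tools if which(tool) is None]


def report(tools):
    """Build a one-line, human-readable summary of the check."""
    missing = missing_tools(tools)
    if missing:
        return "Missing prerequisites: " + ", ".join(missing)
    return "All prerequisites found on PATH."
```

For example, print(report(["git", "docker", "python3", "pip3"])) prints either a list of missing tools or a confirmation line.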

Highlighted Details

  • Vibe Podcasting & Text-to-Speech: Converts text content into podcast episodes or high-quality audio.
  • Multi-Source Exploration: Integrates web search (via SearxNG), advanced Reddit search (BM25 ranked), YouTube summarization/QA, map/route generation, and GitHub/local codebase exploration.
  • MCP Support: Fully compatible with LM Studio and other MCP hosts for local LLM integration.
  • Pluggable Components: Supports various LLMs, embedders, and includes cross-encoder reranking for improved search result quality.
  • Async & Parallel Execution: Built for speed and scalability.
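
The async and parallel execution pattern mentioned above can be sketched with asyncio.gather. This is a generic illustration, not CoexistAI's code: query_source is a hypothetical stand-in that sleeps instead of calling a real search backend, and the source names and delays are made up for the example.

```python
import asyncio


async def query_source(name, query, delay):
    """Hypothetical stand-in for a real search call; sleeps to simulate latency."""
    await asyncio.sleep(delay)
    return f"{name}: results for {query!r}"


async def search_all(query):
    """Query every source concurrently and collect results by source name."""
    # Source names and simulated latencies are assumptions for this sketch.
    sources = {"web": 0.02, "reddit": 0.03, "youtube": 0.01}
    tasks = [query_source(name, query, delay) for name, delay in sources.items()]
    # return_exceptions=True returns a failure as a value, so one failing
    # source does not discard the results of the others.
    results = await asyncio.gather(*tasks, return_exceptions=True)
    return dict(zip(sources, results))


if __name__ == "__main__":
    print(asyncio.run(search_all("vector databases")))
```

Because the calls run concurrently, total wall time is roughly that of the slowest source rather than the sum of all of them.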

Maintenance & Community

The README does not provide specific details regarding maintainers, sponsorships, or community channels (like Discord/Slack). Contributions are welcomed via pull requests and issues on GitHub.

Licensing & Compatibility

The project is licensed under a custom Non-Commercial Research and Educational Use License. Commercial or production use is strictly prohibited. The project is intended for research, prototyping, and educational purposes.

Limitations & Caveats

The most significant limitation is the strict prohibition of commercial or production use. The project relies on public web scraping and does not use the official Reddit API, so users must adhere to each site's robots.txt and terms of service. OpenStreetMap API usage may be subject to rate limits.
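
As an illustration of the robots.txt point, Python's standard urllib.robotparser can check whether a URL may be fetched before scraping it. This is a sketch under assumptions: the sample robots.txt content, the "research-bot" user agent, and the example.com URLs are all hypothetical, and nothing here is taken from CoexistAI's code.

```python
from urllib.robotparser import RobotFileParser


def build_parser(robots_txt):
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# Hypothetical robots.txt content used only for this example.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""

parser = build_parser(SAMPLE_ROBOTS)
# can_fetch answers: may this user agent request this URL?
allowed = parser.can_fetch("research-bot", "https://example.com/public/page")
blocked = parser.can_fetch("research-bot", "https://example.com/private/page")
# crawl_delay returns the requested pause in seconds, or None if unspecified.
delay = parser.crawl_delay("research-bot")
```

A scraper could skip any URL for which can_fetch returns False and sleep for the crawl delay between requests, which also helps stay under rate limits.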

Health Check

  • Last Commit: 1 week ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 2
  • Issues (30d): 2
  • Star History: 75 stars in the last 30 days
