synthadoc  by axoviq-ai

LLM engine for structured, local-first wikis

Created 1 month ago
294 stars

Top 89.7% on SourcePulse

Project Summary

Synthadoc is an open-source LLM knowledge compilation engine that transforms raw documents into structured, local-first wikis. It offers a transparent, human-readable alternative to traditional RAG and produces knowledge bases that are self-managing and self-improving. It targets individuals, small teams, and enterprises that want accurate, scalable, and locally controlled knowledge management.

How It Works

Synthadoc synthesizes knowledge at ingest time, creating a persistent wiki graph rather than retrieving information on demand. It uses an LLM to process diverse document types into automatically cross-referenced wiki pages, detecting contradictions and citing sources along the way. The output is plain Markdown, which keeps storage local-first, avoids vendor lock-in, and works with tools such as Obsidian. The design prioritizes durable knowledge artifacts over ephemeral query-time synthesis.
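
The ingest-time idea can be illustrated with a small sketch. This is not Synthadoc's implementation: the Page class, the ingest function, and the (page, attribute, value) fact format are all hypothetical, and the LLM extraction step is replaced by hand-written facts. It only shows how merging claims into persistent pages at ingest time makes disagreements between sources visible instead of blending them.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    """Hypothetical wiki page: claims keyed by attribute, each value with its sources."""
    title: str
    # attribute -> {value -> [sources that state this value]}
    claims: dict[str, dict[str, list[str]]] = field(default_factory=dict)

    def add_claim(self, attribute: str, value: str, source: str) -> None:
        self.claims.setdefault(attribute, {}).setdefault(value, []).append(source)

    def contradictions(self) -> list[str]:
        """Attributes for which the ingested sources give more than one value."""
        return sorted(a for a, values in self.claims.items() if len(values) > 1)

    def to_markdown(self) -> str:
        """Render the page as plain Markdown with a citation on every claim."""
        lines = [f"# {self.title}", ""]
        for attribute in sorted(self.claims):
            for value, sources in self.claims[attribute].items():
                lines.append(f"- {attribute}: {value} (sources: {', '.join(sources)})")
        conflicted = self.contradictions()
        if conflicted:
            lines += ["", "> Needs review: conflicting claims for " + ", ".join(conflicted)]
        return "\n".join(lines) + "\n"


def ingest(wiki: dict[str, Page], source: str, facts: list[tuple[str, str, str]]) -> None:
    """Merge (page_title, attribute, value) facts from one source into the wiki."""
    for title, attribute, value in facts:
        wiki.setdefault(title, Page(title)).add_claim(attribute, value, source)


# Two sources disagree about the same attribute; both claims are kept and flagged.
wiki: dict[str, Page] = {}
ingest(wiki, "report-a.pdf", [("Widget", "weight", "2 kg")])
ingest(wiki, "report-b.pdf", [("Widget", "weight", "3 kg")])
print(wiki["Widget"].to_markdown())
```

Because the pages persist between ingests, the conflict is recorded once on the page itself rather than being rediscovered, or missed, at query time.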

Quick Start & Requirements

  • Installation: Clone the repository, install Python dependencies (pip3 install -e ".[dev]"), and build the Obsidian plugin (npm install, npm run build).
  • Prerequisites: Python 3.11+, Node.js 18+ (for the Obsidian plugin), and Git. An LLM API key (e.g., for Gemini Flash, Groq, or Ollama) is required unless you use Claude Code or Opencode as the provider. A Tavily API key is optional and only needed for web search.
  • Documentation: Key resources include docs/user-quick-start-guide.md and docs/design.md.
  • Setup: Configuration involves setting API keys via environment variables or config files and starting the server. A demo wiki is available for immediate exploration without an LLM API key.
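
Before installing, it can help to confirm the stated prerequisites. The script below is a minimal sketch and not part of Synthadoc: the function names are made up for illustration, and it assumes the Node.js and Git executables are named node and git on the PATH. It checks only the version floors listed above (Python 3.11+, Node.js 18+) and the presence of Git.

```python
import re
import shutil
import subprocess
import sys


def parse_version(text: str) -> tuple[int, ...]:
    """Extract the first dotted version number, e.g. 'v18.17.0' -> (18, 17, 0)."""
    match = re.search(r"(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError(f"no version number in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(found: tuple[int, ...], minimum: tuple[int, ...]) -> bool:
    """Compare versions component by component using tuple ordering."""
    return found >= minimum


def check_prerequisites() -> dict[str, bool]:
    """Report whether each prerequisite from the Quick Start list is satisfied."""
    results = {"python>=3.11": meets_minimum(tuple(sys.version_info[:3]), (3, 11))}

    node = shutil.which("node")  # assumed executable name
    results["node>=18"] = False
    if node is not None:
        output = subprocess.run(
            [node, "--version"], capture_output=True, text=True
        ).stdout
        try:
            results["node>=18"] = meets_minimum(parse_version(output), (18,))
        except ValueError:
            pass  # unparseable output is treated as "not satisfied"

    results["git"] = shutil.which("git") is not None
    return results


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing or too old'}")
```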

Highlighted Details

  • Ingest-Time Synthesis: Compiles knowledge into a persistent wiki graph, actively catching contradictions instead of blending them.
  • Local-First & Open Format: Outputs are plain Markdown files, ensuring data ownership, no vendor lock-in, and compatibility with standard editors.
  • Contradiction Detection: Surfaces disagreements between sources, flagging pages for review or auto-resolution.
  • Autonomous Self-Optimization: Features include automatic cross-linking, orphan page detection, and scaffold regeneration for wiki accuracy.
  • Extensibility: Supports custom skills via plug-ins and hooks for CI/CD integration.
  • Provider Flexibility: Integrates with numerous LLM providers (free/paid) and local models, including coding tool providers without API keys.

Maintenance & Community

The provided README does not detail specific community channels (e.g., Discord, Slack) or notable contributors beyond the repository owner. A CONTRIBUTING.md file suggests a framework for community involvement.

Licensing & Compatibility

The overall project license is not explicitly stated. While components like BaseSkill and LLMProvider are Apache-2.0 licensed, the top-level license requires clarification for commercial use or integration into closed-source projects.

Limitations & Caveats

The project is marked as "Document version: v0.4.0 (in progress)", indicating active development and potential for ongoing changes. The lack of a clearly defined top-level project license is a significant caveat for adoption. Functionality relies on obtaining and configuring at least one LLM API key, unless specific coding tool providers are utilized.

Health Check

  • Last Commit: 1 day ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 49
  • Issues (30d): 1
  • Star History: 279 stars in the last 30 days
