llm-wiki  by Pratiyush

LLM-powered knowledge base from coding sessions

Created 1 month ago
259 stars

Top 97.7% on SourcePulse

Summary

llm-wiki turns unused LLM session transcripts from tools such as Claude Code, Codex CLI, Copilot, Cursor, and Gemini into a searchable, interlinked knowledge base. It produces both a human-readable static website and machine-readable exports for AI agents, so that past AI sessions can be searched and reused.

How It Works

This project implements Andrej Karpathy's LLM Wiki pattern, converting raw session .jsonl files into a multi-layered wiki structure. It first ingests sessions into immutable markdown (raw/), then generates LLM-enhanced wiki pages (wiki/) for sources, entities, and concepts, interlinked via [[wikilinks]]. Finally, it compiles a static HTML site (site/) with global search and AI-consumable exports, leveraging Python's standard library for core functionality.
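The [[wikilinks]] step can be illustrated with a minimal sketch. This is not the project's code: the function names extract_links and build_backlinks, the regular expression, and support for the [[Target|label]] alias form are all assumptions made for illustration, using only the Python standard library.

```python
import re
from collections import defaultdict

# Matches [[Target]] and [[Target|label]].
# Assumption: the alias form follows the common Obsidian-style convention.
WIKILINK = re.compile(r"\[\[([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")


def extract_links(markdown: str) -> list[str]:
    """Return wikilink targets in order of first appearance, without duplicates."""
    seen: list[str] = []
    for match in WIKILINK.finditer(markdown):
        target = match.group(1).strip()
        if target and target not in seen:
            seen.append(target)
    return seen


def build_backlinks(pages: dict[str, str]) -> dict[str, list[str]]:
    """Map each link target to the (hypothetical) page names that link to it."""
    backlinks: dict[str, list[str]] = defaultdict(list)
    for name, body in pages.items():
        for target in extract_links(body):
            backlinks[target].append(name)
    return dict(backlinks)
```

Given a mapping of page names to markdown bodies, build_backlinks returns the reverse index a static site generator could use to render a "linked from" list on each page.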

Quick Start & Requirements

  • Install: Run ./setup.sh (macOS/Linux) or setup.bat (Windows) for a one-time setup, or pip install -e . for basic installation.
  • Run: Execute ./build.sh && ./serve.sh to build the static site and start a local development server at http://127.0.0.1:8765.
  • Prerequisites: Core functionality needs only the Python standard library. Optional extras add graph generation ([graph]) and end-to-end testing ([e2e]). Syntax highlighting is loaded from a CDN.
  • Links: Live demo: pratiyush.github.io/llm-wiki. Full documentation: docs/index.md.

Highlighted Details

  • AI-Consumable Exports: Generates machine-readable formats including llms.txt, llms-full.txt, graph.jsonld, per-page .txt/.json siblings, sitemap.xml, rss.xml, and robots.txt.
  • Obsidian Integration: Works with Obsidian vaults, supporting Dataview dashboards and Templater templates for local knowledge management.
  • Quality & Governance: Employs a 4-factor confidence scoring system, a 5-state content lifecycle, and 16 linting rules (including LLM-powered checks for contradictions and summary accuracy).
  • MCP Server: Includes a built-in MCP server allowing direct querying of the wiki from various AI clients.
  • Privacy-Focused: Features default redaction of sensitive data (API keys, tokens, usernames) and binds the local server to localhost.
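
As an illustration of the default redaction described above, here is a minimal sketch of pattern-based scrubbing. The patterns, placeholder strings, and the redact function are assumptions for illustration only; the project's actual rules are not shown in this summary and may differ.

```python
import re

# Illustrative patterns only (assumptions, not the project's real rule set).
PATTERNS = [
    # API keys in the common "sk-..." style.
    (re.compile(r"sk-[A-Za-z0-9_-]{20,}"), "[REDACTED_API_KEY]"),
    # Bearer tokens in HTTP headers; the scheme word is kept.
    (re.compile(r"(?i)\b(bearer)\s+[A-Za-z0-9._~+/=-]{16,}"), r"\1 [REDACTED_TOKEN]"),
    # Usernames embedded in home-directory paths.
    (re.compile(r"/(Users|home)/[^/\s]+"), r"/\1/[REDACTED_USER]"),
]


def redact(text: str) -> str:
    """Apply every pattern in order and return the scrubbed text."""
    for pattern, replacement in PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

Running redaction at ingest time, before anything is written to raw/, would keep secrets out of every later layer; whether the project does it at that stage is not stated here.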

Maintenance & Community

The project has a clear release history with detailed milestones and versioning, indicating active development. Contribution guidelines are available in CONTRIBUTING.md. No explicit community channels (e.g., Discord, Slack) are listed.

Licensing & Compatibility

The project is released under the permissive MIT license, which allows commercial use and integration into closed-source projects, provided the copyright and license notice are retained.

Limitations & Caveats

Several LLM adapters, including Cursor, Gemini CLI, Copilot Chat, and Copilot CLI, are marked as Beta and require verification against current session formats. The core functionality relies on specific .jsonl transcript formats from supported agents.
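
Because the adapters depend on each agent's .jsonl format, a tolerant reader is one way to cope with format drift. The sketch below is an assumption, not the project's adapter code: read_jsonl is a hypothetical helper, and the "role" and "content" fields in the usage note are invented for illustration.

```python
import json


def read_jsonl(text: str) -> tuple[list[dict], int]:
    """Parse JSON Lines text; return (records, number of skipped lines).

    Blank lines are ignored. Lines that are not valid JSON, or that are
    valid JSON but not an object, are counted as skipped instead of
    aborting the whole session.
    """
    records: list[dict] = []
    skipped = 0
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            skipped += 1
            continue
        if isinstance(obj, dict):
            records.append(obj)
        else:
            skipped += 1
    return records, skipped
```

For example, a transcript containing two objects with hypothetical "role" and "content" keys, one line of plain text, and one bare JSON array would yield two records and a skipped count of two. A non-zero count is a cheap signal that an adapter needs re-verifying against the current session format.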

Health Check

  • Last Commit: 1 day ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 117
  • Issues (30d): 211
  • Star History: 109 stars in the last 30 days
