atomicmemory: Knowledge compiler for persistent, interlinked wikis from raw sources
Top 90.6% on SourcePulse
This project compiles raw text sources into an interlinked Markdown wiki, inspired by the LLM Wiki pattern. It addresses the issue of knowledge being lost or re-discovered at query time by creating a persistent, browsable artifact that compounds over time. Aimed at AI researchers, engineers building knowledge bases, and technical writers, it offers a way to build a structured, evolving knowledge base that complements traditional RAG approaches.
How It Works
The system employs a two-phase pipeline. Phase 1 ingests sources, performs SHA-256 hash checks for incremental updates, and uses an LLM for concept extraction. Phase 2 generates wiki pages, resolves [[wikilink]]s, and creates an index.md. This approach eliminates order dependence, catches failures early, merges shared concepts, and ensures only changed sources are re-processed by the LLM. Queries can be saved (--save), with their answers becoming new wiki pages that enrich the knowledge base for future queries.
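The two-phase flow described above can be sketched in a few lines. This is an illustrative Python sketch, not the project's actual code (the tool itself is a Node/npm CLI): the hash-based skip check, wikilink resolution, and index generation mirror the pipeline described, while the cache structure and function names are assumptions for the example.

```python
import hashlib
import re

def source_digest(text: str) -> str:
    """Phase 1: fingerprint each source so unchanged files skip the LLM."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def needs_reprocessing(text: str, cache: dict, name: str) -> bool:
    """Return True only when a source is new or its content changed."""
    digest = source_digest(text)
    if cache.get(name) == digest:
        return False  # unchanged: reuse previously extracted concepts
    cache[name] = digest
    return True       # changed or new: send to the LLM for extraction

def resolve_wikilinks(page: str, titles: set) -> str:
    """Phase 2: keep [[links]] whose target page exists; flatten the rest."""
    def repl(m):
        target = m.group(1)
        return f"[[{target}]]" if target in titles else target
    return re.sub(r"\[\[([^\]]+)\]\]", repl, page)

def build_index(titles: set) -> str:
    """Emit an index.md body linking every generated page."""
    lines = ["# Index", ""]
    lines += [f"- [[{t}]]" for t in sorted(titles)]
    return "\n".join(lines)
```

Because the digest check runs before any LLM call, re-running the compiler over a mostly unchanged corpus only pays for the sources that actually changed.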
Quick Start & Requirements
- Install: npm install -g llm-wiki-compiler
- Set your Anthropic API key: export ANTHROPIC_API_KEY=sk-...
- Core commands: llmwiki ingest <url|file>, llmwiki compile, llmwiki query "question" [--save]
- See examples/basic/ in the repository for pre-generated output.

Highlighted Details
- Answers saved with the --save flag are added as new wiki pages, enhancing the knowledge base for subsequent queries.
- Pages use [[wikilinks]] that resolve to concept titles, making the output compatible with Obsidian.
- ^[filename.md] markers link generated content back to its original sources.

Maintenance & Community
The project welcomes issues and PRs. The roadmap includes improved provenance, linting, multi-provider LLM support, semantic search, and agent integration. No specific community channels (e.g., Discord, Slack) are listed.
Licensing & Compatibility
The project is licensed under the MIT license, which is generally permissive for commercial use and integration into closed-source projects.
Limitations & Caveats
This is early-stage software, best suited for small, high-signal corpora (up to a few dozen sources). It currently supports only Anthropic models. Sources exceeding token limits are truncated during ingest, with indicators provided in the frontmatter. Image support and Marp slide generation are not yet implemented.
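The truncation behavior above can be sketched as follows. This is a minimal, hypothetical illustration (not the tool's implementation): it approximates tokens by word count and records the cut in frontmatter-style metadata, where the field names are assumptions for the example.

```python
def truncate_for_ingest(text: str, token_budget: int):
    """Trim a source to a budget and flag the cut in metadata.

    Approximates tokens as whitespace-separated words; the metadata
    field names ("truncated", "original_words") are hypothetical.
    """
    words = text.split()
    meta = {"truncated": False}
    if len(words) <= token_budget:
        return text, meta
    meta["truncated"] = True
    meta["original_words"] = len(words)  # record what was lost
    return " ".join(words[:token_budget]), meta
```

A caller can inspect the returned metadata to surface the truncation indicator in a page's frontmatter, so readers know the compiled concepts came from a partial source.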