llm-wiki-skill by sdyckjq-lab

Agent skill for continuous knowledge base construction

Created 3 months ago

2,108 stars

Top 20.5% on SourcePulse

Project Summary

This project provides a system for building personal knowledge bases, following Andrej Karpathy's llm-wiki methodology. It turns fragmented information into a continuously maintained, interconnected wiki and is designed for agents such as Claude Code, Codex, and OpenClaw. The main benefit is that the agent builds a persistent knowledge repository that is compiled once and updated over time, rather than re-processing raw documents for each query.

How It Works

The skill integrates with several AI agents, letting them automatically process and organize diverse information sources into a structured wiki. Its core idea is a "compile once, maintain continuously" approach. Each source is ingested, routed by source type (web pages, PDFs, text, social media), and processed with an extraction tool suited to that type. The output is a structured knowledge base of entities, topics, and bidirectional links, stored locally as Markdown files compatible with tools like Obsidian.
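The routing step described above can be pictured with a minimal Python sketch. It is an illustration only, not the skill's actual code: the function name `route_source`, the `DOMAIN_ROUTES` table, and the "pdf" and "text" labels are hypothetical. The three extractor labels reuse the names of the open-source components the project says it builds on, but which domain maps to which tool is an assumption here.

```python
from pathlib import Path
from urllib.parse import urlparse

# Hypothetical routing table: the domain-to-tool pairing is an assumption,
# not taken from the project's source.
DOMAIN_ROUTES = {
    "youtube.com": "youtube-transcript",
    "youtu.be": "youtube-transcript",
    "mp.weixin.qq.com": "wechat-article-to-markdown",
}


def route_source(source: str) -> str:
    """Return a label for the extraction path a source would take."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https"):
        host = (parsed.hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        # Match the exact domain or any of its subdomains.
        for domain, extractor in DOMAIN_ROUTES.items():
            if host == domain or host.endswith("." + domain):
                return extractor
        # Assumed default for ordinary web pages.
        return "baoyu-url-to-markdown"
    # Anything that is not a web URL is treated as a local file path.
    if Path(source).suffix.lower() == ".pdf":
        return "pdf"
    return "text"
```

Under these assumptions, `route_source("https://www.youtube.com/watch?v=abc")` returns `"youtube-transcript"`, while a local path such as `notes/todo.txt` falls through to `"text"`.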

Quick Start & Requirements

Primary Install/Run: The recommended installation method is to provide the repository URL directly to your agent. Alternatively, clone the repository and execute install.sh --platform <platform> (e.g., claude, codex, openclaw). A legacy setup.sh script is available for older Claude installations.
Prerequisites: The agent must be capable of executing shell commands. For automatic web content extraction, Chrome needs to be launched in debug mode. Extraction of WeChat articles or YouTube subtitles requires uv or npm to be installed.
Default Install Location: Skills are typically installed in ~/.claude/skills/llm-wiki, ~/.codex/skills/llm-wiki, or ~/.openclaw/skills/llm-wiki.
Links: Platform-specific documentation: Claude Code (platforms/claude/CLAUDE.md), Codex (platforms/codex/AGENTS.md), OpenClaw (platforms/openclaw/README.md).

Highlighted Details

Zero-configuration initialization creates knowledge bases with auto-generated structures and templates.
Intelligent material routing automatically selects the optimal extraction method based on URL domain.
Content is processed in tiers: long articles are fully organized, while short content is simplified to avoid redundancy.
Supports batch processing of all files within a specified folder.
Generates structured wiki content, including entity pages, topic pages, and summaries, interconnected via [[bidirectional links]].
Includes a knowledge base health check to detect isolated pages, broken links, and contradictory information.
Outputs are fully compatible with Obsidian, using local Markdown files.
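Two of the health-check items listed above, broken links and isolated pages, can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the skill's implementation: `check_wiki` is a hypothetical name, pages are assumed to be `.md` files addressed by file name without extension, and "isolated" is taken to mean a page with no inbound links from other pages. Detecting contradictory information needs semantic comparison and is not attempted here.

```python
import re
from pathlib import Path

# Captures the target of [[Target]], [[Target|alias]] or [[Target#heading]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def check_wiki(root: str) -> dict:
    """Scan a folder of Markdown pages for broken links and isolated pages."""
    pages = {p.stem: p for p in Path(root).rglob("*.md")}
    inbound = {name: 0 for name in pages}
    broken = []
    for name, path in pages.items():
        for target in WIKILINK.findall(path.read_text(encoding="utf-8")):
            target = target.strip()
            if target in pages:
                if target != name:  # a self-link does not count as inbound
                    inbound[target] += 1
            else:
                broken.append((name, target))
    isolated = sorted(n for n, count in inbound.items() if count == 0)
    return {"broken": sorted(broken), "isolated": isolated}
```

Calling `check_wiki("path/to/wiki")` returns a dictionary with a `broken` list of `(page, missing_target)` pairs and an `isolated` list of page names.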

Maintenance & Community

This project integrates and reuses several open-source components, including baoyu-url-to-markdown, youtube-transcript, and wechat-article-to-markdown. No specific details regarding core maintainers, sponsorships, or community channels (like Discord/Slack) are provided in the README.

Licensing & Compatibility

The project is released under the MIT license. This permissive license allows commercial use and integration into closed-source projects, provided the copyright and license notice are retained.

Limitations & Caveats

Automatic extraction for certain platforms, such as X/Twitter and WeChat, may fail; in that case, use Chrome in debug mode or fall back to pasting the content manually. Some sources, such as Xiaohongshu, are currently supported only via manual pasting. The skill depends on the agent's ability to execute shell commands.
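The fallback behavior described above amounts to trying automatic extractors in order and asking for pasted content only when all of them fail. A minimal Python sketch of that pattern follows; `extract_with_fallback`, the `Extractor` type, and the `manual_paste` callback are hypothetical names for illustration and do not come from the project.

```python
from typing import Callable, Iterable, Optional

# An extractor takes a URL and returns Markdown text, or None/"" on failure.
Extractor = Callable[[str], Optional[str]]


def extract_with_fallback(
    url: str,
    extractors: Iterable[Extractor],
    manual_paste: Callable[[str], str],
) -> str:
    """Try each automatic extractor in turn, then fall back to manual input."""
    for extractor in extractors:
        try:
            text = extractor(url)
        except Exception:
            # Treat any extractor error as "this method did not work".
            continue
        if text and text.strip():
            return text
    # Every automatic method failed or returned nothing usable.
    return manual_paste(url)
```

A source that only supports manual pasting is simply the case where the list of extractors is empty.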

Health Check

Last Commit: 1 day ago
Responsiveness: Inactive
Star History: 266 stars in the last 30 days