MetaClaw by aiming-lab

Agent learning and evolution through conversation

Created 4 days ago

607 stars

Top 54.0% on SourcePulse

Project Summary

MetaClaw addresses the challenge of continuous agent learning and skill evolution by automating the process of turning live conversations into training data. It targets engineers and researchers seeking to enhance LLM agent capabilities without requiring dedicated GPU infrastructure. The primary benefit is simplified, automated agent evolution and skill integration.

How It Works

MetaClaw operates via an OpenAI-compatible proxy that intercepts LLM interactions. At each turn, it injects relevant skills into the agent's system prompt for immediate behavioral improvement. In skills_only mode, conversations are automatically summarized into new skills post-session. For advanced learning, the rl mode leverages Tinker Cloud RL for continuous fine-tuning using implicit feedback signals, while On-Policy Distillation (OPD) allows distilling knowledge from a teacher model. This decoupled, asynchronous architecture enables seamless weight updates without interrupting service.
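The per-turn injection step described above can be sketched as follows. This is a minimal illustration, not MetaClaw's actual implementation: the skill record format, the keyword-overlap retrieval heuristic, and the function names are all assumptions.

```python
# Hypothetical sketch of per-turn skill injection, assuming skills are stored
# as {"name": ..., "instruction": ...} records and retrieved by word overlap.
def retrieve_skills(skills, user_message, top_k=2):
    """Rank stored skills by word overlap with the incoming user message."""
    words = set(user_message.lower().split())
    scored = sorted(
        skills,
        key=lambda s: len(words & set(s["instruction"].lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def inject_skills(messages, skills):
    """Prepend retrieved skills as a system prompt before forwarding upstream."""
    user_turns = [m["content"] for m in messages if m["role"] == "user"]
    relevant = retrieve_skills(skills, user_turns[-1]) if user_turns else []
    skill_text = "\n".join(f"- {s['name']}: {s['instruction']}" for s in relevant)
    system = {"role": "system", "content": f"Learned skills:\n{skill_text}"}
    return [system] + [m for m in messages if m["role"] != "system"]
```

Because the proxy speaks the OpenAI-compatible chat format, a step like this can rewrite the `messages` array in flight without the downstream agent or upstream provider noticing.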

Quick Start & Requirements

Installation is straightforward: pip install -e . for core functionality, or pip install -e ".[rl]" for RL training support. A one-time metaclaw setup wizard configures LLM providers (Kimi, Qwen, OpenAI, custom), API keys, and RL options. Running metaclaw start launches the proxy and integrates with OpenClaw. The skills_only mode requires only a network connection; RL training offloads to Tinker Cloud. Prerequisites include Python and an OpenAI-compatible LLM API endpoint.

Highlighted Details

  • One-Click Deployment: Simplified setup and start commands (metaclaw setup, metaclaw start).
  • Dynamic Skill Injection: Relevant skills are retrieved and injected into system prompts at each interaction turn.
  • Automatic Skill Evolution: Conversations are analyzed to automatically distill new skills without manual curation.
  • No GPU Requirement: skills_only mode operates without local GPU hardware; RL training is cloud-based.
  • Dual Learning Modes: Supports Reinforcement Learning (RL) via GRPO and On-Policy Distillation (OPD) for teacher-student model training.
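The "Automatic Skill Evolution" step above can be sketched as a post-session summarization pass. The summarizer callback, record format, and deduplication rule below are hypothetical stand-ins for whatever MetaClaw actually does, assuming the summary takes a "name: instruction" shape.

```python
# Hypothetical post-session distillation pass (skills_only mode). In MetaClaw
# the summarizer would be an LLM call; any callable works for illustration.
def distill_skill(conversation, summarize):
    """Summarize a finished conversation into a reusable skill record."""
    transcript = "\n".join(f"{m['role']}: {m['content']}" for m in conversation)
    instruction = summarize(transcript)  # assumed "name: instruction" format
    return {
        "name": instruction.split(":", 1)[0].strip().lower(),
        "instruction": instruction,
    }

def add_skill(skill_store, skill):
    """Append the new skill unless an identically named one already exists."""
    if all(s["name"] != skill["name"] for s in skill_store):
        skill_store.append(skill)
    return skill_store
```

Running this after each session is what removes the need for manual curation: the skill store grows from live conversations, and the injection step picks from it on the next turn.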

Maintenance & Community

The provided README does not detail specific contributors, community channels (e.g., Discord, Slack), or a public roadmap.

Licensing & Compatibility

This project is licensed under the permissive MIT License, allowing for broad compatibility with commercial use and closed-source applications. It integrates with any OpenAI-compatible LLM API.

Limitations & Caveats

Advanced features like RL training and OPD require specific configurations, including API keys for external services like Tinker Cloud and access to a teacher model endpoint for OPD. The project appears relatively new, with recent updates in March 2026.

Health Check

  • Last Commit: 20 hours ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 2
  • Issues (30d): 2
  • Star History: 726 stars in the last 4 days
