SkillRL by aiming-lab

Recursive skill-augmented reinforcement learning for evolving LLM agents

Created 5 months ago

884 stars

Top 39.9% on SourcePulse

1 Expert Loves This Project

Wing Lian

Founder of Axolotl AI

Project Summary

SkillRL is a framework that lets LLM agents learn high-level, reusable behavioral patterns from past experience. Instead of storing raw trajectories, as traditional memory-based methods do, it abstracts them into a hierarchical skill library, which the project presents as a more efficient and effective way to improve agent policies. It targets researchers and developers working on reinforcement learning agents; the stated benefits are higher reasoning utility and a smaller memory footprint.

How It Works

SkillRL employs "Experience-based Skill Distillation" to transform successful trajectories into strategic patterns and failed ones into concise lessons. These are organized within a "Hierarchical SKILLBANK" that separates General Skills, which give broad guidance, from Task-Specific Skills, which hold category-level heuristics. A "Recursive Skill Evolution" mechanism analyzes validation failures during reinforcement learning so that the skill library co-evolves with the agent's policy. The project reports improved context efficiency and enhanced reasoning utility as a result.
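Since the codebase is not yet released, the following is only a minimal Python sketch of how a two-level skill library like the one described above might be represented. The names `Skill`, `SkillBank`, `add`, `retrieve`, and `to_prompt`, as well as the example skills and the `web_shopping` category, are hypothetical and are not taken from the SkillRL code.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Skill:
    # All field names are illustrative assumptions, not SkillRL's schema.
    name: str
    guidance: str  # short natural-language instruction for the agent
    source: str    # "success" (strategic pattern) or "failure" (lesson)


@dataclass
class SkillBank:
    # General skills apply to every task; task-specific skills are keyed
    # by a task category.
    general: list = field(default_factory=list)
    task_specific: dict = field(default_factory=dict)

    def add(self, skill: Skill, category: Optional[str] = None) -> None:
        """Store a skill as general (no category) or under a task category."""
        if category is None:
            self.general.append(skill)
        else:
            self.task_specific.setdefault(category, []).append(skill)

    def retrieve(self, category: str) -> list:
        """Return general skills followed by the skills for this category."""
        return self.general + self.task_specific.get(category, [])

    def to_prompt(self, category: str) -> str:
        """Render the retrieved skills as a compact block of prompt text."""
        return "\n".join(
            f"- [{s.source}] {s.name}: {s.guidance}"
            for s in self.retrieve(category)
        )


if __name__ == "__main__":
    bank = SkillBank()
    bank.add(Skill("verify-before-submit",
                   "Re-check the goal state before finishing.", "success"))
    bank.add(Skill("avoid-repeat-search",
                   "Do not reissue an identical query.", "failure"),
             category="web_shopping")
    print(bank.to_prompt("web_shopping"))
```

In this sketch, only the short guidance strings reach the agent's context rather than full trajectories, which is the intuition behind the context-efficiency claim.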

Quick Start & Requirements

The codebase is currently being prepared for public release, with "Getting Started" instructions noted as "Coming Soon." No specific installation commands, prerequisites, or estimated setup times are available at this time.

Highlighted Details

Experience-based Skill Distillation: Transforms raw trajectories into reusable strategic patterns and failure lessons.
Hierarchical SKILLBANK: Organizes knowledge into General Skills and Task-Specific Skills for structured learning.
Recursive Skill Evolution: Dynamically evolves the skill library alongside the agent's policy by analyzing validation failures.
Context Efficiency: Achieves 10-20% token compression compared to raw trajectory storage, improving reasoning utility.
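
The evolution step in the list above could look roughly like the following Python sketch. It is an illustration under stated assumptions, not the project's implementation: the function `evolve_skills`, the `min_failures` threshold, the failure record keys (`category`, `trajectory`), and the `distill` callback (which would be an LLM call in practice) are all hypothetical.

```python
from collections import defaultdict
from typing import Callable


def evolve_skills(
    bank: dict,
    failures: list,
    distill: Callable[[str, list], str],
    min_failures: int = 2,
) -> dict:
    """Return a copy of `bank` extended with lessons from validation failures.

    bank:     maps a task category to a list of skill strings.
    failures: list of {"category": ..., "trajectory": ...} records.
    distill:  hypothetical callback that summarizes failed trajectories of
              one category into a single lesson string.
    """
    # Group failed trajectories by task category.
    grouped = defaultdict(list)
    for record in failures:
        grouped[record["category"]].append(record["trajectory"])

    # Copy the bank so the caller's version is left unchanged.
    updated = {cat: list(skills) for cat, skills in bank.items()}

    for cat, trajectories in grouped.items():
        # Assumed heuristic: skip categories with too few failures.
        if len(trajectories) < min_failures:
            continue
        lesson = distill(cat, trajectories)
        skills = updated.setdefault(cat, [])
        if lesson not in skills:  # avoid storing duplicate lessons
            skills.append(lesson)
    return updated


if __name__ == "__main__":
    # Stand-in for an LLM-based distiller.
    def stub_distill(category, trajectories):
        return f"lesson for {category} from {len(trajectories)} failures"

    bank = {"cooking": ["Check the fridge first."]}
    failures = [
        {"category": "cooking", "trajectory": "t1"},
        {"category": "cooking", "trajectory": "t2"},
        {"category": "cleaning", "trajectory": "t3"},
    ]
    print(evolve_skills(bank, failures, stub_distill))
```

Calling a function like this after each validation round during training is one way the library could grow alongside the policy; how SkillRL actually schedules and filters these updates is not described in this summary.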

Maintenance & Community

No specific details regarding maintenance, community channels (like Discord/Slack), or notable contributors are provided in the README.

Licensing & Compatibility

The README does not specify a license type or any compatibility notes for commercial use or closed-source linking.

Limitations & Caveats

The primary limitation is that the codebase has not yet been publicly released and detailed setup instructions are still pending. The framework's practical performance and stability in real-world scenarios beyond those presented in the paper have yet to be evaluated by the community.

Health Check

Last Commit

1 month ago

Responsiveness

Inactive

Star History

56 stars in the last 30 days