darwin-skill by alchaincyf

Agent Skill optimization via autonomous evolution

Created 2 weeks ago

1,918 stars

Top 22.3% on SourcePulse

Project Summary

Autonomous skill optimization for agent systems: inspired by Andrej Karpathy's autoresearch, alchaincyf/darwin-skill addresses the challenge of managing and improving a growing collection of agent skills. It targets users of Claude Code and other skills.sh-compatible platforms, offering a system that automatically evaluates, refines, and retains only measurably improved skills, preventing degradation over time.

How It Works

This project adapts the autoresearch paradigm to skill optimization: each SKILL.md file is treated as a program to be optimized in an autonomous loop. The system employs a dual evaluation approach: static analysis scores structural quality (60 points), while runtime tests assess actual performance (40 points), for a combined 100-point score. A core "ratchet" mechanism ensures that only changes leading to a quantifiable improvement are kept; regressions are automatically reverted via Git, so the skill's score never decreases. Scoring is performed by a separate sub-agent to mitigate self-evaluation bias.
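The ratchet loop can be sketched as a toy illustration. Everything below is an invented stand-in under stated assumptions, not the project's actual implementation: the scorer checks only heading presence, the runtime score is stubbed, and discarding an edit plays the role of the Git revert.

```python
STATIC_MAX, RUNTIME_MAX = 60, 40  # point split from the project description

def score(skill_md: str) -> int:
    """Toy scorer standing in for darwin-skill's 8-dimensional evaluation."""
    # Static portion: reward the presence of headings (illustrative only).
    static = 30 * sum(h in skill_md for h in ("# ", "## Usage"))
    # Runtime portion: stubbed as a fixed pass rate for this sketch.
    runtime = 20
    return min(static, STATIC_MAX) + min(runtime, RUNTIME_MAX)

def ratchet(skill_md: str, edits) -> tuple:
    """Apply candidate edits in order; keep each only if the score strictly
    improves, otherwise discard it (the analogue of reverting via Git)."""
    best = score(skill_md)
    for edit in edits:
        candidate = edit(skill_md)
        s = score(candidate)
        if s > best:               # ratchet: the kept score only goes up
            skill_md, best = candidate, s
    return skill_md, best

# Usage: one improving edit, one regression that gets discarded.
doc = "# my-skill\n"
edits = [
    lambda t: t + "## Usage\n",             # accepted (+30 static points)
    lambda t: t.replace("## Usage\n", ""),  # rejected (score would drop)
]
final_doc, final_score = ratchet(doc, edits)
```

The key property is monotonicity: because a candidate is kept only when its score strictly exceeds the best so far, the retained skill's score forms a non-decreasing sequence across the loop.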

Quick Start & Requirements

  • Installation: Use npx skills add alchaincyf/darwin-skill within a compatible agent environment. Alternatively, download the darwin-skill.zip archive, extract it, and place the SKILL.md file into ~/.claude/skills/darwin-skill/.
  • Prerequisites: Requires an agent environment supporting the SKILL.md format, such as Claude Code, Codex, OpenClaw, Trae, or CodeBuddy. No specific hardware or non-standard software dependencies are listed.

Highlighted Details

  • 8-Dimensional Evaluation System: Combines static structural analysis (60 points) with runtime performance testing (40 points) across eight dimensions.
  • Ratchet Mechanism: Guarantees progress by automatically reverting any changes that do not result in a measurable score improvement.
  • Human-in-the-Loop Confirmation: The optimization process pauses at the end of each skill's cycle, requiring user confirmation before proceeding to the next skill.
  • Autoresearch Inspiration: Directly applies the core principle of retaining only verifiable improvements, mirroring Karpathy's autoresearch methodology.

Maintenance & Community

The project is maintained by Huashu (@AlchainHust), with links provided to personal websites and social media channels. A related project, alchaincyf/nuwa-skill, is mentioned for skill creation. No direct community channels like Discord or Slack are listed.

Licensing & Compatibility

The project is released under the MIT License, permitting broad use, modification, and distribution, including for commercial purposes and integration into closed-source applications. It is compatible with the skills.sh ecosystem.

Limitations & Caveats

The system is not fully autonomous due to the mandatory "human-in-the-loop" confirmation required between optimization phases for each skill. Its effectiveness is contingent on the quality and comprehensiveness of the provided test prompts (test-prompts.json) and the scoring agents.
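This dependence on test prompts can be made concrete with a hedged sketch. The prompt schema below (`prompt` / `expect_substring` pairs) and the `run_skill` callback are assumptions for illustration, not the documented format of test-prompts.json or the project's real harness:

```python
RUNTIME_MAX = 40  # runtime weight from the project description

def runtime_score(prompts: list, run_skill) -> float:
    """Scale the fraction of passing test prompts to the 40-point
    runtime budget. `run_skill(prompt) -> str` stands in for actually
    invoking the agent with the skill loaded."""
    if not prompts:
        return 0.0  # no tests: effectiveness cannot be measured
    passed = sum(
        p["expect_substring"] in run_skill(p["prompt"]) for p in prompts
    )
    return RUNTIME_MAX * passed / len(prompts)

# Usage with a fake runner (hypothetical prompts, not from the repo).
prompts = [
    {"prompt": "summarize X", "expect_substring": "summary"},
    {"prompt": "list steps", "expect_substring": "1."},
]
fake_run = lambda p: "summary: ..." if "summarize" in p else "steps"
demo_score = runtime_score(prompts, fake_run)
```

The caveat in the text follows directly: a sparse or unrepresentative prompt set yields a runtime score that says little about real effectiveness, yet the ratchet will still optimize toward it.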

Health Check

  • Last Commit: 1 week ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 1
  • Issues (30d): 1
  • Star History: 1,928 stars in the last 17 days
