S1s-Z/Ctx2Skill: Autonomous skill discovery for LLM context learning
Top 91.8% on SourcePulse
Ctx2Skill is a self-evolving framework designed to enhance language models' ability to learn from complex, out-of-distribution contexts. It autonomously discovers, refines, and selects context-specific, natural-language skills without requiring human annotation or external feedback. This framework addresses the prohibitive cost of manual skill annotation for dense technical documents and the lack of feedback in automated skill construction, enabling LLMs to improve their context learning capabilities at inference time.
How It Works
The core of Ctx2Skill is a multi-agent self-play loop involving five distinct, frozen LM agent roles: Challenger, Reasoner, Judge, Proposer, and Generator. This adversarial loop allows the Challenger to generate probing tasks and rubrics, while the Reasoner attempts to solve them using evolving skill sets. The Judge provides verdicts, and the Proposer/Generator agents synthesize and materialize skill updates based on success and failure patterns. To prevent adversarial collapse and ensure generalization, a Cross-Time Replay mechanism collects representative tasks and re-evaluates historical skill sets, selecting the one that maximizes performance across both hard and easy probes.
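The selection step of Cross-Time Replay can be illustrated with a minimal Python sketch. It is not the project's implementation: the `Probe` type, the `solves` callback (a stand-in for "the Reasoner answers with a given skill set and the Judge returns a verdict"), and the equal weighting of hard and easy pass rates are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Probe:
    """A replayed task. The 'hard'/'easy' labels are hypothetical."""
    task: str
    difficulty: str


# Hypothetical stand-in for Reasoner + Judge: True if the task is solved
# when the Reasoner is given this skill set.
Solver = Callable[[Sequence[str], Probe], bool]


def pass_rate(skills: Sequence[str], probes: Sequence[Probe], solves: Solver) -> float:
    """Fraction of probes solved with the given skill set (0.0 if no probes)."""
    if not probes:
        return 0.0
    return sum(solves(skills, p) for p in probes) / len(probes)


def select_skill_set(
    history: Sequence[List[str]],
    probes: Sequence[Probe],
    solves: Solver,
) -> Tuple[int, List[str]]:
    """Re-evaluate every historical skill set and return the best one.

    Assumption: the score is the unweighted mean of the hard and easy
    pass rates; the source only says both kinds of probe are considered.
    """
    hard = [p for p in probes if p.difficulty == "hard"]
    easy = [p for p in probes if p.difficulty == "easy"]

    def score(skills: Sequence[str]) -> float:
        return 0.5 * pass_rate(skills, hard, solves) + 0.5 * pass_rate(skills, easy, solves)

    best_index = max(range(len(history)), key=lambda i: score(history[i]))
    return best_index, history[best_index]


if __name__ == "__main__":
    # Toy data: a probe counts as solved when its task name is in the skill set.
    history = [["a"], ["a", "b"], ["b", "c"]]
    probes = [Probe("a", "easy"), Probe("b", "hard"), Probe("c", "hard")]
    index, skills = select_skill_set(history, probes, lambda s, p: p.task in s)
    print(index, skills)
```

Scoring each snapshot on both probe groups is what keeps a skill set that has over-fitted to the latest hard tasks from being chosen if it has regressed on easier ones.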
Quick Start & Requirements
Clone the repository (git clone https://github.com/S1s-Z/Ctx2Skill.git) and navigate into the directory.
Place the data files (CL-bench-context-dedup.jsonl, CL-bench-with-task-delimiter.jsonl) in the project root. Evaluation logs and responses are also available.
Set the API environment variables (OPENAI_BASE_URL, OPENAI_API_KEY) and run python selfplay_loop.py with specified model configurations and parameters.
Run python infer.py with discovered skills for augmentation.
Run python eval_ignore_none.py to assess performance.
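Before starting a run, the prerequisites listed above can be checked with a short Python helper. This is a hedged sketch, not part of the repository: `missing_prerequisites` is a hypothetical function, and it assumes the three scripts sit in the project root next to the data files. It does not launch the scripts, since their command-line parameters are not given here.

```python
import os
from pathlib import Path
from typing import List, Mapping, Optional

# Names taken from the setup steps above.
REQUIRED_ENV = ("OPENAI_BASE_URL", "OPENAI_API_KEY")
REQUIRED_FILES = ("CL-bench-context-dedup.jsonl", "CL-bench-with-task-delimiter.jsonl")
# Assumption: the scripts live in the project root.
PIPELINE = ("selfplay_loop.py", "infer.py", "eval_ignore_none.py")


def missing_prerequisites(root: str, env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    root_path = Path(root)
    problems = [
        f"environment variable {name} is not set"
        for name in REQUIRED_ENV
        if not env.get(name)
    ]
    problems += [
        f"file {name} not found in {root_path}"
        for name in REQUIRED_FILES + PIPELINE
        if not (root_path / name).is_file()
    ]
    return problems


if __name__ == "__main__":
    # Check the current directory against the current environment.
    for problem in missing_prerequisites("."):
        print(problem)
```

Running it from the cloned directory prints one line per missing variable or file and prints nothing when everything is in place.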
Maintenance & Community
No specific details regarding maintainers, community channels (e.g., Discord, Slack), or roadmaps are provided in the README.
Licensing & Compatibility
This project is released under the MIT License, which is permissive for commercial use and integration into closed-source projects.
Limitations & Caveats
The README suggests that GPT-5.2 yielded the most consistent results in early experiments, which implies that performance may vary with other models or configurations. The framework relies on OpenAI-compatible APIs, which adds an external dependency and a potential cost.