Yunjue-Agent  by YunjueTech

Agent system for open-ended tasks

Created 1 month ago
381 stars

Top 75.1% on SourcePulse

Project Summary

Yunjue Agent addresses the limitations of static AI agents in open-ended, dynamic environments by introducing an "In-Situ Self-Evolving" paradigm. It targets researchers and developers building adaptable AI systems, and it expands and adapts its capabilities continuously by learning from experience streams, without relying on static toolsets or offline training. The system aims to create AI agents with "vitality": agents capable of autonomous growth, self-correction, and self-optimization as user needs and world information evolve.

How It Works

Yunjue Agent reframes sequential task interactions as continuous experience streams, distilling short-term execution feedback into long-term, reusable capabilities. It prioritizes "tool evolution" as the primary driver for capability expansion, leveraging verifiable binary feedback (execution success/failure) from tools. This "Tool-First" principle mitigates hallucination risks and ensures stable learning, enhanced by a "Parallel Batch Evolution" strategy for efficiency. The core approach bridges static capability with on-the-fly evolution via internal feedback loops, enabling real-time adaptation and exploration without additional supervision signals.
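The loop described above can be illustrated with a minimal Python sketch. It is not the project's implementation: every name here (`Task`, `ToolLibrary`, `synthesize_tool`, `evolve`) is hypothetical, and the LLM-driven tool generation is replaced by a lookup table so the example runs on its own. It shows the three ideas in the paragraph: the library starts empty, a tool is kept only when it executes successfully (binary feedback), and tasks are processed in parallel batches against a snapshot of the library.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

Tool = Callable[[int], int]


@dataclass
class Task:
    kind: str  # the capability the task needs, e.g. "double"
    arg: int


@dataclass
class ToolLibrary:
    # Starts empty ("zero-start"); grows only with verified tools.
    tools: Dict[str, Tool] = field(default_factory=dict)


def synthesize_tool(kind: str) -> Optional[Tool]:
    """Hypothetical stand-in for LLM-driven tool generation."""
    candidates: Dict[str, Tool] = {
        "double": lambda x: x * 2,
        "square": lambda x: x * x,
    }
    return candidates.get(kind)


def run_tool(tool: Tool, arg: int) -> Tuple[bool, Optional[int]]:
    """Binary feedback: True if the tool executed without raising."""
    try:
        return True, tool(arg)
    except Exception:
        return False, None


def attempt(task: Task, snapshot: Dict[str, Tool]):
    """Solve one task, proposing a new tool if the snapshot lacks one."""
    tool = snapshot.get(task.kind)
    proposed = None
    if tool is None:
        tool = proposed = synthesize_tool(task.kind)
    if tool is None:
        return task, False, None, None
    ok, result = run_tool(tool, task.arg)
    # Only a tool that actually succeeded is offered to the library.
    return task, ok, result, proposed if ok else None


def evolve(library: ToolLibrary, tasks: List[Task], batch_size: int) -> List[Optional[int]]:
    """Process tasks batch by batch; merge verified tools after each batch."""
    results: List[Optional[int]] = []
    for start in range(0, len(tasks), batch_size):
        batch = tasks[start:start + batch_size]
        snapshot = dict(library.tools)  # every task in a batch sees the same library
        with ThreadPoolExecutor() as pool:
            outcomes = list(pool.map(lambda t: attempt(t, snapshot), batch))
        for task, ok, result, proposed in outcomes:
            if ok and proposed is not None:
                library.tools.setdefault(task.kind, proposed)
            results.append(result if ok else None)
    return results
```

With this sketch, a tool proposed in one batch is reused by later batches instead of being regenerated, which is the sense in which short-term execution feedback becomes a long-term capability. A real system would also need to merge duplicate tools proposed within the same batch; here `setdefault` simply keeps the first one.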

Quick Start & Requirements

  • Prerequisites: Python 3.12 or higher, the uv package manager, and a Linux operating system.
  • Setup: Clone the repository, make install.sh executable, and run it. Users must then configure environment variables (e.g., OPENAI_API_KEY) in .env and settings in conf.yaml. Activate the virtual environment (source .venv/bin/activate) and run the evolution script, for example: ./scripts/evolve.sh --dataset DEEPSEARCHQA --run_name test --batch_size 1 --start 0.
  • Dependencies: Requires API keys for services like Codex and TAVILY, and configurable vision/summarization models.
  • Links: Official GitHub Repository, Technical Report.
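Before running the evolution script, it can help to confirm that the prerequisites listed above are in place. The following is a small illustrative pre-flight check, not something shipped with the repository. `OPENAI_API_KEY` is the variable named in the setup notes; `TAVILY_API_KEY` is an assumed name, so adjust the list to match the project's .env template. The check only sees variables exported in the current shell, not values that exist solely in the .env file.

```python
import os
import platform
import shutil
import sys
from typing import List, Mapping, Optional, Sequence

# OPENAI_API_KEY comes from the setup notes; TAVILY_API_KEY is an assumed name.
REQUIRED_KEYS = ("OPENAI_API_KEY", "TAVILY_API_KEY")


def preflight(
    env: Mapping[str, str],
    version: Sequence[int],
    system: str,
    uv_path: Optional[str],
    required_keys: Sequence[str] = REQUIRED_KEYS,
) -> List[str]:
    """Return a list of human-readable problems; an empty list means ready."""
    problems: List[str] = []
    if tuple(version[:2]) < (3, 12):
        problems.append("Python 3.12 or higher is required")
    if system != "Linux":
        problems.append("a Linux operating system is required")
    if uv_path is None:
        problems.append("uv was not found on PATH")
    for key in required_keys:
        if not env.get(key):
            problems.append(f"{key} is not set")
    return problems


def check_current_machine() -> List[str]:
    """Run the checks against the live interpreter and environment."""
    return preflight(
        os.environ,
        sys.version_info,
        platform.system(),
        shutil.which("uv"),
    )
```

Calling `check_current_machine()` from an activated virtual environment returns the list of missing prerequisites. Keeping `preflight` free of global state makes it easy to test with made-up inputs.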

Highlighted Details

  • In-situ Self-evolving Paradigm: Enables on-the-fly adaptation and exploration by reframing discrete interactions as continuous experience streams with internal feedback loops.
  • SOTA Performance from "Zero-Start": Achieves state-of-the-art results starting with an empty tool library, demonstrating significant gains over proprietary baselines (e.g., +17.4% on DeepSearchQA over Gemini 3 Pro) and securing 2nd place on the HLE leaderboard.
  • "Tool-First" Evolutionary Principle: Prioritizes tool evolution using objective binary feedback signals, ensuring stable accumulation of general primitives and mitigating hallucination risks.
  • Fully Reproducible & Open Traces: Offers a comprehensive suite including end-to-end code, benchmark scripts, versioned tool artifacts, and full interaction traces for auditable research.
  • Benchmark Performance: Evaluated on HLE, DeepSearchQA, FinSearchComp (T2&T3), xbench-ScienceQA, and xbench-DeepSearch, achieving SOTA results.

Maintenance & Community

The project is an initial release with ongoing code cleanup, and the team welcomes issues and pull requests. Direct community channels (e.g., Discord, Slack) are not specified, but contact information for inquiries and team recruitment is provided. The repository shows recent activity with multiple commits in the past week.

Licensing & Compatibility

This project is licensed under the Apache License 2.0. This license generally permits commercial use and integration into closed-source projects.

Limitations & Caveats

As an initial release refactored from research experiments, the codebase may contain minor bugs or unhandled edge cases that surface during reproduction. Continuous code refinement is underway.

Health Check

  • Last Commit: 1 week ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 4
  • Star History: 342 stars in the last 30 days
