Qwen-AgentWorld  by QwenLM

Language world models for general agents

Created 4 days ago

New!

507 stars

Top 60.7% on SourcePulse

GitHubView on GitHub
1 Expert Loves This Project
Project Summary

Qwen-AgentWorld: Language World Models for General Agents

Qwen-AgentWorld introduces native language world models (LWMs) for general agents, simulating complex environments across seven unified domains. It offers a generalizable, scalable, and controllable simulator, benefiting researchers and developers by enabling robust agent training and evaluation with zero-shot out-of-distribution (OOD) generalization capabilities.

How It Works

This project pioneers a "native world model" approach, integrating environment modeling from the initial CPT stage through SFT and RL training pipelines, rather than as a post-hoc addition. This core design allows for superior zero-shot generalization to unseen environments and controllable simulation. The model unifies seven distinct agent interaction domains—MCP, Search, Terminal, SWE, Android, Web, and OS—into a single, cohesive architecture trained on over 10 million real-world interaction trajectories.

Quick Start & Requirements

  • Primary install/run: Supports SGLang (python -m sglang.launch_server ...), vLLM (vllm serve ...), and Hugging Face Transformers (AutoModelForCausalLM.from_pretrained).
  • Prerequisites: GPU acceleration is implied for inference. pip install openai is required for evaluation. An OpenAI API key is needed for LLM judge scoring.
  • Resources: Long context lengths (256K) are supported.
  • Links: Technical Report, Blog, Hugging Face, ModelScope.

Highlighted Details

  • Qwen-AgentWorld-397B-A17B achieves the highest overall score (58.71) on the AgentWorldBench, outperforming proprietary models like GPT-5.4 (58.25).
  • Demonstrates significant performance gains (+4.3 to +12.3) in out-of-distribution environments and controllable simulation tasks through Sim RL.
  • Functions as an "Agent Foundation Model," where LWM RL warm-up effectively transfers to multi-turn, tool-calling agentic tasks across diverse benchmarks.

Maintenance & Community

Community interaction and support are available via Discord and WeChat groups, with links not directly provided in the README.

Licensing & Compatibility

Models and AgentWorldBench are licensed under Apache 2.0, permitting commercial use and integration without explicit copyleft restrictions.

Limitations & Caveats

The provided documentation does not explicitly detail known limitations, alpha status, or specific unsupported platforms or features.

Health Check
Last Commit

1 day ago

Responsiveness

Inactive

Pull Requests (30d)
1
Issues (30d)
0
Star History
509 stars in the last 4 days

Explore Similar Projects

Feedback? Help us improve.