X-Master by sjtu-sai-agents

Tool-augmented reasoning agent for complex problem-solving

Created 2 months ago
262 stars

Top 97.3% on SourcePulse

Project Summary

X-Master is a general-purpose, tool-augmented reasoning agent designed for complex problem-solving, particularly in scientific domains. It targets researchers and developers seeking to leverage AI agents that can fluidly interact with external tools and environments, enhancing their reasoning capabilities through a novel workflow. The primary benefit is an agent that emulates human research processes by strategically combining internal thought with external tool execution.

How It Works

X-Master operates by using Python code as its primary interaction language, enabling seamless communication with various environments, including Python libraries, custom tools, and self-generated code. Its core innovation lies in a "scattered-and-stacked" workflow, which strategically balances exploration breadth with reasoning depth to improve problem-solving performance. This approach allows the agent to pivot dynamically between internal deliberation and external tool utilization, mimicking a human researcher's iterative process.
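The alternation between internal deliberation and code execution described above can be illustrated with a minimal sketch. It is not taken from the X-Master codebase: the `<code>`/`<result>` delimiters, the `solve` and `run_python` helpers, and the toy model are all assumptions made for illustration, and the snippet runner does no sandboxing, whereas a real deployment would send code to a separate execution server.

```python
import contextlib
import io
import re

# Hypothetical delimiter the model uses to request code execution.
CODE_BLOCK = re.compile(r"<code>(.*?)</code>", re.DOTALL)


def run_python(source: str) -> str:
    """Execute a snippet and return its stdout (illustration only, no sandbox)."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(source, {})
    except Exception as exc:
        return f"error: {exc!r}"
    return buffer.getvalue().strip()


def solve(question: str, model, max_turns: int = 4) -> str:
    """Alternate between model reasoning and code execution until an answer appears."""
    transcript = question
    reply = ""
    for _ in range(max_turns):
        reply = model(transcript)
        transcript += "\n" + reply
        match = CODE_BLOCK.search(reply)
        if match is None:
            return reply  # no code requested: treat the reply as the final answer
        # Feed the execution result back so the next turn can reason over it.
        transcript += f"\n<result>{run_python(match.group(1))}</result>"
    return reply


def toy_model(transcript: str) -> str:
    """Stand-in for a real reasoning model (assumption, for demonstration)."""
    if "<result>" not in transcript:
        return "I should compute this. <code>print(2 ** 10)</code>"
    value = re.findall(r"<result>(.*?)</result>", transcript)[-1]
    return f"The answer is {value}."


answer = solve("What is 2 to the power 10?", toy_model)
print(answer)  # prints: The answer is 1024.
```

The design point the sketch tries to capture is that the model never calls a fixed tool API; it writes Python, and whatever that Python prints becomes the next observation.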

Quick Start & Requirements

  • Installation:
    1. Create and activate a Conda environment: conda create -n xmaster python=3.10 and conda activate xmaster.
    2. Install dependencies: pip install -r requirements.txt.
    3. Navigate to the src directory and install the package in editable mode: cd src then pip install -e . (the trailing dot refers to the current directory).
  • Prerequisites: Python 3.10, Conda. Requires local deployment of DeepSeek-R1-0528 model and a code execution server using MCP Tools. For Humanity's Last Exam (HLE) evaluation, the o3-mini API must be configured.
  • Resources: Setup involves environment configuration, model deployment, and tool setup. No time or resource estimates are provided.

Highlighted Details

  • Environment Interaction: Emulates human researchers by fluidly pivoting between internal reasoning and external tool use.
  • Code as Interaction Language: Utilizes precise Python code snippets for communication and environment interaction.
  • Workflow: Employs a "scattered-and-stacked" approach to enhance problem-solving by increasing exploration breadth and reasoning depth.
  • HLE Benchmark: Includes specific scripts for generating and evaluating solutions on the Humanity's Last Exam benchmark.
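
One possible reading of the "scattered-and-stacked" idea from the list above is sketched here. The summary only states that the workflow increases exploration breadth and reasoning depth, so everything in this block is an assumption: "scatter" is interpreted as sampling several independent candidate answers, "stack" as passing all candidates through successive refinement stages, and the final majority vote is a placeholder for whatever selection step the project actually uses. The function and parameter names are hypothetical.

```python
from collections import Counter
from typing import Callable, List

Solver = Callable[[str, int], str]              # (question, seed) -> candidate
Stage = Callable[[str, List[str]], List[str]]   # (question, candidates) -> refined


def scattered_and_stacked(
    question: str, solver: Solver, stages: List[Stage], breadth: int = 5
) -> str:
    # Scatter: independent attempts widen the exploration.
    candidates = [solver(question, seed) for seed in range(breadth)]
    # Stack: each stage sees every output of the previous one, adding depth.
    for stage in stages:
        candidates = stage(question, candidates)
    # Select: a simple majority vote stands in for a real selector.
    return Counter(candidates).most_common(1)[0][0]


def toy_solver(question: str, seed: int) -> str:
    """Stand-in solver that disagrees with itself on odd seeds."""
    return " 42 " if seed % 2 == 0 else " 41 "


def normalise(question: str, candidates: List[str]) -> List[str]:
    """Stand-in refinement stage that only tidies whitespace."""
    return [c.strip() for c in candidates]


result = scattered_and_stacked("toy question", toy_solver, [normalise])
print(result)  # prints: 42
```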

Maintenance & Community

No specific details regarding maintainers, community channels (like Discord/Slack), sponsorships, or roadmaps are provided in the README excerpt.

Licensing & Compatibility

License information is not specified in the provided text.

Limitations & Caveats

The project is described as containing "Initial codes," suggesting it may be in an early stage of development. Configuration requires local deployment of a specific model (DeepSeek-R1-0528) and a separate code execution server, which may present setup challenges. Compatibility for commercial use or integration with closed-source systems is not detailed.

Health Check

  • Last Commit: 3 weeks ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 1
  • Star History: 41 stars in the last 30 days
