code-act by xingyaoww

Research paper on executable code actions for LLM agents

created 1 year ago
1,312 stars

Top 31.2% on sourcepulse

Project Summary

This repository provides the official implementation for "Executable Code Actions Elicit Better LLM Agents," introducing CodeAct, a unified action space for LLM agents that leverages executable Python code. It aims to improve agent performance by allowing dynamic revision of actions based on execution results, targeting researchers and developers building sophisticated LLM-powered agents.

How It Works

CodeAct integrates a Python interpreter to execute code actions, enabling multi-turn interactions in which the agent dynamically revises or emits new actions based on observations such as code execution results. This consolidates the agent's actions into a single executable space, which outperforms the conventional Text and JSON action formats on benchmarks such as M3ToolEval.
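
The following is a minimal, self-contained Python sketch of the multi-turn loop described above. It is illustrative only and is not the repository's agent code: query_llm is a hypothetical stand-in for whatever chat endpoint serves the model, and exec() replaces the project's Dockerized Jupyter execution engine.

    import contextlib
    import io
    import re

    def extract_code(response):
        """Pull the first fenced Python block out of the model's reply."""
        match = re.search(r"```(?:python)?\n(.*?)```", response, re.DOTALL)
        return match.group(1) if match else None

    def run_code(code):
        """Execute a code action and capture stdout as the observation."""
        buffer = io.StringIO()
        try:
            with contextlib.redirect_stdout(buffer):
                exec(code, {})  # the real project sandboxes execution in a Docker Jupyter server
        except Exception as exc:
            return f"Error: {exc!r}"
        return buffer.getvalue()

    def codeact_loop(task, query_llm, max_turns=5):
        """Multi-turn loop: act with code, observe the result, revise, repeat."""
        messages = [{"role": "user", "content": task}]
        for _ in range(max_turns):
            reply = query_llm(messages)              # model proposes an action
            messages.append({"role": "assistant", "content": reply})
            code = extract_code(reply)
            if code is None:                         # no code action -> treat as final answer
                return reply
            observation = run_code(code)             # execute and observe
            messages.append({"role": "user", "content": "Observation:\n" + observation})
        return messages[-1]["content"]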

Quick Start & Requirements

  • LLM Serving (vLLM): Requires NVIDIA Docker. Clone the Mistral-7b model and run ./scripts/chat/start_vllm.sh.
  • LLM Serving (llama.cpp): Tested on macOS. Requires building llama.cpp and, optionally, converting the model to GGUF format. Run ./server -m <model_path>.
  • Code Execution Engine: Requires Docker. Run ./scripts/chat/code_execution/start_jupyter_server.sh 8081.
  • Interaction: Use scripts/chat/demo.py for a CLI chat, or configure chat-ui for a web interface; a minimal programmatic client sketch follows this list.
  • Dependencies: Python, Docker, NVIDIA drivers (for vLLM), git lfs.
  • Resources: Requires significant VRAM for LLM serving.
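
The bullets above map to shell scripts in the repository. For programmatic access once a server is running, here is a hedged Python sketch that assumes the vLLM script exposes an OpenAI-compatible endpoint; the port and model id below are assumptions, not values taken from the README.

    from openai import OpenAI

    # Port 8080 and the model id are assumptions -- check the output of
    # ./scripts/chat/start_vllm.sh for the values the script actually uses.
    client = OpenAI(base_url="http://localhost:8080/v1", api_key="EMPTY")

    response = client.chat.completions.create(
        model="xingyaoww/CodeActAgent-Mistral-7b-v0.1",  # assumed model id
        messages=[{"role": "user", "content": "Write Python code to compute the 10th Fibonacci number."}],
        temperature=0.0,
    )
    print(response.choices[0].message.content)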

Highlighted Details

  • Outperforms Text and JSON action formats by up to 20% higher success rate on M3ToolEval.
  • Provides CodeActInstruct, a dataset of 7k multi-turn interactions.
  • Offers CodeActAgent-Mistral-7b (32k context) and CodeActAgent-Llama-7b (4k context) models; a loading sketch follows this list.
  • Supports Kubernetes deployment for all components.
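
As a rough illustration of using the released checkpoints outside the provided serving scripts, here is a hedged sketch that loads a model with Hugging Face transformers. The model id is an assumption and should be replaced with the exact id linked from the README.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "xingyaoww/CodeActAgent-Mistral-7b-v0.1"  # assumed id; check the README's Hugging Face links
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,   # a 7B model still needs roughly 15 GB of VRAM in bf16
        device_map="auto",
    )

    prompt = "Write Python code that prints today's date."
    input_ids = tokenizer.apply_chat_template(
        [{"role": "user", "content": prompt}],
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    output = model.generate(input_ids, max_new_tokens=256)
    print(tokenizer.decode(output[0], skip_special_tokens=True))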

Maintenance & Community

The project is associated with ICML 2024. Links to Hugging Face for data and models are provided. No specific community channels (Discord/Slack) are mentioned in the README.

Licensing & Compatibility

The repository does not explicitly state a license. The models are hosted on Hugging Face, where each model typically carries its own license. Suitability for commercial use or closed-source linking is not stated.

Limitations & Caveats

The primary LLM serving method (vLLM) requires NVIDIA GPUs and Docker. While llama.cpp support is available for laptop inference, it involves model conversion and compilation steps. The project appears to be released alongside a research paper, and its long-term maintenance status is not detailed.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1+ week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 199 stars in the last 90 days
