code-act by xingyaoww

Research paper on executable code actions for LLM agents

created 1 year ago
1,312 stars

Top 31.2% on sourcepulse

Project Summary

This repository provides the official implementation for "Executable Code Actions Elicit Better LLM Agents," introducing CodeAct, a unified action space for LLM agents that leverages executable Python code. It aims to improve agent performance by allowing dynamic revision of actions based on execution results, targeting researchers and developers building sophisticated LLM-powered agents.

How It Works

CodeAct integrates a Python interpreter to execute code actions, enabling multi-turn interactions in which the agent dynamically revises or emits new actions based on observations such as code execution results. This consolidates the agent's actions into a single executable space, which outperforms the conventional Text and JSON action formats on benchmarks such as M3ToolEval.
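
The following is a minimal, self-contained Python sketch of the multi-turn loop described above. It is illustrative only and is not the repository's agent code: query_llm is a hypothetical stand-in for whatever chat endpoint serves the model, and exec() replaces the project's Dockerized Jupyter execution engine.

    import contextlib
    import io
    import re

    def extract_code(response):
        """Pull the first fenced Python block out of the model's reply."""
        match = re.search(r"```(?:python)?\n(.*?)```", response, re.DOTALL)
        return match.group(1) if match else None

    def run_code(code):
        """Execute a code action and capture stdout as the observation."""
        buffer = io.StringIO()
        try:
            with contextlib.redirect_stdout(buffer):
                exec(code, {})  # the real project sandboxes execution in a Docker Jupyter server
        except Exception as exc:
            return f"Error: {exc!r}"
        return buffer.getvalue()

    def codeact_loop(task, query_llm, max_turns=5):
        """Multi-turn loop: act with code, observe the result, revise, repeat."""
        messages = [{"role": "user", "content": task}]
        for _ in range(max_turns):
            reply = query_llm(messages)              # model proposes an action
            messages.append({"role": "assistant", "content": reply})
            code = extract_code(reply)
            if code is None:                         # no code action -> treat as final answer
                return reply
            observation = run_code(code)             # execute and observe
            messages.append({"role": "user", "content": "Observation:\n" + observation})
        return messages[-1]["content"]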

Quick Start & Requirements

  • LLM Serving (vLLM): Requires NVIDIA Docker. Clone the Mistral-7b model and run ./scripts/chat/start_vllm.sh.
  • LLM Serving (llama.cpp): Tested on macOS. Requires building llama.cpp and, optionally, converting the model to GGUF format. Run ./server -m <model_path>.
  • Code Execution Engine: Requires Docker. Run ./scripts/chat/code_execution/start_jupyter_server.sh 8081.
  • Interaction: Use scripts/chat/demo.py for a CLI chat, or configure chat-ui for a web interface; a minimal programmatic client sketch follows this list.
  • Dependencies: Python, Docker, NVIDIA drivers (for vLLM), git lfs.
  • Resources: Requires significant VRAM for LLM serving.
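
The bullets above map to shell scripts in the repository. For programmatic access once a server is running, here is a hedged Python sketch that assumes the vLLM script exposes an OpenAI-compatible endpoint; the port and model id below are assumptions, not values taken from the README.

    from openai import OpenAI

    # Port 8080 and the model id are assumptions -- check the output of
    # ./scripts/chat/start_vllm.sh for the values the script actually uses.
    client = OpenAI(base_url="http://localhost:8080/v1", api_key="EMPTY")

    response = client.chat.completions.create(
        model="xingyaoww/CodeActAgent-Mistral-7b-v0.1",  # assumed model id
        messages=[{"role": "user", "content": "Write Python code to compute the 10th Fibonacci number."}],
        temperature=0.0,
    )
    print(response.choices[0].message.content)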

Highlighted Details

  • Outperforms Text and JSON action formats by up to 20% higher success rate on M3ToolEval.
  • Provides CodeActInstruct, a dataset of 7k multi-turn interactions.
  • Offers CodeActAgent-Mistral-7b (32k context) and CodeActAgent-Llama-7b (4k context) models; a loading sketch follows this list.
  • Supports Kubernetes deployment for all components.
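
As a rough illustration of using the released checkpoints outside the provided serving scripts, here is a hedged sketch that loads a model with Hugging Face transformers. The model id is an assumption and should be replaced with the exact id linked from the README.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "xingyaoww/CodeActAgent-Mistral-7b-v0.1"  # assumed id; check the README's Hugging Face links
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,   # a 7B model still needs roughly 15 GB of VRAM in bf16
        device_map="auto",
    )

    prompt = "Write Python code that prints today's date."
    input_ids = tokenizer.apply_chat_template(
        [{"role": "user", "content": prompt}],
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    output = model.generate(input_ids, max_new_tokens=256)
    print(tokenizer.decode(output[0], skip_special_tokens=True))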

Maintenance & Community

The project is associated with ICML 2024. Links to Hugging Face for data and models are provided. No specific community channels (Discord/Slack) are mentioned in the README.

Licensing & Compatibility

The repository does not explicitly state a license. The models are hosted on Hugging Face, where each model typically carries its own license. Suitability for commercial use or closed-source linking is not stated.

Limitations & Caveats

The primary LLM serving method (vLLM) requires NVIDIA GPUs and Docker. While llama.cpp support is available for laptop inference, it involves model conversion and compilation steps. The project appears to be released alongside a research paper, and its long-term maintenance status is not detailed.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1+ week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 199 stars in the last 90 days
