agent-lightning by microsoft

Train any AI agent with rollouts and feedback

Created 6 months ago

10,208 stars

Top 5.0% on SourcePulse

View on GitHub

4 Experts Love This Project

Coauthor of AutoGen; Research Scientist at Microsoft Research

Will Brown

Research Lead at Prime Intellect

Project Summary

Agent Lightning is a server-client framework designed to train AI agents using reinforcement learning and automatic prompt optimization, enabling zero-code-change optimization for agents built with various frameworks like LangChain, AutoGen, or even custom Python code. It targets developers and researchers looking to enhance agent performance through iterative feedback loops and advanced training methodologies.

How It Works

The framework operates with a central training server that manages data, prepares samples, and provides an LLM endpoint. Agent clients retrieve these samples, process them (potentially interacting with LLMs), and return trajectories (sequences of prompts and responses). The server then uses these trajectories to compute losses and optimize the underlying language models, facilitating a continuous learning cycle.

Quick Start & Requirements

Installation: pip install agentlightning
Core Training Dependencies: PyTorch (2.7.0+cu128), FlashAttention, vLLM (0.9.2), VERL (0.5.0). A setup script scripts/setup_stable_gpu.sh is available.
Agent Frameworks: Optional installations for AutoGen, LiteLLM, MCP, UV, OpenAI Agents, LangChain, and SQL dependencies are provided.
Environment: Python 3.10+ recommended. Requires a GPU with CUDA 12.8 for core training dependencies.
Resources: Setup involves installing multiple large dependencies; estimated setup time is moderate.

Highlighted Details

Supports training of any AI agent with minimal code modifications.
Integrates with popular agent frameworks like LangChain, AutoGen, and OpenAI Agent SDK.
Implements reinforcement learning and automatic prompt optimization techniques.
Offers example use cases for calculator tool use (Calc-X) and SQL execution (Spider).

Maintenance & Community

This is a Microsoft Research project. Contribution guidelines and a Code of Conduct are provided.

Licensing & Compatibility

License: MIT License.
Compatibility: Permissive MIT license allows for commercial use and integration with closed-source projects.

Limitations & Caveats

Agent Lightning uses AgentOps for tracking by default, requiring explicit disabling if already in use. Debugging traces is experimental. Server and agent clients must run in separate processes. VERL with vLLM may encounter out-of-memory issues, necessitating frequent checkpointing and potential training resumption.

Health Check

Last Commit

2 days ago

Responsiveness

Inactive

Pull Requests (30d)

Issues (30d)

Star History

658 stars in the last 30 days