Framework for LLM-based multi-agent reinforced training and inference
Summary
MARTI is an open-source framework for training LLM-based Multi-Agent Systems (MAS) with Reinforcement Learning (RL). It addresses the scalability and context-tracking limitations of single-agent LLMs by using RL to shape structured, collaborative agent behavior. The framework targets researchers and developers who want to advance LLM reasoning capabilities and foster collective intelligence.
How It Works
MARTI combines centralized multi-agent interaction with distributed policy training: agent interactions and rewards are managed centrally, while each agent's policy is trained separately. Its core modules are the Multi-Agent World, Centralized Rewarding, and the Single Agent Trainer. This design supports scalable, adaptive workflows that pair the language capabilities of LLMs with the robustness of multi-agent systems and the adaptability of RL for complex tasks.
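To make the split between centralized interaction/rewarding and per-agent training concrete, here is a minimal, self-contained Python sketch. The class and method names echo the three modules described above but are purely illustrative assumptions, not MARTI's actual API; real agents would be LLM calls (e.g. served with vLLM) and the trainer update would be an RL step (e.g. PPO via OpenRLHF).

```python
# Illustrative sketch only: names echo the described modules (Multi-Agent World,
# Centralized Rewarding, Single Agent Trainer) but do not reflect MARTI's real API.
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Trajectory:
    agent_id: int
    messages: List[str] = field(default_factory=list)
    reward: float = 0.0


class MultiAgentWorld:
    """Centralized interaction: runs one collaborative episode over all agents."""

    def __init__(self, agents: List[Callable[[List[str]], str]]):
        self.agents = agents  # each agent maps the shared context to a reply

    def rollout(self, task: str) -> List[Trajectory]:
        context = [task]
        trajectories = []
        for agent_id, agent in enumerate(self.agents):
            reply = agent(context)            # in practice an LLM call (e.g. vLLM)
            context.append(reply)
            trajectories.append(Trajectory(agent_id, [reply]))
        return trajectories


class CentralizedRewarding:
    """Scores the joint outcome and distributes credit to individual agents."""

    def __init__(self, verifier: Callable[[str, List[Trajectory]], float]):
        self.verifier = verifier              # e.g. rule-based answer checking

    def assign(self, task: str, trajectories: List[Trajectory]) -> List[Trajectory]:
        team_score = self.verifier(task, trajectories)
        for traj in trajectories:
            traj.reward = team_score          # simplest shared-credit scheme
        return trajectories


class SingleAgentTrainer:
    """Placeholder for a per-agent RL update (e.g. a PPO step in OpenRLHF)."""

    def update(self, trajectory: Trajectory) -> None:
        print(f"agent {trajectory.agent_id}: reward={trajectory.reward:.2f}")


def train_step(world, rewarder, trainers, task):
    # Interaction and rewarding are centralized; policy updates happen per agent.
    trajectories = rewarder.assign(task, world.rollout(task))
    for traj in trajectories:
        trainers[traj.agent_id].update(traj)


if __name__ == "__main__":
    agents = [lambda ctx, i=i: f"agent-{i} reply to: {ctx[-1][:30]}" for i in range(3)]
    world = MultiAgentWorld(agents)
    rewarder = CentralizedRewarding(lambda task, trajs: 1.0)   # dummy verifier
    trainers = [SingleAgentTrainer() for _ in agents]
    train_step(world, rewarder, trainers, "Solve: 12 * 7 = ?")
```

The point of the sketch is the separation of concerns: one component owns the shared conversation and reward assignment, while each agent's policy update stays local, which is what allows the training side to scale out.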
Quick Start & Requirements
Installation requires cloning the repository (git clone https://github.com/TsinghuaC3I/MARTI.git), navigating into the directory (cd MARTI), and installing dependencies (pip install -r requirements.txt). Key prerequisites include OpenRLHF, Ray, and vLLM. Training with three Qwen2.5-3B agents requires a minimum of approximately 6x80GB GPUs. Example scripts for inference and training are provided.
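Before launching the example scripts, a short pre-flight check along the following lines can confirm that the prerequisites are importable and that enough GPUs are visible. The import names (ray, vllm, openrlhf) and the GPU-count threshold are assumptions inferred from the listed requirements, not part of MARTI itself.

```python
# Minimal pre-flight check before running the provided example scripts.
# Import names (ray, vllm, openrlhf) and the GPU threshold are assumptions
# based on the listed prerequisites and the ~6x80GB three-agent requirement.
import importlib.util

import torch

for pkg in ("ray", "vllm", "openrlhf"):
    status = "found" if importlib.util.find_spec(pkg) else "MISSING"
    print(f"{pkg}: {status}")

gpus = torch.cuda.device_count()
print(f"visible GPUs: {gpus}")
if gpus < 6:
    print("Warning: the three-agent Qwen2.5-3B training example expects ~6x80GB GPUs.")
```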
Highlighted Details
Maintenance & Community
Developed by Tsinghua University and Shanghai AI Lab, MARTI is in an early experimental stage with active development. The project welcomes collaborations to advance LLM-based multi-agent RL. Key contacts include Kaiyan Zhang and Biqing Qi.
Licensing & Compatibility
The provided README does not specify a software license. Users should verify terms for integration, especially for commercial use.
Limitations & Caveats
MARTI is at a very early experimental stage, so APIs and behavior may be unstable. While it aims to address known limitations of LLM agent systems, such as poor role adherence and unreliable inter-agent communication, these issues may still arise. Third-party integrations (AutoGen, CAMEL) and generative reward models remain experimental features.