Automated failure attribution for LLM multi-agent systems
This repository provides an implementation of automated failure attribution for LLM-based multi-agent systems: identifying which agent caused a task failure and at which step the decisive error occurred. It is aimed at researchers and developers working with complex agentic systems, offering a benchmark and dataset intended to reduce manual debugging effort and accelerate development cycles.
How It Works
The project introduces automated failure attribution methods that pinpoint the root cause of failures in multi-agent systems. It supports several judging strategies, "all-at-once," "step-by-step," and "binary search," in which a judge model analyzes a task's execution log and identifies the responsible agent and the error step. The goal is to provide fine-grained signals for debugging and for agent self-improvement.
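For illustration, the binary-search strategy can be thought of as repeatedly halving the window of log steps the judge inspects until a single step remains. The sketch below is an illustrative assumption, not the repository's code: the ask_judge callable (which would wrap a judge-model prompt over a slice of the log), the list-of-strings log format, and all names are hypothetical.

from typing import Callable, List, Tuple

def binary_search_attribution(
    steps: List[str],
    ask_judge: Callable[[List[str], int, int], bool],
) -> Tuple[int, str]:
    """Halve the search window until one step remains.

    ask_judge(steps, lo, mid) should return True if the judge (e.g., an LLM
    prompted with only steps[lo:mid]) believes the decisive error lies in
    that first half of the current window.
    """
    lo, hi = 0, len(steps)                 # current window is steps[lo:hi]
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if ask_judge(steps, lo, mid):      # error in first half: shrink right edge
            hi = mid
        else:                              # otherwise discard the first half
            lo = mid
    return lo, steps[lo]                   # index and content of the blamed step

def toy_judge(steps, lo, mid):
    # Stand-in for a judge-model call: flags the injected error keyword.
    return any("wrong API" in s for s in steps[lo:mid])

if __name__ == "__main__":
    toy_log = [
        "planner: decompose the task",
        "coder: draft the script",
        "coder: call the wrong API endpoint",   # the injected decisive error
        "executor: run the script and report failure",
    ]
    idx, step = binary_search_attribution(toy_log, toy_judge)
    print(f"blamed step {idx}: {step}")

In this sketch the judge is queried O(log n) times per failure log, each time over a shorter slice than the full transcript, in contrast to an all-at-once judgment over the entire log.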
Quick Start & Requirements
# Install dependencies
pip install -r requirements.txt

# Run a judging method (e.g., all-at-once, step-by-step, or binary search) over the failure logs
python inference.py --method <METHOD> --model <MODEL> --is_handcrafted <DATA> --directory_path <PATH>

# Score the resulting predictions against the ground-truth annotations
python evaluate.py --data_path <DATA_PATH> --eval_file <EVAL_FILE>
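To make the evaluation step concrete, the sketch below computes agent-level and step-level attribution accuracy from prediction/ground-truth pairs. The JSON record layout, field names, and file path are assumptions for illustration; evaluate.py's actual inputs and metrics may differ.

import json
from typing import Dict, List

def attribution_accuracy(records: List[Dict]) -> Dict[str, float]:
    """Fraction of failure logs where the blamed agent (and, separately,
    the blamed error step) matches the ground-truth annotation."""
    n = len(records)
    agent_hits = sum(r["pred_agent"] == r["gt_agent"] for r in records)
    step_hits = sum(r["pred_step"] == r["gt_step"] for r in records)
    return {"agent_accuracy": agent_hits / n, "step_accuracy": step_hits / n}

if __name__ == "__main__":
    # Hypothetical output file holding one record per evaluated failure log.
    with open("attribution_predictions.json") as f:
        print(attribution_accuracy(json.load(f)))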
Highlighted Details
Maintenance & Community
Last update: 2 weeks ago; activity status: inactive.
Licensing & Compatibility
Limitations & Caveats