FireAct by anchen1011

Language agent fine-tuning research paper

Created 2 years ago

288 stars

Top 91.3% on SourcePulse

View on GitHub

2 Experts Love This Project

Wing Lian

Founder of Axolotl AI

Yaowei Zheng

Author of LLaMA-Factory

Project Summary

This repository provides the code, data, and prompts for FireAct, a framework for fine-tuning language agents. It enables agents to interact with tools and execute tasks, targeting researchers and developers working on agent-based AI systems.

How It Works

FireAct leverages a ReAct-style approach, defining tools and tasks within dedicated directories. Data generation and experimentation are driven by generation.py, which orchestrates agent interactions with tools and models. This allows for systematic collection of trajectories for fine-tuning, aiming to improve agent performance on complex tasks.

Quick Start & Requirements

Install: Clone the repository and install dependencies via pip install -r requirements.txt.
Prerequisites: OpenAI API key (as OPENAI_API_KEY), SERP API key (as SERPAPI_API_KEY), Python 3.9+.
Setup: Create a Conda environment (conda create -n fireact python=3.9, conda activate fireact).
Demo Data Generation: python generation.py --task hotpotqa --backend gpt-4 --promptpath default --evaluate --random --task_split val --temperature 0 --task_end_index 5 (Note: High --task_end_index values are costly).
Fine-tuning: Example commands provided for Llama LoRA fine-tuning in finetune/llama_lora/.
Inference: Example commands for both Llama and GPT models are available.
Details: Official Docs

Highlighted Details

Offers pre-trained models based on Llama and CodeLlama families, fine-tuned with LoRA.
Provides data in Alpaca and GPT formats for supervised fine-tuning.
Codebase builds upon established projects like ReAct, Stanford Alpaca, and Alpaca-LoRA.
Includes examples for both data generation and supervised fine-tuning.

Maintenance & Community

The project is associated with the publication "FireAct: Toward Language Agent Fine-tuning".
No explicit community links (Discord/Slack) or roadmap are provided in the README.

Licensing & Compatibility

The repository does not explicitly state a license. The underlying projects (ReAct, Stanford Alpaca, Alpaca-LoRA) have varying licenses, which may impose restrictions.
Commercial use compatibility is not specified.

Limitations & Caveats

Data generation using GPT-4 and SERP API can be costly. The project does not explicitly state its license, which may impact commercial use. Community support channels are not readily available.

Health Check

Last Commit

2 years ago

Responsiveness

Inactive

Pull Requests (30d)

Issues (30d)

Star History

1 stars in the last 30 days