FireAct  by anchen1011

Language agent fine-tuning research paper

created 1 year ago
280 stars

Top 93.9% on sourcepulse

GitHubView on GitHub
Project Summary

This repository provides the code, data, and prompts for FireAct, a framework for fine-tuning language agents. It enables agents to interact with tools and execute tasks, targeting researchers and developers working on agent-based AI systems.

How It Works

FireAct leverages a ReAct-style approach, defining tools and tasks within dedicated directories. Data generation and experimentation are driven by generation.py, which orchestrates agent interactions with tools and models. This allows for systematic collection of trajectories for fine-tuning, aiming to improve agent performance on complex tasks.

Quick Start & Requirements

  • Install: Clone the repository and install dependencies via pip install -r requirements.txt.
  • Prerequisites: OpenAI API key (as OPENAI_API_KEY), SERP API key (as SERPAPI_API_KEY), Python 3.9+.
  • Setup: Create a Conda environment (conda create -n fireact python=3.9, conda activate fireact).
  • Demo Data Generation: python generation.py --task hotpotqa --backend gpt-4 --promptpath default --evaluate --random --task_split val --temperature 0 --task_end_index 5 (Note: High --task_end_index values are costly).
  • Fine-tuning: Example commands provided for Llama LoRA fine-tuning in finetune/llama_lora/.
  • Inference: Example commands for both Llama and GPT models are available.
  • Details: Official Docs

Highlighted Details

  • Offers pre-trained models based on Llama and CodeLlama families, fine-tuned with LoRA.
  • Provides data in Alpaca and GPT formats for supervised fine-tuning.
  • Codebase builds upon established projects like ReAct, Stanford Alpaca, and Alpaca-LoRA.
  • Includes examples for both data generation and supervised fine-tuning.

Maintenance & Community

  • The project is associated with the publication "FireAct: Toward Language Agent Fine-tuning".
  • No explicit community links (Discord/Slack) or roadmap are provided in the README.

Licensing & Compatibility

  • The repository does not explicitly state a license. The underlying projects (ReAct, Stanford Alpaca, Alpaca-LoRA) have varying licenses, which may impose restrictions.
  • Commercial use compatibility is not specified.

Limitations & Caveats

Data generation using GPT-4 and SERP API can be costly. The project does not explicitly state its license, which may impact commercial use. Community support channels are not readily available.

Health Check
Last commit

1 year ago

Responsiveness

Inactive

Pull Requests (30d)
0
Issues (30d)
0
Star History
7 stars in the last 90 days

Explore Similar Projects

Feedback? Help us improve.