ReAct by ysymyth

GPT-3 prompting code for ReAct research paper

Created 2 years ago · 2,873 stars · Top 16.9% on sourcepulse

Project Summary

This repository provides the code for ReAct prompting, a method that synergizes reasoning and acting in language models. It targets researchers and practitioners who want to improve LLM performance on complex tasks that require multi-step decision-making and interaction with external tools or environments. The primary benefit is more reliable task completion, because the model's reasoning is grounded in observations from its own actions.

How It Works

ReAct combines the strengths of Chain-of-Thought (CoT) prompting for reasoning with action generation for interacting with an environment. The model produces intermediate reasoning traces (as in CoT), takes actions based on those traces, observes the results, and iterates. This loop lets models query environments, search for information, or use tools, leading to more grounded and effective decision-making.

Quick Start & Requirements

  • Install the openai package.
  • Install alfworld following its own installation instructions (needed for the ALFWorld experiments).
  • Set the OPENAI_API_KEY environment variable.
  • Run experiments via the .ipynb notebooks (e.g., hotpotqa.ipynb); a minimal smoke test is sketched below.

Highlighted Details

  • Implements ReAct prompting for tasks including HotpotQA, FEVER, ALFWorld, and WebShop.
  • Benchmarks show GPT-3 (text-davinci-002) outperforming PaLM-540B on ALFWorld and HotpotQA (with a smaller sample size).
  • The paper was published at ICLR 2023.
  • Links to the arXiv paper for the detailed methodology.

Maintenance & Community

The project is associated with the ICLR 2023 paper "ReAct: Synergizing Reasoning and Acting in Language Models" by Yao et al. For further development and broader adoption, the README points to LangChain's zero-shot ReAct agent.

Licensing & Compatibility

The repository's license is not explicitly stated in the README, so suitability for commercial use or closed-source linking would need to be clarified with the authors.

Limitations & Caveats

The README notes that, due to dataset size, the HotpotQA and FEVER experiments use only 500 randomly sampled validation examples. Performance may therefore vary with different model versions or a full-dataset evaluation.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 1
  • Issues (30d): 0

Star History

  • 281 stars in the last 90 days
