opro by google-deepmind

Research paper code for LLMs as optimizers

Created 2 years ago

679 stars

Top 50.0% on SourcePulse

1 Expert Loves This Project

Shizhe Diao

Author of LMFlow; Research Scientist at NVIDIA

Project Summary

This repository provides the official code for the paper "Large Language Models as Optimizers." It lets users optimize and evaluate prompts with LLMs, and it targets researchers and developers who need better prompt performance on tasks such as instruction optimization for benchmarks and solving optimization problems. The primary benefit is a framework for systematically improving LLM prompts using other LLMs as optimizers.

How It Works

The project frames prompt optimization as a search problem: an optimizer LLM iteratively proposes refined prompts, and a scorer LLM rates each one against a scoring metric that the search tries to maximize. It supports optimizing instructions for benchmarks such as GSM8K and solving problems such as the Traveling Salesman Problem. Beyond the supported models (text-bison and GPT), custom models can be plugged in by following the interface in prompt_utils.py.
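The search loop described above can be sketched in a few lines of Python. This is a minimal illustration and not the repository's implementation: `optimize_prompt`, `toy_propose`, and `toy_score` are hypothetical names, and the two toy functions stand in for the optimizer and scorer LLM calls so that the sketch runs offline.

```python
from typing import Callable, List, Tuple

ScoredPrompt = Tuple[str, float]

# Hypothetical stand-in for a scorer LLM: rewards prompts containing these words.
KEYWORDS = ("step", "carefully", "check")


def toy_score(prompt: str) -> float:
    """Return the fraction of KEYWORDS present in the prompt (0.0 to 1.0)."""
    words = prompt.lower().split()
    return sum(k in words for k in KEYWORDS) / len(KEYWORDS)


def toy_propose(trajectory: List[ScoredPrompt]) -> List[str]:
    """Hypothetical stand-in for an optimizer LLM.

    Takes the best prompt seen so far (last item, since the trajectory is
    sorted by ascending score) and extends it with each missing keyword.
    """
    best_prompt = trajectory[-1][0]
    words = best_prompt.lower().split()
    return [f"{best_prompt} {k}" for k in KEYWORDS if k not in words]


def optimize_prompt(
    propose: Callable[[List[ScoredPrompt]], List[str]],
    score: Callable[[str], float],
    seed_prompts: List[str],
    steps: int = 5,
    keep_top: int = 4,
) -> List[ScoredPrompt]:
    """Iteratively propose and score prompts; return all results, best first."""
    history = [(p, score(p)) for p in seed_prompts]
    seen = {p for p, _ in history}
    for _ in range(steps):
        # Show the optimizer only the top-scoring prompts, in ascending order.
        trajectory = sorted(history, key=lambda item: item[1])[-keep_top:]
        for candidate in propose(trajectory):
            if candidate not in seen:
                seen.add(candidate)
                history.append((candidate, score(candidate)))
    return sorted(history, key=lambda item: item[1], reverse=True)
```

In a real run, `propose` would call the optimizer model with the scored trajectory in its prompt, and `score` would evaluate a candidate instruction on a dataset split with the scorer model.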

Quick Start & Requirements

Install: Verified to work with Python 3.10.13.
Dependencies: absl-py, google.generativeai, immutabledict, openai.
API Keys: Requires PALM_API_KEY and OPENAI_API_KEY for supported models.
Usage:
- Prompt optimization: python optimize_instructions.py --optimizer="gpt-3.5-turbo" --scorer="text-bison" --dataset="gsm8k" --task="train" --palm_api_key="<your_palm_api_key>" --openai_api_key="<your_openai_api_key>"
- Prompt evaluation: python evaluate_instructions.py --scorer="text-bison" --dataset="gsm8k" --task="test" --evaluate_test_fold=true --palm_api_key="<your_palm_api_key>"
Resources: API calls to PaLM or GPT can incur significant costs; users are advised to estimate costs or use self-served models.
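To keep API keys out of shell history, the optimization command above could be assembled from environment variables. The sketch below is an assumption-laden convenience, not part of the repository: `build_optimize_command` is a hypothetical helper, and it assumes the keys have been exported under the names PALM_API_KEY and OPENAI_API_KEY. The script name and flags are copied from the usage lines above.

```python
import os
from typing import List, Mapping, Optional

# Assumed environment variable names; adjust to match your own setup.
REQUIRED_KEYS = ("PALM_API_KEY", "OPENAI_API_KEY")


def build_optimize_command(
    env: Optional[Mapping[str, str]] = None,
    optimizer: str = "gpt-3.5-turbo",
    scorer: str = "text-bison",
    dataset: str = "gsm8k",
    task: str = "train",
) -> List[str]:
    """Build the argv list for optimize_instructions.py (hypothetical helper)."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return [
        "python",
        "optimize_instructions.py",
        f"--optimizer={optimizer}",
        f"--scorer={scorer}",
        f"--dataset={dataset}",
        f"--task={task}",
        f"--palm_api_key={env['PALM_API_KEY']}",
        f"--openai_api_key={env['OPENAI_API_KEY']}",
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the directory that contains the script. Because each run issues paid API calls, it is worth trying a small configuration first.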

Highlighted Details

Supports prompt optimization for instruction datasets (e.g., GSM8K) and combinatorial problems (e.g., TSP).
Provides a framework for integrating custom LLM optimizers and scorers.
Includes scripts for both optimizing and evaluating prompt performance.
Lets users choose the optimizer model, scorer model, dataset, and task split through command-line flags.
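For the combinatorial case, the quantity being optimized is a tour length rather than a prompt's accuracy. The sketch below shows one way such an objective could be computed for a Euclidean TSP instance; `tour_length` is a hypothetical name, and how the repository itself represents and scores tours is not described in this summary.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def tour_length(points: Sequence[Point], tour: Sequence[int]) -> float:
    """Total Euclidean length of a closed tour that visits every point once.

    `tour` is a permutation of point indices; the path returns to its start.
    """
    if sorted(tour) != list(range(len(points))):
        raise ValueError("tour must visit every point exactly once")
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )
```

Since a shorter tour is better, a search loop that maximizes a score could use the negated length as the score for each candidate tour.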

Maintenance & Community

This project is associated with Google DeepMind. The README explicitly states that it is "not an officially supported Google product." The README provides no community links or roadmap information.

Licensing & Compatibility

The README does not explicitly state a license. Users should check the repository for a LICENSE file and confirm its terms before reuse, rather than assuming any particular terms. Commercial use and linking with closed-source projects are not addressed in the README.

Limitations & Caveats

The primary caveat is the potential for high API costs when using external LLMs like GPT or PaLM for optimization and evaluation. The project is also not an officially supported Google product, which may imply limited long-term maintenance or support.

Health Check

Last Commit

1 year ago

Responsiveness

Inactive

Star History

12 stars in the last 30 days