BPO by thu-coai

Prompt optimizer for aligning LLMs without training

created 1 year ago
324 stars

Top 85.2% on sourcepulse

Project Summary

Black-Box Prompt Optimization (BPO) aligns Large Language Models (LLMs) with human preferences without retraining the model. Instead of updating weights, it optimizes the user's prompt, improving output quality and safety even for models that can only be queried as a black box.

How It Works

BPO treats alignment as a prompt rewriting problem. From human preference data (pairs of preferred and dispreferred responses to the same prompt), it trains a small seq2seq prompt optimizer that rewrites user prompts so that the downstream LLM's responses better match human preferences. Because only the prompt changes and the target model stays frozen, the approach applies to proprietary, API-only LLMs and avoids the computational cost of fine-tuning.
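
At inference time the flow is simply rewrite-then-query. The sketch below is a toy illustration of that flow, not the project's API; the stand-in functions are hypothetical:

    from typing import Callable

    def bpo_generate(user_prompt: str,
                     optimize: Callable[[str], str],
                     llm: Callable[[str], str]) -> str:
        # BPO inference flow: rewrite the prompt with the trained
        # optimizer, then query the frozen black-box LLM.
        return llm(optimize(user_prompt))

    # Toy stand-ins so the sketch runs; real usage plugs in the released
    # prompt optimizer and an API call to, e.g., gpt-3.5-turbo.
    optimize_fn = lambda p: p + " Please answer helpfully and safely."
    llm_fn = lambda p: f"<black-box response to: {p!r}>"
    print(bpo_generate("Tell me about Harry Potter", optimize_fn, llm_fn))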

Quick Start & Requirements

  • Install dependencies: pip install -r requirements.txt
  • Requires Python and PyTorch. GPU with CUDA is recommended for efficient inference and training.
  • Official demo available on Hugging Face.
  • Inference example provided in the README (a hedged sketch follows this list).
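
A minimal inference sketch using Hugging Face transformers, assuming the released optimizer is the THUDM/BPO checkpoint on Hugging Face; the instruction template below is illustrative, and the README's exact template should be preferred:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_path = "THUDM/BPO"  # assumed checkpoint name
    device = "cuda" if torch.cuda.is_available() else "cpu"

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForCausalLM.from_pretrained(model_path).eval().to(device)

    # Illustrative template -- replace with the exact one from the README.
    template = ("[INST] Improve the following prompt to get a more helpful "
                "and harmless response:\n{} [/INST]")
    inputs = tokenizer(template.format("Tell me about Harry Potter"),
                       return_tensors="pt").to(device)

    with torch.no_grad():
        output = model.generate(**inputs, max_new_tokens=256,
                                do_sample=True, top_p=0.9, temperature=0.6)

    optimized = tokenizer.decode(output[0], skip_special_tokens=True)
    optimized = optimized.split("[/INST]")[-1].strip()
    print(optimized)  # send this rewritten prompt to the black-box LLM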

Highlighted Details

  • Reported to improve win rates of API models such as GPT-3.5-turbo and Claude-2 over their unoptimized prompts, and to outperform PPO and DPO when aligning open-source models.
  • Gains are orthogonal to training-based alignment, so BPO can be stacked on top of PPO- or DPO-aligned models for further improvement.
  • Released model and dataset on Hugging Face.
  • Includes code for data construction, model training, inference, and evaluation; the data-construction idea is sketched below.
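
The data-construction idea: a strong LLM is asked to rewrite the original prompt so it would elicit the preferred response rather than the rejected one, yielding (prompt → optimized prompt) pairs for training the optimizer. The template wording and the call_llm parameter below are hypothetical, not the repository's exact code:

    from typing import Callable

    # Hypothetical rewrite instruction; the repo's data-construction
    # code defines the real template.
    REWRITE_TEMPLATE = (
        "Original prompt: {prompt}\n"
        "Preferred response: {chosen}\n"
        "Rejected response: {rejected}\n"
        "Rewrite the original prompt so a model would produce the "
        "preferred response. Output only the rewritten prompt."
    )

    def build_pair(example: dict, call_llm: Callable[[str], str]) -> dict:
        # One preference record -> one (prompt, optimized prompt)
        # training pair for the seq2seq prompt optimizer.
        optimized = call_llm(REWRITE_TEMPLATE.format(**example))
        return {"input": example["prompt"], "target": optimized}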

Maintenance & Community

  • The accompanying paper was accepted to ACL 2024.
  • The codebase acknowledges and builds on llm_finetuning, DeepSpeed-Chat, and LLaMA-Factory.

Licensing & Compatibility

  • The README does not state a license. Because the codebase incorporates code from other projects under their own licenses, those licenses may constrain how it can be used.

Limitations & Caveats

The project is research code: the README flags #TODO comments marking places that must be edited before the code will run, so it should not be assumed production-ready.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 7 stars in the last 90 days
