Black-Box-Tuning by txsun1997

Gradient-free tuning method for Language-Model-as-a-Service

Created 3 years ago
270 stars

Top 95.2% on SourcePulse

View on GitHub
Project Summary

This repository provides Black-Box Tuning (BBT) and BBTv2, gradient-free methods for few-shot learning with large language models (LLMs). It enables efficient LLM adaptation for Language-Model-as-a-Service (LMaaS) by optimizing soft prompt tokens without backpropagation, achieving performance comparable to full model tuning. It targets researchers and practitioners who need to adapt LLMs without direct gradient access, for example when a model is only reachable through an inference API.

How It Works

BBT optimizes a sequence of soft prompt tokens prepended to the LLM input. The LLM is treated as a black box: instead of backpropagation, a derivative-free optimizer (CMA-ES in this implementation) searches a low-dimensional subspace that is randomly projected into the prompt embedding space, so only inference API calls are required. BBTv2 extends this with a divide-and-conquer strategy that optimizes prompts injected at every layer of the LLM, improving performance across a range of tasks. Both methods typically reach strong few-shot performance within a limited budget of forward passes.
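
A minimal sketch of this loop, assuming CMA-ES (the cma package from the install step below) and a toy scoring function standing in for a real LMaaS API; the dimensions and projection scaling are illustrative assumptions, not the repository's exact configuration:

    import cma
    import numpy as np

    # Assumed setup for RoBERTa-large: 50 soft prompt tokens, embedding size 1024,
    # searched in a 500-dimensional subspace.
    N_TOKENS, EMB_DIM, D_LOW = 50, 1024, 500
    D_PROMPT = N_TOKENS * EMB_DIM

    # Fixed random projection: CMA-ES only searches D_LOW dimensions; the
    # projection lifts each candidate into the full prompt-embedding space.
    rng = np.random.default_rng(0)
    A = rng.normal(0.0, 1.0 / D_LOW, size=(D_PROMPT, D_LOW))

    def query_loss(prompt_embeddings):
        """Stand-in for the black-box service: a real deployment would prepend the
        soft prompt to the input, run inference, and return a dev-set loss."""
        return float(np.mean(prompt_embeddings ** 2))  # toy objective for illustration

    es = cma.CMAEvolutionStrategy(D_LOW * [0.0], 1.0, {"popsize": 20, "maxiter": 50})
    while not es.stop():
        candidates = es.ask()                          # sample a population of subspace vectors
        losses = [query_loss((A @ z).reshape(N_TOKENS, EMB_DIM)) for z in candidates]
        es.tell(candidates, losses)                    # update the search distribution

    best_prompt = (A @ es.result.xbest).reshape(N_TOKENS, EMB_DIM)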

Quick Start & Requirements

  • Install: Clone the repository and set up a conda environment:
    conda create --name bbt python=3.8
    conda activate bbt
    pip install transformers==4.1.1 fastNLP==0.6.0 datasets cma sklearn
    git clone https://github.com/txsun1997/Black-Box-Tuning
    cd Black-Box-Tuning
    
  • Prerequisites: Python 3.8, CUDA (tested with 11.4), NVIDIA GPU (tested on 3090).
  • Running BBT: bash run.sh
  • Running BBTv2: python deepbbt.py --model_name "roberta-large" --task_name "agnews" ...
  • Inference Optimization: Requires onnxruntime-gpu==1.10.0. Export the model and run with --inference_framework 'ort' (a minimal session sketch follows this list).
  • Links: ICML'2022 Paper, EMNLP'2022 Paper
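
The ONNX Runtime path amounts to replacing PyTorch forward passes with an onnxruntime-gpu session, which suits a method that never needs gradients. A rough sketch, where the model file name and input names are assumptions rather than the repository's actual export output:

    import numpy as np
    import onnxruntime as ort

    # Hypothetical file produced by an ONNX export of the backbone model.
    session = ort.InferenceSession(
        "roberta-large.onnx",
        providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
    )

    # Dummy batch; real inputs come from the tokenized few-shot data.
    feeds = {
        "input_ids": np.ones((1, 128), dtype=np.int64),
        "attention_mask": np.ones((1, 128), dtype=np.int64),
    }
    logits = session.run(None, feeds)[0]  # forward pass only, no gradients needed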

Highlighted Details

  • Supports BERT, BART, T5, GPT-2, and RoBERTa models.
  • Achieves ~2x speedup with ONNX Runtime optimization.
  • Offers parallel evaluation of candidate prompts for improved efficiency (sketched after this list).
  • Demonstrates competitive performance on various language understanding datasets.
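
Parallel evaluation can be pictured as batching an entire CMA-ES population into one forward call rather than querying the model once per candidate; a minimal sketch with illustrative names (not the repository's API):

    import numpy as np

    def evaluate_population(candidates, A, batched_forward, n_tokens=50, emb_dim=1024):
        """Score all candidate subspace vectors in a single batched call.
        batched_forward is an illustrative stand-in that maps soft prompts of shape
        (pop_size, n_tokens, emb_dim) to per-candidate losses."""
        prompts = np.stack([(A @ z).reshape(n_tokens, emb_dim) for z in candidates])
        return batched_forward(prompts).tolist()  # one forward pass for the whole population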

Maintenance & Community

The project was last updated in October 2022 with the release of BBTv2. Key contributors are listed in the papers. No active community channels (Discord/Slack) are mentioned.

Licensing & Compatibility

The repository does not explicitly state a license. The code uses libraries like transformers and fastNLP, which have their own licenses. Users should verify compatibility for commercial or closed-source use.

Limitations & Caveats

The project's last update was in late 2022, and there's no indication of ongoing maintenance or support for newer LLM architectures or techniques. Installation of onnxruntime-gpu can be challenging due to environment-specific issues.

Health Check

  • Last Commit: 2 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

1 star in the last 30 days

Explore Similar Projects

Starred by Shizhe Diao (Author of LMFlow; Research Scientist at NVIDIA), Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), and 8 more.

EAGLE by SafeAILab

10.6%
2k
Speculative decoding research paper for faster LLM inference
Created 1 year ago
Updated 1 week ago
Starred by Eric Zhu (Coauthor of AutoGen; Research Scientist at Microsoft Research), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 1 more.

textgrad by zou-group

0.7%
3k
Autograd engine for textual gradients, enabling LLM-driven optimization
Created 1 year ago
Updated 1 month ago