Black-Box-Tuning by txsun1997

Gradient-free tuning method for Language-Model-as-a-Service

created 3 years ago
270 stars

Top 95.9% on sourcepulse

Project Summary

This repository provides Black-Box Tuning (BBT) and BBTv2, gradient-free methods for few-shot learning with large language models (LLMs). By optimizing soft prompt tokens without backpropagation, they enable efficient LLM adaptation in the Language-Model-as-a-Service (LMaaS) setting while achieving performance comparable to full model tuning on few-shot tasks. The target audience is researchers and practitioners who need to adapt LLMs without direct gradient access, for example when a model is only reachable through an inference API.

How It Works

BBT optimizes a sequence of continuous (soft) prompt tokens prepended to the LLM input, treating the LLM as a black box that is accessed only through inference API calls. Instead of searching the full prompt embedding space, optimization runs in a low-dimensional subspace that is mapped into the prompt embedding space by a fixed random projection and explored with a derivative-free optimizer (CMA-ES). BBTv2 extends this with a divide-and-conquer strategy that injects and optimizes prompts at every layer of the LLM, improving performance across a range of tasks. Both methods require only forward passes and typically reach strong few-shot performance within a budget of a few thousand API calls.
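
The following is a minimal sketch of that loop, using the cma package the repository installs. It is illustrative only: query_loss stands in for the black-box inference API (the repository scores actual few-shot inputs with the served model), and the projection scale, dimensions, and CMA-ES settings here are assumptions rather than the repository's exact hyperparameters.

    # Sketch of BBT-style gradient-free soft prompt tuning (not the repository's code).
    import numpy as np
    import cma

    EMBED_DIM = 1024        # hidden size of the backbone (e.g. roberta-large)
    N_PROMPT_TOKENS = 50    # number of soft prompt tokens prepended to the input
    INTRINSIC_DIM = 500     # low-dimensional subspace that is actually optimized

    # Fixed random projection from the intrinsic subspace to the full prompt space
    # (the 1/INTRINSIC_DIM scale is illustrative, not the paper's exact choice).
    rng = np.random.default_rng(42)
    A = rng.normal(0.0, 1.0 / INTRINSIC_DIM,
                   size=(INTRINSIC_DIM, N_PROMPT_TOKENS * EMBED_DIM))

    def query_loss(prompt_embedding):
        """Placeholder for the inference API: prepend the (N_PROMPT_TOKENS x EMBED_DIM)
        soft prompt to the few-shot inputs and return a task loss. Dummy objective here."""
        return float(np.sum(prompt_embedding ** 2))

    def objective(z):
        prompt = (z @ A).reshape(N_PROMPT_TOKENS, EMBED_DIM)
        return query_loss(prompt)

    # CMA-ES over the low-dimensional vector z: only forward passes, no gradients.
    es = cma.CMAEvolutionStrategy(INTRINSIC_DIM * [0.0], 1.0,
                                  {"popsize": 20, "maxfevals": 200, "seed": 42})
    while not es.stop():
        candidates = es.ask()                       # sample a population of z vectors
        es.tell(candidates, [objective(z) for z in candidates])

    best_prompt = (es.result.xbest @ A).reshape(N_PROMPT_TOKENS, EMBED_DIM)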

Quick Start & Requirements

  • Install: Clone the repository and set up a conda environment:
    conda create --name bbt python=3.8
    conda activate bbt
    pip install transformers==4.1.1 fastNLP==0.6.0 datasets cma sklearn  # on recent pip, "sklearn" must be installed as "scikit-learn"
    git clone https://github.com/txsun1997/Black-Box-Tuning
    cd Black-Box-Tuning
    
  • Prerequisites: Python 3.8, CUDA (tested with 11.4), NVIDIA GPU (tested on 3090).
  • Running BBT: bash run.sh
  • Running BBTv2: python deepbbt.py --model_name "roberta-large" --task_name "agnews" ...
  • Inference Optimization: Requires onnxruntime-gpu==1.10.0. Export the model to ONNX and run with --inference_framework 'ort' (see the example after this list).
  • Links: ICML 2022 paper (BBT), EMNLP 2022 paper (BBTv2)
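
Putting the pieces above together, a BBTv2 run served through ONNX Runtime might look like the command below. This is a sketch based only on the flags shown in this section: whether deepbbt.py itself accepts --inference_framework is an assumption, and the task and optimization arguments elided with "..." above still need to be supplied.

    python deepbbt.py --model_name "roberta-large" --task_name "agnews" --inference_framework 'ort' ...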

Highlighted Details

  • Supports BERT, BART, T5, GPT-2, and RoBERTa models.
  • Achieves ~2x speedup with ONNX Runtime optimization.
  • Offers parallel evaluation for improved efficiency.
  • Demonstrates competitive performance on various language understanding datasets.

Maintenance & Community

The project was last updated in October 2022 with the release of BBTv2. Key contributors are listed in the papers. No active community channels (Discord/Slack) are mentioned.

Licensing & Compatibility

The repository does not explicitly state a license. The code uses libraries like transformers and fastNLP, which have their own licenses. Users should verify compatibility for commercial or closed-source use.

Limitations & Caveats

The project's last update was in late 2022, and there's no indication of ongoing maintenance or support for newer LLM architectures or techniques. Installation of onnxruntime-gpu can be challenging due to environment-specific issues.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0

Star History

  • 1 star in the last 90 days
