baize-chatbot by project-baize

Chat model trained via LoRA, using ChatGPT-generated dialogs

created 2 years ago
3,167 stars

Top 15.6% on sourcepulse

Project Summary

Baize-chatbot provides open-source chat models fine-tuned on ChatGPT-generated self-chat data, targeting researchers and developers seeking to deploy custom conversational AI. It offers parameter-efficient fine-tuning (PEFT) with LoRA, enabling training and inference on a single GPU, significantly reducing resource requirements compared to full model fine-tuning.

How It Works

Baize leverages LoRA for parameter-efficient fine-tuning of LLaMA base models. It uses 100k dialogs generated by ChatGPT conversing with itself, augmented with Alpaca's dataset. This approach allows for rapid training and deployment of capable chatbots with substantially less VRAM and time than traditional fine-tuning methods.

Quick Start & Requirements

  • Install FastChat: pip install git+https://github.com/lm-sys/FastChat.git
  • Merge LoRA weights (v1 models): python3 -m fastchat.model.apply_lora --base huggyllama/llama-7b --target ./model_weights/baize-7b --lora project-baize/baize-lora-7B
  • Run CLI: python -m fastchat.serve.cli --model-path ./model_weights/baize-7b
  • Requirements: Python 3.8+, LLaMA model access.
  • VRAM (inference / training): 7B needs 16GB / 26GB, 13B needs 28GB / 25GB, 30B needs 67GB / 42GB. INT8 quantization is available for lower-VRAM inference.
  • Docs: FastChat, Demo
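The INT8 option mentioned above trades a little accuracy for roughly 4x smaller weights. A minimal sketch of symmetric per-tensor INT8 quantization, showing the general idea rather than FastChat's exact implementation:

```python
import numpy as np

def quantize_int8(w):
    # Symmetric per-tensor quantization: map floats onto [-127, 127]
    # with a single scale factor derived from the largest magnitude.
    scale = np.abs(w).max() / 127.0
    q = np.round(w / scale).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)
q, scale = quantize_int8(w)

ratio = w.nbytes // q.nbytes          # 4 bytes -> 1 byte per weight
err = np.abs(w - dequantize(q, scale)).max()
print(ratio, err)
```

The rounding error per weight is bounded by half the scale factor, which is why INT8 inference stays close to the full-precision model while fitting in much less VRAM.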

Highlighted Details

  • Offers 7B, 13B, and 30B parameter models (v1 and v2).
  • Includes a Healthcare-specific 7B model.
  • Supports merging LoRA weights for standard Hugging Face API compatibility.
  • Provides code for data collection and preprocessing.

Maintenance & Community

  • Active development with releases in May 2023 (v2 models).
  • Integration with FastChat for CLI and API access.
  • Community models and data (e.g., Falcon fine-tuned with Baize data).

Licensing & Compatibility

  • Code: GPL-3.0.
  • Model weights and data: Research use ONLY. Commercial use is strictly prohibited.

Limitations & Caveats

Commercial use of model weights and data is strictly prohibited due to licensing. The project relies on access to LLaMA base models, which have their own usage restrictions. Training is recommended on A100-80G GPUs, though smaller GPUs can be used with reduced batch sizes.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 6 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (author of AI Engineering, Designing Machine Learning Systems), Ying Sheng (author of SGLang), and 9 more.

  • alpaca-lora by tloen: LoRA fine-tuning for LLaMA. 19k stars; created 2 years ago, updated 1 year ago.