Baichuan-7B by baichuan-inc

7B-parameter LLM for commercial use

created 2 years ago
5,688 stars

Top 9.1% on sourcepulse

View on GitHub
Project Summary

Baichuan-7B is a 7-billion-parameter, bilingual (Chinese/English) large language model developed by Baichuan Inc. It is aimed at researchers and developers working with LLMs, offers strong performance on Chinese and English benchmarks, and permits commercial use.

How It Works

Built on a Transformer architecture, Baichuan-7B was trained on 1.2 trillion tokens. It uses rotary positional embeddings (RoPE) for better length extrapolation, SwiGLU activations, and RMSNorm for normalization. Training employed optimized techniques, including Flash-Attention, operator splitting, mixed precision, and communication optimizations, achieving high throughput on A800 GPUs. Its tokenizer uses Byte-Pair Encoding, with optimizations for Chinese text and numerical data.
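
To make two of these components concrete, here is a minimal PyTorch sketch of RMSNorm and a SwiGLU feed-forward block; dimensions and names are illustrative assumptions, not taken from the Baichuan source.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class RMSNorm(nn.Module):
        """Root-mean-square norm: rescales by 1/RMS(x); no mean-centering, no bias."""
        def __init__(self, dim: int, eps: float = 1e-6):
            super().__init__()
            self.weight = nn.Parameter(torch.ones(dim))
            self.eps = eps

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.weight * x * torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps)

    class SwiGLU(nn.Module):
        """Gated feed-forward: silu(x @ W_gate) * (x @ W_up), projected back to dim."""
        def __init__(self, dim: int, hidden: int):
            super().__init__()
            self.gate = nn.Linear(dim, hidden, bias=False)
            self.up = nn.Linear(dim, hidden, bias=False)
            self.down = nn.Linear(hidden, dim, bias=False)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.down(F.silu(self.gate(x)) * self.up(x))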

Quick Start & Requirements

  • Inference: use the Hugging Face transformers API (a minimal sketch follows this list).
  • Prerequisites: Python, PyTorch, and the transformers library; a GPU is recommended for inference.
  • Training: install the dependencies in requirements.txt, prepare the data, and configure DeepSpeed.
  • Resources: model weights are available on Hugging Face and ModelScope.
  • Docs: Hugging Face, ModelScope
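
A minimal inference sketch following the pattern on the model card; trust_remote_code=True is needed because the repo ships custom model code, the prompt is just an example, and device_map="auto" assumes accelerate is installed.

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Load the tokenizer and model from the Hugging Face Hub.
    tokenizer = AutoTokenizer.from_pretrained(
        "baichuan-inc/Baichuan-7B", trust_remote_code=True
    )
    model = AutoModelForCausalLM.from_pretrained(
        "baichuan-inc/Baichuan-7B", device_map="auto", trust_remote_code=True
    )

    # Baichuan-7B is a base model, so this is plain text completion (no chat template).
    inputs = tokenizer("登鹳雀楼->王之涣\n夜雨寄北->", return_tensors="pt").to(model.device)
    pred = model.generate(**inputs, max_new_tokens=64, repetition_penalty=1.1)
    print(tokenizer.decode(pred[0], skip_special_tokens=True))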

Highlighted Details

  • Achieves top results among 7B models on the C-Eval (Chinese) and MMLU (English) benchmarks.
  • Supports a 4,096-token context window, with good extrapolation beyond 5,000 tokens.
  • The tokenizer shows better compression rates for Chinese text than LLaMA's and Falcon's (a quick check follows this list).
  • Training reached 182 TFLOPS throughput on a cluster of 1,000 A800 GPUs, at 58.3% of peak GPU utilization.
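
A quick, hedged way to eyeball the compression claim: count tokens per character on a Chinese sample (the sentence below is illustrative; a rigorous comparison would run the same text through the LLaMA and Falcon tokenizers as well).

    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(
        "baichuan-inc/Baichuan-7B", trust_remote_code=True
    )

    # Fewer tokens per character means better compression on Chinese text.
    text = "大语言模型在中文语料上的压缩率直接影响训练和推理成本。"
    ids = tokenizer(text)["input_ids"]
    print(f"{len(ids)} tokens / {len(text)} chars = {len(ids) / len(text):.2f} tokens per char")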

Maintenance & Community

  • The project has released Baichuan 2 (7B, 13B) as a successor.
  • Community channels include a WeChat group and the project's Hugging Face page.

Licensing & Compatibility

  • Source code is licensed under Apache 2.0.
  • Commercial use of the model is permitted, but requires registration and written authorization from Baichuan Inc. via opensource@baichuan-inc.com.

Limitations & Caveats

  • The README references a "Baichuan-7B Model License Agreement" governing commercial use, which may impose terms beyond the Apache 2.0 license that covers the code.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 15 stars in the last 90 days
