CPM-Bee  by OpenBMB

Bilingual base model for research/commercial use

created 2 years ago
2,434 stars

Top 19.5% on sourcepulse

Project Summary

CPM-Bee is a 10-billion-parameter bilingual (Chinese/English) foundation model. It gives developers and researchers a base to adapt to specific applications, with capabilities drawn from pre-training on trillions of high-quality tokens. The model is open-source, and commercial use is available with written authorization from the developers. The stated goal is to accelerate LLM development and adoption.

How It Works

CPM-Bee uses an auto-regressive Transformer architecture. Its key differentiator is the structured JSON data format used both during pre-training and for task adaptation. Because the task is encoded in the structure of the input rather than in free-form text, a single model can handle downstream tasks such as fill-in-the-blank, text generation, translation, and question answering, which the project presents as an advantage over unstructured text input.
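To make the structured format concrete, here is a minimal sketch of how different tasks might be expressed as JSON objects. The field names (`input`, `prompt`, `<ans>`, `<mask_0>`) and all helper functions are assumptions made for illustration; this summary does not specify the schema, so check the project's documentation for the exact keys.

```python
import json


def make_generation_example(text):
    """Text generation: the empty answer slot marks what the model should produce."""
    # "<ans>" as the answer key is an assumption, not confirmed by this summary.
    return {"input": text, "<ans>": ""}


def make_fill_in_example(text_with_masks, n_masks):
    """Fill-in-the-blank: one empty answer slot per mask placeholder in the input."""
    # "<mask_i>" placeholder names are hypothetical.
    answers = {f"<mask_{i}>": "" for i in range(n_masks)}
    return {"input": text_with_masks, "<ans>": answers}


def make_translation_example(text, target_language):
    """Translation: the task instruction travels in its own field (hypothetical key)."""
    return {
        "input": text,
        "prompt": f"Translate to {target_language}",
        "<ans>": "",
    }


def serialize(example):
    """Serialize to a JSON string, keeping Chinese characters readable."""
    return json.dumps(example, ensure_ascii=False)


if __name__ == "__main__":
    print(serialize(make_generation_example("Write a short note about bees.")))
    print(serialize(make_fill_in_example("Bees make <mask_0> from <mask_1>.", 2)))
    print(serialize(make_translation_example("今天天气很好。", "English")))
```

The point of the sketch is that every task shares one container shape, so switching tasks means changing fields rather than rewording a free-text prompt.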

Quick Start & Requirements

  • Install via pip install -r requirements.txt after cloning the repository.
  • Requires Python >= 3.7 and PyTorch >= 1.10, < 2.0.0. Ensure PyTorch version matches CUDA version.
  • Official documentation and 🤗 Transformers integration are available.
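The version constraints above can be checked before installing anything heavy. The following is a minimal sketch; the function names are hypothetical and it works on version strings only, so it does not import PyTorch itself.

```python
import re
import sys


def parse_version(version):
    """Turn a string such as '1.13.1+cu117' into a tuple like (1, 13, 1)."""
    core = version.split("+")[0]  # drop local build suffixes such as '+cu117'
    numbers = []
    for piece in core.split(".")[:3]:
        match = re.match(r"\d+", piece)  # keep only the leading digits
        numbers.append(int(match.group()) if match else 0)
    while len(numbers) < 3:  # pad '1.10' to (1, 10, 0)
        numbers.append(0)
    return tuple(numbers)


def torch_version_supported(version):
    """True if the version satisfies PyTorch >= 1.10 and < 2.0.0."""
    return (1, 10, 0) <= parse_version(version) < (2, 0, 0)


def python_version_supported(info=None):
    """True if the interpreter satisfies Python >= 3.7."""
    info = sys.version_info if info is None else info
    return tuple(info[:2]) >= (3, 7)


if __name__ == "__main__":
    print("Python OK:", python_version_supported())
    for candidate in ("1.9.1", "1.13.1+cu117", "2.0.0"):
        print(candidate, "->", torch_version_supported(candidate))
```

If PyTorch is already installed, passing `torch.__version__` to `torch_version_supported` performs the same check against the real installation.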

Highlighted Details

  • Achieves top performance on the ZeroCLUE Chinese benchmark and comparable results to LLaMA on English benchmarks.
  • Supports "Decoder Tuning" for performance enhancement without model parameter modification.
  • Available in several sizes (1B, 2B, 5B, and 10B parameters), with the smaller variants offered as compressed versions for more limited hardware.
  • Inference on consumer GPUs is feasible, with the 10B model requiring ~20GB VRAM.
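The ~20GB figure is consistent with a back-of-the-envelope estimate of weight storage alone. The sketch below assumes 16-bit (2-byte) weights and nominal parameter counts. Both are assumptions, not figures from the project, and the estimate ignores activations, caches, and framework overhead.

```python
def weight_memory_gb(n_params, bytes_per_param=2):
    """Approximate memory for model weights in GB (1 GB = 1e9 bytes).

    bytes_per_param=2 assumes 16-bit weights; use 4 for 32-bit.
    """
    return n_params * bytes_per_param / 1e9


# Nominal parameter counts for the listed model sizes (assumed round numbers).
MODEL_SIZES = {"1B": 1e9, "2B": 2e9, "5B": 5e9, "10B": 10e9}

if __name__ == "__main__":
    for name, n_params in MODEL_SIZES.items():
        print(f"{name}: ~{weight_memory_gb(n_params):.0f} GB for weights at 16-bit")
```

Under these assumptions the 10B model needs about 20 GB for its weights, so actual usage during inference will be somewhat higher.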

Maintenance & Community

  • Developed by OpenBMB.
  • Latest updates include VisCPM (multimodal) and 🤗 Transformers support.

Licensing & Compatibility

  • License: "General Model License Agreement - Source Description - Publicity Restriction - Commercial Authorization". Commercial use requires written authorization from cpm@modelbest.cn.

Limitations & Caveats

  • The model's output should be evaluated and validated by the user, as it does not represent the developers' views.
  • Commercial use requires explicit contact and authorization.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0

Star History

  • 10 stars in the last 90 days
