CPM-Bee by OpenBMB

Bilingual base model for research/commercial use

Created 2 years ago
2,427 stars

Top 19.0% on SourcePulse

Project Summary

CPM-Bee is a 10-billion-parameter, bilingual (Chinese/English) foundation language model. It is intended as a base for developers and researchers to adapt to specific applications, and was trained on trillions of high-quality tokens. The model is open-source; commercial use is permitted but requires written authorization (see Licensing & Compatibility).

How It Works

CPM-Bee uses an auto-regressive Transformer architecture. Its key differentiator is a structured JSON data format, used both during pre-training and for task adaptation. Downstream tasks such as fill-in-the-blank, text generation, translation, and question answering are all expressed in this format, which the project presents as giving more precise semantic control than unstructured text prompts.
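To make the structured-JSON idea concrete, here is a minimal sketch of how task records might be built. The key names ("input", "document", "prompt", "<ans>", "<mask_0>") follow the style of the project's published examples but should be treated as assumptions here, and the helper functions are hypothetical; check the official documentation for the exact schema.

```python
import json


def make_generation_task(text):
    """Text generation: the model is expected to fill the empty "<ans>" slot."""
    return {"input": text, "<ans>": ""}


def make_fill_task(document, n_masks):
    """Fill-in-the-blank: one empty answer slot per <mask_i> marker in the document."""
    return {
        "document": document,
        "<ans>": {f"<mask_{i}>": "" for i in range(n_masks)},
    }


def make_translation_task(text, direction):
    """Translation: a free-form "prompt" field names the task (assumed convention)."""
    return {"input": text, "prompt": direction, "<ans>": ""}


if __name__ == "__main__":
    tasks = [
        make_generation_task("NGC 6231 is an open cluster"),
        make_fill_task("The weather today is really <mask_0>.", 1),
        make_translation_task("Beijing is the capital of China.", "English to Chinese"),
    ]
    for task in tasks:
        # ensure_ascii=False keeps Chinese text readable when it appears in a record
        print(json.dumps(task, ensure_ascii=False))
```

The point of the design is that every task shares one container: the inputs are named fields and the output is a named slot, so switching tasks means changing the fields rather than rewording a prompt.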

Quick Start & Requirements

  • Clone the repository, then install dependencies with pip install -r requirements.txt.
  • Requires Python >= 3.7 and PyTorch >= 1.10, < 2.0.0. Make sure the installed PyTorch build matches your CUDA version.
  • Official documentation and 🤗 Transformers integration are available.
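The version constraints above can be checked before installing anything heavy. The following is a small sketch, not part of the project: the helper names are hypothetical, and the PyTorch version is passed in as a string so the check runs even when PyTorch is not installed.

```python
import re
import sys


def parse_version(text):
    """Turn a string like '1.13.1+cu117' into (1, 13, 1), ignoring build tags."""
    core = text.split("+", 1)[0]
    parts = []
    for piece in core.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component
        parts.append(int(match.group()))
    return tuple(parts)


def torch_version_ok(text):
    """True when the version satisfies PyTorch >= 1.10, < 2.0.0."""
    major_minor = parse_version(text)[:2]
    return (1, 10) <= major_minor < (2, 0)


def python_version_ok(info=sys.version_info):
    """True when the interpreter satisfies Python >= 3.7."""
    return tuple(info[:2]) >= (3, 7)


if __name__ == "__main__":
    print("Python OK:", python_version_ok())
    print("torch 1.13.1+cu117 OK:", torch_version_ok("1.13.1+cu117"))
    print("torch 2.0.0 OK:", torch_version_ok("2.0.0"))
```

In a real environment, the string passed to torch_version_ok would typically come from torch.__version__.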

Highlighted Details

  • Reports top performance on the ZeroCLUE Chinese benchmark and results comparable to LLaMA on English benchmarks.
  • Supports "Decoder Tuning" for performance enhancement without model parameter modification.
  • Offers models at 1B, 2B, 5B, and 10B parameters; the smaller sizes are compressed versions of the 10B model, targeting different hardware budgets.
  • Inference on consumer GPUs is feasible, with the 10B model requiring ~20GB VRAM.
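The ~20GB figure is consistent with simple back-of-the-envelope arithmetic if the weights are stored in 16-bit precision (2 bytes per parameter). That precision is an assumption, not something stated above, and the sketch below counts weights only, ignoring activations and other runtime overhead.

```python
def weight_memory_gb(num_params, bytes_per_param=2):
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes).

    bytes_per_param=2 assumes 16-bit weights (an assumption for illustration).
    """
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    for label, params in [("1B", 1e9), ("2B", 2e9), ("5B", 5e9), ("10B", 10e9)]:
        print(f"{label}: ~{weight_memory_gb(params):.0f} GB for weights")
```

Actual requirements for the smaller models may differ from these estimates, since real usage depends on sequence length, batch size, and the inference runtime.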

Maintenance & Community

  • Developed by OpenBMB.
  • Latest updates include VisCPM (multimodal) and 🤗 Transformers support.

Licensing & Compatibility

  • License: "General Model License Agreement - Source Description - Publicity Restriction - Commercial Authorization". Commercial use requires written authorization from cpm@modelbest.cn.

Limitations & Caveats

  • Model output does not represent the developers' views; users are responsible for evaluating and validating it.
  • Commercial use requires explicit contact and authorization.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 1 star in the last 30 days
