Bilingual base model for research/commercial use
CPM-Bee is a 10-billion-parameter, bilingual (Chinese/English) foundation large language model. It is intended for developers and researchers to adapt to specific applications, with base capabilities that come from pre-training on trillions of high-quality tokens. The model is fully open-source and licensed to permit commercial use, with the aim of accelerating LLM development and adoption.
How It Works
CPM-Bee uses an auto-regressive Transformer architecture. Its key differentiator is the structured JSON data format used both during pre-training and for task adaptation. Each task is expressed as named fields rather than free-form text, which is intended to give the model more precise semantic understanding and to let one format cover downstream tasks such as fill-in-the-blank, text generation, translation, and question answering. The project presents this as an advantage over traditional unstructured text input.
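To make the idea of a structured JSON task format concrete, here is a minimal sketch in Python. It only builds and serializes task records and does not call the model. The field names (`input`, `question`, `<ans>`, `<mask_0>`) and the helper functions are assumptions chosen for illustration, not something this summary confirms; check the repository's own documentation for the exact schema.

```python
import json
from typing import Dict, List


def text_generation_task(prompt: str) -> Dict:
    # Hypothetical layout: the empty "<ans>" field marks what the model should fill in.
    return {"input": prompt, "<ans>": ""}


def fill_in_blank_task(text_with_masks: str, mask_names: List[str]) -> Dict:
    # Hypothetical layout: one empty answer slot per mask placeholder in the text.
    return {"input": text_with_masks, "<ans>": {name: "" for name in mask_names}}


def question_answering_task(context: str, question: str) -> Dict:
    # Hypothetical layout: context and question live in separate named fields.
    return {"input": context, "question": question, "<ans>": ""}


def to_jsonl(records: List[Dict]) -> str:
    # One JSON object per line; ensure_ascii=False keeps Chinese text readable.
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


if __name__ == "__main__":
    records = [
        text_generation_task("Write a short greeting."),
        fill_in_blank_task("The capital of France is <mask_0>.", ["<mask_0>"]),
        question_answering_task("北京是中国的首都。", "中国的首都是哪里？"),
    ]
    print(to_jsonl(records))
```

The point of the sketch is the design choice rather than the exact keys: because inputs and expected outputs sit in separate named fields, the same record shape can describe generation, cloze, and question-answering tasks without task-specific prompt wording.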
Quick Start & Requirements
After cloning the repository, install the dependencies:
pip install -r requirements.txt
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats