Mengzi3 by Langboat

LLM for multilingual generation, especially Chinese

Created 1 year ago

1,371 stars

Top 29.3% on SourcePulse

Project Summary

Mengzi3 is a series of large language models (8B and 13B parameters) based on the Llama architecture, designed to offer strong Chinese language capabilities alongside multilingual support. It targets researchers and developers seeking a performant, commercially viable LLM for various natural language processing tasks.

How It Works

The models are trained on a diverse corpus including web pages, encyclopedias, social media, news, and high-quality open-source datasets, with continued training on trillions of tokens of multilingual data. This approach emphasizes robust Chinese language understanding and generation while maintaining broader multilingual competence.

Quick Start & Requirements

Install dependencies: pip install -r requirements.txt
Usage: Load the model with the Hugging Face transformers library; the repository includes example inference code.
Requirements: PyTorch, Hugging Face transformers. GPU recommended for inference.
Links: Hugging Face, ModelScope
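
As a rough illustration of the usage step above, the following is a minimal sketch of loading a checkpoint and generating text with transformers. It is not the repository's own example code. The repository ID `Langboat/Mengzi3-13B-Base`, the helper names `generation_kwargs` and `generate`, and the sampling defaults are assumptions made for illustration; check the project's Hugging Face page for the exact model name and recommended settings. `device_map="auto"` additionally assumes the accelerate package is installed.

```python
# Assumption: illustrative Hub repository ID; verify it on the project's
# Hugging Face page before use.
MODEL_ID = "Langboat/Mengzi3-13B-Base"


def generation_kwargs(max_new_tokens=128, temperature=0.7, top_p=0.9):
    """Build keyword arguments for model.generate().

    The defaults are illustrative, not values recommended by the project.
    A temperature of 0 switches to greedy decoding.
    """
    if max_new_tokens <= 0:
        raise ValueError("max_new_tokens must be positive")
    kwargs = {"max_new_tokens": max_new_tokens, "do_sample": temperature > 0}
    if kwargs["do_sample"]:
        # Sampling parameters are only passed when sampling is enabled.
        kwargs["temperature"] = temperature
        kwargs["top_p"] = top_p
    return kwargs


def generate(prompt, model_id=MODEL_ID, **overrides):
    """Load the model and return the decoded continuation of `prompt`."""
    # Imported lazily so generation_kwargs() works without torch installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # half precision to reduce GPU memory use
        device_map="auto",          # requires the accelerate package
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, **generation_kwargs(**overrides))
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("...")` with a prompt downloads the weights on first use, so expect a large download and a GPU with enough memory for a 13B model in half precision. If the checkpoint ships custom code, `from_pretrained` may also need `trust_remote_code=True`.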

Highlighted Details

Mengzi3-13B-Base outperforms the comparable models it was benchmarked against on MMLU (0.651) and CMMLU (0.588).
It also posts the highest OCNLI score (0.776) among the compared models.
Mengzi3.5-13B-Base improves on these results, reaching 0.776 on MMLU (up from 0.651) and 0.813 on CMMLU (up from 0.588).
Fine-tuning scripts and data format examples are available in finetune_demo.
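
To put the benchmark figures listed above in perspective, the short sketch below computes the absolute and relative gains of Mengzi3.5-13B-Base over Mengzi3-13B-Base. The scores are copied from this summary; the `gains` helper and the `SCORES` table are hypothetical names used only for this illustration.

```python
# Scores as reported in this summary (fraction of questions answered correctly).
SCORES = {
    "MMLU": {"Mengzi3-13B-Base": 0.651, "Mengzi3.5-13B-Base": 0.776},
    "CMMLU": {"Mengzi3-13B-Base": 0.588, "Mengzi3.5-13B-Base": 0.813},
}


def gains(old, new):
    """Return (absolute gain, gain relative to the old score)."""
    absolute = new - old
    return absolute, absolute / old


for benchmark, by_model in SCORES.items():
    absolute, relative = gains(
        by_model["Mengzi3-13B-Base"], by_model["Mengzi3.5-13B-Base"]
    )
    print(f"{benchmark}: +{absolute:.3f} absolute, +{relative:.1%} relative")
```

By this arithmetic the CMMLU gain (0.225 absolute) is nearly twice the MMLU gain (0.125 absolute), which fits the project's stated emphasis on Chinese-language capability.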

Maintenance & Community

Developed by Langboat.
Contact details are provided for commercial licensing and business cooperation.

Licensing & Compatibility

Licensed under Apache 2.0.
Open for academic research and free for commercial use.

Limitations & Caveats

The project disclaims responsibility for misuse, security issues, or reputational risks arising from use of the model, and stresses that users must deploy it responsibly and in compliance with applicable laws.

Health Check

Last Commit: 1 year ago
Responsiveness: Inactive

Star History

0 stars in the last 30 days