Fengshenbang-LM by IDEA-CCNL

Chinese foundation model ecosystem for AI infrastructure

Created 4 years ago

4,148 stars

Top 11.7% on SourcePulse


3 Experts Love This Project

Yaowei Zheng: Author of LLaMA-Factory
Junyang Lin: Core Maintainer at Alibaba Qwen
Shizhe Diao: Author of LMFlow; Research Scientist at NVIDIA

Project Summary

Fengshenbang-LM is an open-source ecosystem of large models developed by the IDEA Institute. It aims to serve as foundational infrastructure for Chinese AI-generated content (AIGC) and cognitive intelligence. It offers pre-trained models, fine-tuned applications, benchmarks, and datasets for researchers and developers working on Chinese NLP tasks.

How It Works

The project groups its models by task type: general-purpose (natural language understanding (NLU), generation (NLG), and transformation (NLT)), multimodal, and domain-specific. Each model is meant to act as a foundation that can be adapted to downstream tasks by fine-tuning, which typically needs far less compute than training from scratch. The ecosystem is upgraded continuously with new data and training algorithms, with the goal of a standardized, user-centric infrastructure for Chinese NLP.

Quick Start & Requirements

Installation: clone the repository, initialize its submodules (git submodule init, then git submodule update), and run pip install --editable . from the repository root. A Docker image is also provided.
Prerequisites: PyTorch, CUDA (for GPU acceleration), and, depending on the model, Hugging Face libraries. Hardware requirements vary by model.
Resources: Training large models requires substantial GPU resources. Inference with the smaller models can run on consumer-grade hardware.
Links: Fengshenbang-LM GitHub, Hugging Face Community
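
As an illustration of the inference path, here is a minimal Python sketch that loads one of the published checkpoints through the Hugging Face transformers library and scores a sentence. The checkpoint name IDEA-CCNL/Erlangshen-Roberta-110M-Sentiment and the helper names softmax and classify are assumptions made for this example; check the project's Hugging Face page for the models actually available and their recommended loading code.

```python
import math

# Assumed checkpoint name, used for illustration only; verify it on the
# project's Hugging Face page before relying on it.
MODEL_ID = "IDEA-CCNL/Erlangshen-Roberta-110M-Sentiment"


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(text, model_id=MODEL_ID):
    """Return a {label: probability} dict for one input sentence.

    torch and transformers are imported lazily so that the helper above
    can be used without them. The first call downloads the checkpoint.
    """
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()

    probs = softmax(logits)
    # Label names come from the checkpoint's own config.
    return {model.config.id2label[i]: p for i, p in enumerate(probs)}


# Example call (needs network access plus torch and transformers):
# print(classify("今天心情不好"))
```

A model of this size (about 110M parameters) is small enough to run on a CPU, which matches the note above that inference on consumer-grade hardware depends on model size.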

Highlighted Details

Offers model series such as "Ziya" (named after Jiang Ziya; general-purpose large models), "Taiyi" (multimodal, including a Chinese Stable Diffusion), and "Erlangshen" (NLU; the largest open-source Chinese BERT model at the time of its release).
Includes the "FengShen" framework for distributed training and fine-tuning, inspired by Hugging Face and Megatron-LM.
Provides command-line pipelines for prediction and fine-tuning on common NLP tasks.
Models in the suite have achieved state-of-the-art results on Chinese NLP benchmarks such as FewCLUE and ZeroCLUE.
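
The command-line pipelines mentioned above can also be driven from a script. The sketch below builds such a command in Python without running it. The executable name fengshen-pipeline, the task and mode arguments, and the --model and --text flags are assumptions recalled from the project README and should be verified there; pipeline_command is a hypothetical helper written for this example.

```python
import shlex


def pipeline_command(task, mode, **options):
    """Build an argv list of the form: fengshen-pipeline <task> <mode> --key=value ...

    The executable name and flag names are assumptions; check the README.
    """
    argv = ["fengshen-pipeline", task, mode]
    argv += [f"--{key}={value}" for key, value in options.items()]
    return argv


cmd = pipeline_command(
    "text_classification",
    "predict",
    model="IDEA-CCNL/Erlangshen-Roberta-110M-Similarity",  # assumed checkpoint
    text="今天心情不好[SEP]今天很开心",
)

# Print a shell-safe version of the command for inspection.
print(shlex.join(cmd))

# To actually run it (requires the package to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Building the argument list first and passing it to subprocess.run as a list avoids shell quoting problems with the Chinese text and the [SEP] separator.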

Maintenance & Community

The project is maintained by the IDEA Institute's CCNL team, although the most recent commit is about a year old. The team encourages community engagement through WeChat groups and recruits contributors on an ongoing basis.

Licensing & Compatibility

License: Apache License 2.0.
Compatibility: Permissive license suitable for commercial use and integration into closed-source projects.

Limitations & Caveats

The project focuses primarily on Chinese-language tasks, so its models and documentation are likely to be more mature for Chinese than for English. Because large models evolve quickly, specific model versions may become outdated.

Health Check

Last Commit: 1 year ago
Responsiveness: 1 week


Star History

4 stars in the last 30 days