tiny-llm-zh by wdndev

Small-scale Chinese LLMs for learning how large language models work

created 1 year ago
752 stars

Top 47.2% on sourcepulse

View on GitHub
Project Summary

This project implements small-parameter Chinese Large Language Models (LLMs) from scratch, aimed at engineers and researchers who want to learn LLM concepts quickly. It provides a complete pipeline from tokenization to deployment, with open-source code and data, enabling an end-to-end understanding of LLM development.

How It Works

The models follow a standard decoder-only LLM architecture, incorporating components such as RMSNorm and RoPE. Training proceeds in two main stages, pre-training (PTM) and instruction fine-tuning (SFT), with optional human-alignment stages (RLHF, DPO). The implementation builds on the Hugging Face Transformers library and uses DeepSpeed for efficient multi-GPU/multi-node training, supporting several model sizes and an optional Mixture-of-Experts (MoE) variant.
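
As a concrete illustration of one of those components, here is a minimal PyTorch sketch of RMSNorm in the style used by LLaMA-family models. The class name, epsilon default, and dtype handling are illustrative assumptions, not the project's exact code:

    import torch
    import torch.nn as nn

    class RMSNorm(nn.Module):
        """Root-mean-square layer norm (LLaMA-style): no mean-centering, no bias."""

        def __init__(self, dim: int, eps: float = 1e-6):
            super().__init__()
            self.weight = nn.Parameter(torch.ones(dim))  # learnable per-channel gain
            self.eps = eps  # illustrative default; the project's value may differ

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # Normalize by the RMS over the last dimension, computed in float32
            # for numerical stability, then cast back to the input dtype.
            variance = x.float().pow(2).mean(-1, keepdim=True)
            x_normed = x.float() * torch.rsqrt(variance + self.eps)
            return self.weight * x_normed.to(x.dtype)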

Quick Start & Requirements

  • Install: pip install -r requirements.txt
  • Prerequisites: Python 3.8+, PyTorch 2.0+, Transformers 4.37.2+, CUDA 11.4+ (recommended for training).
  • Usage: Load models via Hugging Face or ModelScope; the README provides example inference code for both (see the generic sketch after this list).
  • Docs: ModelScope, Hugging Face
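
The README's own examples name specific checkpoints; the sketch below shows only the generic Transformers loading pattern, with the model ID as a placeholder to be replaced by one of the project's published repo IDs:

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Placeholder repo ID -- substitute the actual model ID from the project's
    # Hugging Face or ModelScope page.
    model_id = "wdndev/tiny-llm-sft"

    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, trust_remote_code=True, device_map="auto"
    )

    prompt = "介绍一下北京"  # "Tell me about Beijing"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=128, do_sample=True, top_p=0.9)
    # Strip the prompt tokens and decode only the newly generated text.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    print(tokenizer.decode(new_tokens, skip_special_tokens=True))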

Highlighted Details

  • Offers models ranging from 16M to 1.5B parameters.
  • Supports vLLM and a modified llama.cpp for efficient inference (a vLLM sketch follows this list).
  • Includes a Mixture-of-Experts (MoE) variant.
  • Provides a complete pipeline: Tokenizer -> PTM -> SFT -> RLHF/DPO -> Evaluation -> Deployment.
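
For the vLLM path, offline batch inference looks roughly like this; the model ID is again a placeholder, and the sampling parameters are arbitrary choices for the sketch:

    from vllm import LLM, SamplingParams

    # Placeholder repo ID -- point this at one of the project's checkpoints.
    llm = LLM(model="wdndev/tiny-llm-sft", trust_remote_code=True)

    params = SamplingParams(temperature=0.8, top_p=0.9, max_tokens=128)
    for request_output in llm.generate(["介绍一下北京"], params):
        print(request_output.outputs[0].text)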

Maintenance & Community

The project is maintained by wdndev. Further community or roadmap details are not explicitly provided in the README.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project prioritizes demonstrating the full LLM pipeline over achieving state-of-the-art performance, so the models score lower on evaluations than production-grade counterparts and occasionally produce generation errors. The llama.cpp deployment relies on a modified fork and is recommended for Linux environments.

Health Check

  • Last commit: 11 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 143 stars in the last 90 days
