Build MiniLLM from Scratch (Pretrain + SFT + DPO)
This repository provides a practical guide and framework for building a small-scale Large Language Model (LLM) from scratch, covering pre-training, supervised fine-tuning (SFT), and Direct Preference Optimization (DPO). It targets researchers and developers interested in understanding and replicating the LLM training pipeline with manageable resources, offering reproducible results and readily usable checkpoints.
How It Works
The project leverages the bert4torch and torch4keras training frameworks, which are designed for concise and efficient LLM development. It emphasizes seamless integration with the Hugging Face transformers library for inference, optimized data loading for a reduced memory footprint, and comprehensive logging for reproducibility. The approach allows for the creation of chat-capable models with customizable attributes such as the robot's name.
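As an illustration of that transformers integration, below is a minimal sketch of single-turn inference with a converted checkpoint. The checkpoint path, prompt format, and sampling parameters are assumptions for illustration only; the repository's infer.py scripts remain the authoritative reference.

    # Minimal sketch: single-turn inference through Hugging Face transformers.
    # The model directory and prompt format are assumptions; see the repository's
    # infer.py scripts for the exact converted-checkpoint layout and prompt template.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_dir = "path/to/converted_minillm_checkpoint"  # hypothetical local path
    tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_dir, trust_remote_code=True).eval()

    prompt = "你好，请介绍一下你自己。"  # "Hello, please introduce yourself."
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=128, do_sample=True, top_p=0.9)
    reply = tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    print(reply)

The sampling settings here are arbitrary defaults; greedy decoding works equally well for a quick smoke test.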
Quick Start & Requirements
pip install git+https://github.com/Tongjilibo/torch4keras.git
pip install git+https://github.com/Tongjilibo/bert4torch.git@dev
Training is launched with torchrun, potentially disabling NCCL's InfiniBand transport for distributed runs (export NCCL_IB_DISABLE=1). For inference, use the infer.py scripts or load converted checkpoints with transformers.
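For multi-turn use, the sketch below keeps a running message history. It assumes the converted tokenizer ships a chat template (if it does not, fall back to the prompt construction in infer.py); the path and parameter values are placeholders.

    # Minimal multi-turn chat sketch via a running message history. Assumes the
    # converted tokenizer defines a chat template; otherwise reuse the prompt
    # format from the repository's infer.py scripts.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_dir = "path/to/converted_minillm_checkpoint"  # hypothetical local path
    tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_dir, trust_remote_code=True).eval()

    history = []  # alternating {"role": ..., "content": ...} messages

    def chat(user_message: str, max_new_tokens: int = 128) -> str:
        history.append({"role": "user", "content": user_message})
        input_ids = tokenizer.apply_chat_template(
            history, add_generation_prompt=True, return_tensors="pt"
        )
        with torch.no_grad():
            output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
        reply = tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True)
        history.append({"role": "assistant", "content": reply})
        return reply

    print(chat("你叫什么名字？"))  # "What is your name?" — exercises the customizable robot name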
Maintenance & Community
The project is actively maintained, with recent updates including new SFT models and multi-turn dialogue support. A WeChat group is available for community discussion (invitation required).
Licensing & Compatibility
The repository's license is not explicitly stated in the provided README. Whether it is compatible with commercial use or closed-source linking would require clarification of the specific license terms.
Limitations & Caveats
The project explicitly states that current models possess only basic chat functionality and are not capable of answering complex questions due to limitations in corpus size, model scale, and SFT data quality. The DPO stage is still under testing.