dFactory by inclusionAI

Fine-tuning framework for large language models

Created 8 months ago

262 stars

Top 97.0% on SourcePulse

1 Expert Loves This Project

Yaowei Zheng

Author of LLaMA-Factory

Summary

dFactory is a framework for fine-tuning large language models (LLMs), with a focus on Mixture-of-Experts (MoE) architectures. It targets engineers and researchers who want to customize LLMs, and it speeds up MoE computation through an optimized "merged-expert" weight format combined with built-in supervised fine-tuning methods.

How It Works

The core idea is a "merged-expert" weight format for MoE models. It consolidates the individual expert weights into single tensors, so that expert computation can run as batched matrix multiplication on GPUs, which yields substantial speedups. dFactory includes a utility (moe_convertor.py) that converts models between the standard Hugging Face "separate-expert" format and the optimized "merged-expert" format, for use in both training and inference. It supports continuous supervised fine-tuning (SFT) with block-diffusion and full attention.
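
The conversion between the two formats can be pictured with a small, dependency-free sketch. This is not dFactory's moe_convertor.py: the key layouts, the function names merge_experts and split_experts, and the use of nested lists in place of tensors are all assumptions made for illustration. It only shows the idea of stacking per-expert weights under one key and expanding them again.

```python
import re

# Hypothetical key layouts, for illustration only (real checkpoints may differ):
#   separate: "<prefix>.experts.<index>.<name>" -> one matrix per expert
#   merged:   "<prefix>.experts.<name>"         -> list of matrices, position = expert index
SEPARATE_KEY = re.compile(r"^(?P<prefix>.+\.experts)\.(?P<idx>\d+)\.(?P<name>.+)$")
MERGED_KEY = re.compile(r"^(?P<prefix>.+\.experts)\.(?P<name>\D.*)$")


def merge_experts(state):
    """Stack per-expert weights under one key per layer and weight name."""
    merged, groups = {}, {}
    for key, value in state.items():
        m = SEPARATE_KEY.match(key)
        if m is None:
            merged[key] = value  # non-expert weights pass through unchanged
        else:
            groups.setdefault((m["prefix"], m["name"]), {})[int(m["idx"])] = value
    for (prefix, name), by_idx in groups.items():
        # Assumes expert indices are contiguous from 0, so position i is expert i.
        merged[f"{prefix}.{name}"] = [by_idx[i] for i in sorted(by_idx)]
    return merged


def split_experts(state):
    """Inverse of merge_experts: expand each stack back into per-expert keys."""
    separate = {}
    for key, value in state.items():
        m = MERGED_KEY.match(key)
        if m is None:
            separate[key] = value
        else:
            for idx, weight in enumerate(value):
                separate[f"{m['prefix']}.{idx}.{m['name']}"] = weight
    return separate


# Toy checkpoint: one attention weight and two experts with 2x2 matrices.
checkpoint = {
    "layers.0.attn.q_proj.weight": [[1.0, 0.0], [0.0, 1.0]],
    "layers.0.mlp.experts.0.up_proj.weight": [[1.0, 2.0], [3.0, 4.0]],
    "layers.0.mlp.experts.1.up_proj.weight": [[5.0, 6.0], [7.0, 8.0]],
}
merged = merge_experts(checkpoint)
restored = split_experts(merged)
```

In a real framework the stacked entry would be a single tensor with a leading expert dimension, which is what lets all experts be evaluated by one batched matrix multiplication rather than a loop over experts.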

Quick Start & Requirements

Installation is recommended via uv (uv sync --extra gpu) or pip. The process involves cloning the repo, setting up the environment, downloading base model weights, and converting them to the "merged-expert" format using provided scripts (./scripts/download_hf_model.py, scripts/moe_convertor.py --mode merge). Training data requires preparation (e.g., ./scripts/build_gsm8k_dataset.py). Fine-tuning starts by modifying configuration files (e.g., configs/sft/llada2_mini_bd_sft.yaml) and running the train.sh script. A tutorial is available at https://inclusionai.github.io/dFactory/.
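
The order of the steps above can be laid out as a dry-run plan. The sketch below only prints commands; it does not execute anything. It uses only the command fragments quoted in this summary. The "python" and "bash" launchers are assumptions, script arguments (model names, input and output paths, the config passed to training) are not stated here and are left out, and the clone step is omitted because the repository URL is not given in this summary.

```python
import shlex

# Ordered workflow steps. Launchers ("python", "bash") are assumptions, and any
# script arguments not quoted in the summary are intentionally omitted.
WORKFLOW = [
    ("install dependencies", ["uv", "sync", "--extra", "gpu"]),
    ("download base weights", ["python", "./scripts/download_hf_model.py"]),
    ("convert to merged-expert format", ["python", "scripts/moe_convertor.py", "--mode", "merge"]),
    ("build training data", ["python", "./scripts/build_gsm8k_dataset.py"]),
    ("start fine-tuning", ["bash", "train.sh"]),
]


def render_plan(workflow):
    """Return one numbered, shell-quoted line per step without running anything."""
    return [
        f"{i}. {label}: {shlex.join(cmd)}"
        for i, (label, cmd) in enumerate(workflow, start=1)
    ]


plan = render_plan(WORKFLOW)
for line in plan:
    print(line)
```

Before the last step, the configuration file (e.g., configs/sft/llada2_mini_bd_sft.yaml) would be edited to point at the converted weights and the prepared dataset; the project tutorial is the authoritative source for the exact arguments each script expects.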

Highlighted Details

Supports LLaDA2.0-mini (16B) and LLaDA2.0-flash (100B) models.
Integrates continuous supervised fine-tuning (SFT) with block-diffusion and full attention.
Optimizes MoE performance via a novel "merged-expert" weight format for faster computation.

Maintenance & Community

The project is actively developed, with a roadmap including comprehensive documentation and trainable parallel decoding. No specific community channels are listed.

Licensing & Compatibility

Licensed under the Apache 2.0 license, which permits use in commercial and closed-source applications.

Limitations & Caveats

Comprehensive documentation is still in progress. Features like trainable parallel decoding are planned for future releases. The workflow requires explicit steps for converting model weights between separate and merged formats, adding complexity to setup and inference.

Health Check

Last Commit

4 months ago

Responsiveness

Inactive

Star History

4 stars in the last 30 days