Open-source toolkit for LLM distillation research
DistillKit is an open-source research toolkit for Large Language Model (LLM) distillation, developed by Arcee.AI. It gives researchers and developers practical tools for improving LLM performance and efficiency through distillation, and is aimed at users who want to enhance smaller models by transferring knowledge from larger ones.
How It Works
DistillKit offers two primary distillation methods. Logit-based distillation uses KL divergence to match the output probability distributions of a teacher and a student model, and requires the two models to share the same architecture. Hidden-states-based distillation aligns intermediate-layer representations, which allows cross-architecture distillation and provides richer guidance. A minimal sketch of both losses follows below.
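The core losses behind these two methods can be sketched in a few lines of PyTorch. The snippet below is an illustrative approximation, not DistillKit's actual implementation; the function names, the temperature default, and the optional projector are assumptions made for the example.

```python
import torch.nn.functional as F

def logit_distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between temperature-softened teacher and student
    output distributions (logit-based distillation)."""
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    # Scale by T^2 so gradient magnitudes stay comparable across temperatures.
    return F.kl_div(student_log_probs, teacher_probs,
                    reduction="batchmean") * temperature ** 2

def hidden_state_distillation_loss(student_hidden, teacher_hidden, projector=None):
    """Align intermediate-layer representations (hidden-states-based
    distillation); a learned projector can bridge differing hidden sizes
    when the teacher and student architectures do not match."""
    if projector is not None:
        student_hidden = projector(student_hidden)
    return F.mse_loss(student_hidden, teacher_hidden)
```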
Quick Start & Requirements
Run the setup script:

```bash
bash ./setup.sh
```

Or install the dependencies manually:

```bash
pip install torch wheel ninja packaging flash-attn deepspeed -r requirements.txt
```

Then launch a training run, for example logit-based distillation:

```bash
accelerate launch distil_logits.py
```
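For orientation, here is a conceptual sketch of the kind of training step a logit-distillation script performs, reusing the `logit_distillation_loss` helper from the earlier sketch. It is not DistillKit's actual code; the model names, hyperparameters, and the single-example batch are placeholders.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoints; logit-based distillation assumes the teacher and
# student share the same architecture (and therefore the same vocabulary).
teacher = AutoModelForCausalLM.from_pretrained("teacher-model")
student = AutoModelForCausalLM.from_pretrained("student-model")
tokenizer = AutoTokenizer.from_pretrained("student-model")
teacher.eval()

optimizer = torch.optim.AdamW(student.parameters(), lr=1e-5)
batch = tokenizer(["Example training text"], return_tensors="pt")
labels = batch["input_ids"]

# Teacher runs in inference mode; only the student is updated.
with torch.no_grad():
    teacher_logits = teacher(**batch).logits

student_out = student(**batch, labels=labels)
kd_loss = logit_distillation_loss(student_out.logits, teacher_logits,
                                  temperature=2.0)

# Blend the hard-label cross-entropy loss with the distillation signal.
alpha = 0.5
loss = alpha * kd_loss + (1 - alpha) * student_out.loss
loss.backward()
optimizer.step()
optimizer.zero_grad()
```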
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
Memory requirements are higher than for standard SFT. The project is actively working on scaling to support models larger than 70B parameters, and Spectrum integration is still marked as TBD pending further evaluation.