Awesome-Knowledge-Distillation-of-LLMs by Tebmer

Paper list for LLM knowledge distillation

created 1 year ago
1,115 stars

Top 35.0% on sourcepulse

Project Summary

This repository is a curated collection of research papers on Knowledge Distillation (KD) for Large Language Models (LLMs), aimed at researchers and practitioners seeking to transfer capabilities from large proprietary models to smaller ones or enable self-improvement. It provides a structured overview of KD techniques, categorized by algorithms, skill transfer, and domain-specific applications, serving as a comprehensive resource for understanding and implementing LLM distillation.

How It Works

The collection is organized around a taxonomy that breaks down KD into "Knowledge Elicitation" (extracting knowledge from teacher LLMs) and "Distillation Algorithms" (transferring knowledge to student models). It further explores "Skill Distillation" for enhancing specific cognitive abilities (e.g., reasoning, alignment) and "Verticalization Distillation" for domain-specific applications (e.g., law, medicine). This structured approach allows users to navigate the diverse landscape of KD research efficiently.
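
To make the two halves of the taxonomy concrete, the sketch below illustrates a black-box pipeline: knowledge elicitation by sampling completions from a teacher, followed by a supervised fine-tuning distillation step on the student. This is a minimal illustration assuming a Hugging Face transformers setup; the model names and prompts are placeholders, not drawn from the repository.

```python
# Minimal black-box KD sketch: elicit teacher outputs, then fine-tune the student on them.
# Model names and prompts are illustrative placeholders; both models here share a tokenizer.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

teacher_name, student_name = "gpt2-large", "gpt2"  # stand-ins for teacher/student LLMs
tok = AutoTokenizer.from_pretrained(teacher_name)
tok.pad_token = tok.eos_token
tok.padding_side = "left"  # left-pad so generation continues from the end of each prompt
teacher = AutoModelForCausalLM.from_pretrained(teacher_name).eval()
student = AutoModelForCausalLM.from_pretrained(student_name)

# 1) Knowledge elicitation: sample completions from the teacher for seed prompts.
prompts = ["Explain why the sky is blue.", "Summarize the causes of inflation."]
with torch.no_grad():
    batch = tok(prompts, return_tensors="pt", padding=True)
    gen = teacher.generate(**batch, max_new_tokens=64, do_sample=True, top_p=0.9,
                           pad_token_id=tok.eos_token_id)
demos = tok.batch_decode(gen, skip_special_tokens=True)

# 2) Distillation algorithm: supervised fine-tuning of the student on teacher outputs.
opt = torch.optim.AdamW(student.parameters(), lr=1e-5)
for text in demos:
    ids = tok(text, return_tensors="pt").input_ids
    loss = student(input_ids=ids, labels=ids).loss  # standard next-token cross-entropy
    loss.backward()
    opt.step()
    opt.zero_grad()
```

In white-box settings, the supervised step is often replaced or augmented by divergence-based objectives over the teacher's logits, which the repository files under its Distillation Algorithms branch.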

Quick Start & Requirements

This repository is a curated list of papers and does not involve direct code execution or installation. It serves as a reference guide.

Highlighted Details

  • Comprehensive taxonomy: KD is categorized into Knowledge Elicitation, Distillation Algorithms, Skill Distillation, and Verticalization Distillation.
  • Broad coverage: Includes papers on a range of KD algorithms (e.g., supervised fine-tuning, reinforcement learning, divergence minimization; see the sketch after this list) and applications across numerous domains.
  • Active updates: The README states the collection is updated weekly; the most recent update noted here is March 19, 2024.
  • Legal considerations: Highlights the importance of adhering to the terms of use for LLM providers.
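
As a concrete instance of the divergence-minimization family noted above, the following plain-PyTorch sketch computes a temperature-scaled forward KL between teacher and student next-token distributions. The shapes, temperature, and random logits are illustrative assumptions, not any specific paper's recipe.

```python
# Divergence-minimization sketch: forward KL(teacher || student) over next-token logits.
# Shapes and temperature are illustrative; real setups share a vocabulary and mask padding.
import torch
import torch.nn.functional as F

def kd_kl_loss(student_logits, teacher_logits, temperature=2.0):
    """Token-level forward KL, averaged over all (batch, position) pairs."""
    t = temperature
    log_p_student = F.log_softmax(student_logits / t, dim=-1).flatten(0, 1)
    p_teacher = F.softmax(teacher_logits / t, dim=-1).flatten(0, 1)
    # F.kl_div expects log-probs as input and probs as target: KL(teacher || student).
    # The t**2 factor keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (t ** 2)

# Toy usage with random logits standing in for real model outputs.
student_logits = torch.randn(2, 8, 32_000, requires_grad=True)
teacher_logits = torch.randn(2, 8, 32_000)
kd_kl_loss(student_logits, teacher_logits).backward()
```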

Maintenance & Community

The repository is maintained by Xiaohan Xu and collaborators, who provide contact information for contributions and feedback. Users are encouraged to open issues or pull requests, or to email the maintainers, to suggest missing papers or improvements to the taxonomy.

Licensing & Compatibility

The repository does not declare a software license of its own. The linked papers are subject to their respective licenses and terms of use, which users must follow.

Limitations & Caveats

The collection primarily focuses on generative LLMs and explicitly notes that encoder-based KD is not included, though it is being tracked. Some entries may lack direct code links.

Health Check

  • Last commit: 4 months ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 0

Star History

114 stars in the last 90 days

Explore Similar Projects

Starred by Stas Bekman (author of the Machine Learning Engineering Open Book; research engineer at Snowflake) and Andrey Vasnetsov (cofounder of Qdrant).

awesome-knowledge-distillation by dkozlov

Collection of knowledge distillation resources
Top 0.1% on sourcepulse · 4k stars · created 8 years ago · updated 1 month ago