AwesomeOPD by thinkwee

Awesome list for On-Policy Distillation in LLM training

Created 4 weeks ago

463 stars

Top 64.9% on SourcePulse

Project Summary

AwesomeOPD curates open-source repositories and papers on On-Policy Distillation (OPD) and On-Policy Self-Distillation (OPSD) for training LLMs, VLMs, and agents. It gives researchers and engineers a structured taxonomy with per-entry annotations to support evaluation and adoption decisions.

How It Works

OPD trains a student model by having it sample its own trajectories (y ~ π_student(·|x)) and then supervising those samples with a teacher model, typically via per-token logits. OPSD is a variant in which the teacher is the same model under different conditioning (e.g., with access to privileged context). Each entry is annotated along four axes (teacher source, supervision signal, rollout consumption, and pipeline slot) so that methods can be compared axis by axis.
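As a rough illustration of the sampling-then-supervision loop described above, the sketch below rolls out a toy student and scores each position against a toy teacher. Everything in it is an assumption made for illustration: the lookup-table "models" (`toy_logits`), the three-token vocabulary, the `STUDENT` and `TEACHER` tables, and the use of reverse KL as the per-token loss, which is one common choice and not necessarily what any listed project uses.

```python
import math
import random


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reverse_kl(p_student, p_teacher):
    """KL(student || teacher) over the vocabulary at a single position."""
    return sum(
        ps * math.log(ps / pt)
        for ps, pt in zip(p_student, p_teacher)
        if ps > 0
    )


def toy_logits(table, prefix):
    """Hypothetical stand-in for a forward pass: logits depend only on the last token."""
    last = prefix[-1] if prefix else 0
    return table[last]


def opd_rollout_loss(student, teacher, prompt, steps, rng):
    """Sample a trajectory from the student and score it with the teacher.

    Returns the sampled tokens and the mean per-token reverse KL.
    """
    prefix = list(prompt)
    per_token = []
    for _ in range(steps):
        p_s = softmax(toy_logits(student, prefix))
        p_t = softmax(toy_logits(teacher, prefix))  # teacher sees the student's own prefix
        per_token.append(reverse_kl(p_s, p_t))
        # On-policy: the next token is drawn from the student, not the teacher.
        token = rng.choices(range(len(p_s)), weights=p_s, k=1)[0]
        prefix.append(token)
    return prefix[len(prompt):], sum(per_token) / len(per_token)


# Hypothetical 3-token models: a uniform student and a peaked teacher.
STUDENT = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
TEACHER = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]]

if __name__ == "__main__":
    tokens, loss = opd_rollout_loss(STUDENT, TEACHER, [0], 5, random.Random(0))
    print(tokens, loss)
```

In a real training setup the loss would be backpropagated into the student's parameters. For an OPSD-style variant, under the same assumptions, the teacher table would be replaced by the student itself evaluated on a differently conditioned prefix.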

Quick Start & Requirements

This is an "awesome list" of research and projects, not a single installable framework. It provides no direct installation or execution commands. Users must refer to individual linked papers and repositories for specific implementation details, requirements, and setup.

Highlighted Details

  • Features a comprehensive taxonomy (Surveys, White-Box, Black-Box, OPSD, OPD-RL Hybrids, etc.).
  • Each entry is annotated with teacher source, supervision signal, rollout consumption, and pipeline slot.
  • Curation combines LLM agents with manual review; the list carries a disclaimer that errors are possible.
  • Last updated April 30, 2026.
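
The four annotation axes lend themselves to a small record type, which can make it easier to filter entries when comparing methods. The sketch below is only one possible representation: the `OPDEntry` class, the `filter_entries` helper, and every entry name and axis value in `EXAMPLES` are hypothetical placeholders and are not taken from the list itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OPDEntry:
    """One list entry, annotated along the four axes named above."""
    name: str
    teacher_source: str
    supervision_signal: str
    rollout_consumption: str
    pipeline_slot: str


def filter_entries(entries, **criteria):
    """Return entries whose fields equal every given keyword criterion."""
    return [
        e for e in entries
        if all(getattr(e, field) == value for field, value in criteria.items())
    ]


# Hypothetical placeholder entries; names and axis values are invented for illustration.
EXAMPLES = [
    OPDEntry("hypothetical-opd-method", "external model", "per-token logits",
             "placeholder-a", "placeholder-x"),
    OPDEntry("hypothetical-opsd-method", "same model", "per-token logits",
             "placeholder-b", "placeholder-x"),
]

if __name__ == "__main__":
    for entry in filter_entries(EXAMPLES, teacher_source="same model"):
        print(entry.name)
```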

Maintenance & Community

Maintained by "AwesomeOPD Contributors" with an open invitation for Pull Requests (PRs). The GitHub repository is the primary community hub.

Licensing & Compatibility

The README does not specify a license for the list. Commercial use or closed-source compatibility depends on the licenses of individual referenced projects.

Limitations & Caveats

The maintainers acknowledge that "errors are possible" in the curation. Some entries are borderline and may need closer analysis to confirm that they meet the strict OPD definition (student sampling plus teacher supervision). The list intentionally excludes related methods such as pure RL and offline distillation.

Health Check

Last Commit: 2 days ago
Responsiveness: Inactive
Pull Requests (30d): 5
Issues (30d): 2
Star History: 466 stars in the last 28 days

Explore Similar Projects

Starred by Eric Zhang (Founding Engineer at Modal), Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), and 3 more.

tunix by google

0.5%
2k
JAX-native library for efficient LLM post-training
Created 1 year ago
Updated 5 hours ago
Starred by Stas Bekman (Author of "Machine Learning Engineering Open Book"; Research Engineer at Snowflake), Edward Sun (Research Scientist at Meta Superintelligence Lab), and 1 more.

awesome-knowledge-distillation by dkozlov

0.1%
4k
Collection of knowledge distillation resources
Created 9 years ago
Updated 14 hours ago