ml-diffucoder by apple

Diffusion models for code generation

Created 2 months ago
733 stars

Top 47.2% on SourcePulse

View on GitHub
Project Summary

This repository provides DiffuCoder, a diffusion-based large language model for code generation, addressing limitations in existing diffusion LLMs' generation patterns and post-training strategies. It targets researchers and developers in AI code generation, offering potentially faster generation than autoregressive models and improved performance through novel techniques.

How It Works

DiffuCoder builds upon Masked Denoising Models (MDMs) and diffusion LLMs (dLLMs), investigating how their generation patterns differ from autoregressive models. It introduces a new metric, the "autoregressiveness score," to quantify causal patterns during generation. A key innovation is Coupled-GRPO, a post-training method that addresses inefficiencies in per-timestep loss computation by using a coupled-sampling scheme. This scheme ensures all tokens receive a learning signal and improves probability estimates by evaluating tokens in partially-masked contexts, offering better accuracy with modest computational overhead.
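To make the coupled-sampling idea concrete, the sketch below scores a completion with a pair of complementary masks, so every token is masked exactly once across the two passes and is evaluated while part of its context stays visible. This is an illustration of the scheme described above, not the repository's implementation; the model interface, the even 50/50 split, and the omission of GRPO advantage weighting are all simplifying assumptions.

```python
# Illustrative sketch of coupled sampling (not the repository's code).
# Two complementary masks over the completion guarantee every token is
# masked exactly once across the pair, so each token is scored in a
# partially masked context. Advantage weighting and the paper's
# mask-ratio schedule are omitted for brevity.
import torch

def coupled_token_logprobs(model, input_ids, completion_start, mask_token_id):
    """Per-token log-probs of a completion via two complementary mask passes.

    Assumes `model` is a masked-denoising LM whose forward pass returns
    `.logits` of shape [batch, seq_len, vocab]; `input_ids` is 1-D.
    """
    seq_len = input_ids.size(-1)
    comp_positions = torch.arange(completion_start, seq_len)

    # Randomly partition completion positions into two complementary halves.
    perm = comp_positions[torch.randperm(comp_positions.numel())]
    halves = (perm[: perm.numel() // 2], perm[perm.numel() // 2:])

    logprobs = torch.zeros_like(input_ids, dtype=torch.float)
    for positions in halves:
        noised = input_ids.clone()
        noised[positions] = mask_token_id  # mask one half, keep the other visible
        logits = model(noised.unsqueeze(0)).logits[0]
        logp = logits.log_softmax(dim=-1)
        # Score only the positions that were masked in this pass.
        logprobs[positions] = logp[positions, input_ids[positions]].float()
    return logprobs[comp_positions]
```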

Quick Start & Requirements

  • Installation: Clone huggingface/open-r1, merge in the files provided by this repository, and set up the environment with conda and pip (e.g., pip install vllm==0.8.4, flash-attn==2.8.0.post1, setuptools, and the project's ".[code]" extras).
  • Prerequisites: Python 3.11, CUDA, E2B API token (for code sandbox), wandb account.
  • Data Preparation: GRPO training requires the TIGER-Lab/AceCode-89K dataset (see the loading sketch after this list).
  • Resources: Training requires a code sandbox and wandb for logging. Inference requires a CUDA-enabled GPU.
  • Links: Open-R1, Hugging Face models, paper
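As a starting point for data preparation, the following sketch pulls the GRPO training set with the datasets library. The split name is an assumption; the repository's own scripts may apply additional filtering and formatting.

```python
# Hypothetical data-prep sketch: fetch the AceCode-89K dataset used for
# GRPO training. The "train" split name is an assumption; the repo's own
# preparation scripts may differ.
from datasets import load_dataset

ds = load_dataset("TIGER-Lab/AceCode-89K", split="train")
print(len(ds))   # dataset size
print(ds[0])     # inspect one example's fields
```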

Highlighted Details

  • Models are available on Hugging Face (Base, Instruct, cpGRPO).
  • Supports inference with a configurable TOKEN_PER_STEP for speed/quality trade-offs (see the sketch after this list).
  • Implements Coupled-GRPO for improved diffusion LLM training.
  • Code evaluation leverages Qwen2.5-Coder.
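For reference, the sketch below shows how inference with the released checkpoints and the TOKEN_PER_STEP knob might look. It assumes the Hugging Face checkpoints expose a custom diffusion_generate method via trust_remote_code; the method name and arguments are illustrative and may differ from the model cards.

```python
# Inference sketch (assumes the checkpoints ship a custom generation
# method via trust_remote_code; names/arguments are illustrative).
import torch
from transformers import AutoModel, AutoTokenizer

model_path = "apple/DiffuCoder-7B-Instruct"  # Base and cpGRPO variants also exist
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModel.from_pretrained(
    model_path, torch_dtype=torch.bfloat16, trust_remote_code=True
).to("cuda").eval()

inputs = tokenizer("Write a function that merges two sorted lists.",
                   return_tensors="pt").to("cuda")

# Higher TOKEN_PER_STEP = fewer diffusion steps = faster, but may cost quality.
TOKEN_PER_STEP = 1
out = model.diffusion_generate(
    inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=256,
    steps=256 // TOKEN_PER_STEP,
    temperature=0.3,
    top_p=0.95,
    return_dict_in_generate=True,
)
print(tokenizer.decode(out.sequences[0], skip_special_tokens=True))
```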

Maintenance & Community

  • The project is maintained by Apple, with research contributions from multiple authors.
  • Updates mention ongoing MLX support for Apple Silicon.
  • Code is based on huggingface/open-r1 and LLaMA-Factory.

Licensing & Compatibility

  • The README does not explicitly state a license. The underlying open-r1 project is Apache 2.0 licensed. Compatibility for commercial use or closed-source linking requires clarification.

Limitations & Caveats

MLX support for Apple Silicon is listed as "in progress" as of June 2025, indicating potential limitations for users on that platform. The specific license for this repository is not clearly stated, which could impact commercial adoption.

Health Check

  • Last Commit: 2 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 3

Star History

  • 29 stars in the last 30 days

Explore Similar Projects

alpaca_farm by tatsu-lab
RLHF simulation framework for accessible instruction-following/alignment research
0.1% · 826 stars · Created 2 years ago · Updated 1 year ago
Starred by Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems"), Pawel Garbacki (cofounder of Fireworks AI), and 4 more.

EAGLE by SafeAILab
Speculative decoding research paper for faster LLM inference
10.6% · 2k stars · Created 1 year ago · Updated 1 week ago
Starred by Shizhe Diao (author of LMFlow; Research Scientist at NVIDIA), Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), and 8 more.

open-instruct by allenai
Training codebase for instruction-following language models
0.7% · 3k stars · Created 2 years ago · Updated 17 hours ago
Starred by Vincent Weisser (cofounder of Prime Intellect), Ross Taylor (cofounder of General Reasoning; cocreator of Papers with Code), and 11 more.