entropix by xjdr-alt

Research project for entropy-based context-aware sampling & parallel CoT decoding

created 10 months ago
3,402 stars

Top 14.6% on sourcepulse

View on GitHub
Project Summary

This project explores entropy-based, context-aware sampling for large language models, aiming to improve inference quality by adapting the sampling strategy to the model's uncertainty. It targets researchers and developers seeking to enhance LLM reasoning and output through novel sampling techniques, potentially eliciting chain-of-thought (CoT) style reasoning at inference time.

How It Works

Entropix leverages entropy and "varentropy" (the variance of token-level surprisal under the model's own next-token distribution) as signals to guide sampling. Low entropy indicates a confident, predictable next token; high entropy signals uncertainty and room for exploration; varentropy distinguishes uncertainty that is spread evenly across candidates from a split between a few competing continuations. By routing between decoding strategies based on these signals, the sampler aims for more nuanced, contextually relevant generation, akin to advanced chain-of-thought prompting.
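
To make the mechanism concrete, here is a minimal PyTorch sketch of the two quantities and a simple routing rule. This is an illustrative reconstruction, not entropix's actual sampler: the threshold values, strategy names, and the route function are assumptions made for this example.

    import torch
    import torch.nn.functional as F

    def entropy_and_varentropy(logits: torch.Tensor):
        # Entropy H = E[-log p] and varentropy = Var[-log p], both taken
        # under the model's own next-token distribution p = softmax(logits).
        log_probs = F.log_softmax(logits, dim=-1)
        probs = log_probs.exp()
        entropy = -(probs * log_probs).sum(dim=-1)
        varentropy = (probs * log_probs.pow(2)).sum(dim=-1) - entropy.pow(2)
        return entropy, varentropy

    def route(entropy: float, varentropy: float,
              ent_thresh: float = 2.0, vent_thresh: float = 2.0) -> str:
        # Hypothetical thresholds and labels; the real sampler uses tuned
        # values and additional signals.
        if entropy < ent_thresh and varentropy < vent_thresh:
            return "argmax"   # confident and stable: decode greedily
        if entropy >= ent_thresh and varentropy < vent_thresh:
            return "branch"   # uncertain but evenly so: explore / inject a CoT token
        return "sample"       # uncertain and unstable: resample with temperature

The four quadrants of low/high entropy crossed with low/high varentropy give the intuition; see the repository's samplers for the actual decision logic.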

Quick Start & Requirements

  • Install: poetry install
  • Prerequisites: Python 3.x, Poetry, Rust (for tiktoken), Hugging Face CLI (for model weights), CUDA (implied for GPU usage).
  • Setup: Requires downloading model weights and tokenizer files (see the download sketch after this list).
  • Run: PYTHONPATH=. poetry run python entropix/main.py (JAX) or PYTHONPATH=. poetry run python entropix/torch_main.py (PyTorch).
  • Docs: No documentation is linked beyond the README itself.
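
The README's own download commands are not reproduced in this summary. As one hedged illustration, weights for a supported model could be fetched with the Hugging Face Hub Python API; the repo ID and target directory below are placeholders, not paths documented by entropix:

    from huggingface_hub import snapshot_download

    # Hypothetical example: repo_id and local_dir are placeholders, and
    # Llama weights are gated, so authenticate first (huggingface-cli login).
    snapshot_download(
        repo_id="meta-llama/Llama-3.1-8B-Instruct",
        local_dir="weights/Llama-3.1-8B-Instruct",
    )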

Highlighted Details

  • Supports Llama 3.1+ models, with plans for DeepSeek V2 and Mistral Large.
  • Offers both JAX (for TPU) and PyTorch (for GPU) implementations.
  • Includes notes on disabling JAX JIT for faster iteration and managing VRAM (see the JAX snippet after this list).
  • Future plans include splitting into entropix-local (single GPU, Metal) and entropix (multi-GPU, TPU) repos, plus a training component.
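
On the JAX JIT note: the README's exact guidance isn't quoted here, but disabling JIT in JAX generally looks like the following (the environment-variable form is equivalent and must be set before launch):

    import jax

    # Disable JIT compilation for faster edit-run cycles while debugging,
    # at the cost of much slower execution.
    jax.config.update("jax_disable_jit", True)

    # Equivalent environment variable:
    #   JAX_DISABLE_JIT=1 PYTHONPATH=. poetry run python entropix/main.py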

Maintenance & Community

  • The project is described as a research work-in-progress with active development and plans for significant restructuring.
  • Author is active on X (@_xjdr).
  • Acknowledges contributions from several individuals for compute and development support.

Licensing & Compatibility

  • No license is explicitly stated in the README.
  • Compatibility for commercial use or closed-source linking is undetermined.

Limitations & Caveats

The project is explicitly labeled "HERE BE DRAGONS!!!! THIS IS NOT A FINISHED PRODUCT AND WILL BE UNSTABLE AS HELL RIGHT NOW." Significant restructuring is planned, and PRs are temporarily discouraged. The current state may be partially broken with an unmerged backlog.

Health Check

  • Last commit: 8 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

61 stars in the last 90 days

Explore Similar Projects

Starred by Jiayi Pan (Author of SWE-Gym; AI Researcher at UC Berkeley), Nathan Lambert (AI Researcher at AI2), and 1 more.

unified-io-2 by allenai

0.3%
619 stars
Unified-IO 2 code for training, inference, and demo
created 1 year ago
updated 1 year ago
Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Alex Cheema (Cofounder of EXO Labs), and 1 more.

recurrent-pretraining by seal-rg

0.1%
806 stars
Pretraining code for depth-recurrent language model research
created 5 months ago
updated 2 weeks ago
Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Jeff Hammerbacher (Cofounder of Cloudera), and 10 more.

open-r1 by huggingface

0.2%
25k stars
SDK for reproducing DeepSeek-R1
created 6 months ago
updated 3 days ago