DRUGS by EGjoni

Inference technique for increasing variety in generative models

Created 1 year ago
357 stars

Top 78.2% on SourcePulse

View on GitHub
Project Summary

DRµGS introduces Deep Random Micro-Glitch Sampling, a novel method for enhancing generative model output variety and coherence by injecting noise directly into transformer layers during inference. This approach targets researchers and developers working with large language models, offering a more intuitive and potentially effective alternative to traditional sampling techniques.

How It Works

DRµGS inverts the standard generative process by injecting noise into transformer layers rather than using noise to sample from the model's output predictions. Later layers can then correct or account for perturbations introduced in earlier layers, which theoretically preserves coherence while increasing variety. The library supports injecting noise into hidden states (H), queries (Q), keys (K), values (V), and attention outputs (A), with configurable "dose" and "depth" parameters that control the injection's intensity and location.
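For intuition, here is a minimal sketch of what H-type injection can look like, implemented with PyTorch forward hooks on a LLaMA/Mistral-style decoder stack. The dose and depth names mirror the README's terminology, but the helper add_h_noise_hooks and the Gaussian-noise mechanism are illustrative assumptions, not the library's actual implementation.

    import torch

    def add_h_noise_hooks(model, dose=0.05, depth=0.5):
        """Perturb hidden states (H-type injection) during inference.

        Hypothetical helper, not the DRUGS API: dose scales the noise
        magnitude, depth is the fraction of early layers to perturb.
        """
        layers = model.model.layers        # LLaMA/Mistral decoder layers
        cutoff = int(len(layers) * depth)  # inject only into early layers

        def hook(module, inputs, output):
            hidden = output[0] if isinstance(output, tuple) else output
            noise = torch.randn_like(hidden) * dose * hidden.std()
            perturbed = hidden + noise     # later layers must absorb this
            if isinstance(output, tuple):
                return (perturbed,) + output[1:]
            return perturbed

        handles = [layer.register_forward_hook(hook) for layer in layers[:cutoff]]
        return handles                     # call h.remove() on each to restore sober behavior

Restricting injection to the first depth fraction of layers reflects the idea above: early perturbations leave later layers room to self-correct, while noise injected too late has no downstream layers left to absorb it.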

Quick Start & Requirements

  • Install via pip: pip install git+https://github.com/EGjoni/DRUGS.git
  • Requires PyTorch and Hugging Face Transformers (a minimal setup sketch follows this list).
  • Supports the LLaMA and Mistral model architectures.
  • Interactive demos and exploration tools are linked from the README.
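For reference, a standard Hugging Face setup for a supported architecture might look like the following. The checkpoint name is only an example, and the DRUGS-specific import is omitted because the exact module path is best confirmed against the repository.

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Any LLaMA- or Mistral-family checkpoint should work; this one is
    # illustrative only.
    model_id = "mistralai/Mistral-7B-v0.1"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

    # The noise hooks sketched above could then be attached before generation:
    # handles = add_h_noise_hooks(model, dose=0.05, depth=0.5)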

Highlighted Details

  • Offers five types of noise injection: H, Q, K, V, and A.
  • Includes a cold_shower function to mitigate potential cumulative noise effects on KV caching (a conceptual sketch follows this list).
  • Provides tools for visualizing the impact of noise injection across different layers and dosages.
  • Experiments suggest noise in earlier layers is often corrected by subsequent layers, with notable divergence spikes in middle layers.
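On the cold_shower point above: one plausible reading is that noisy activations written into the KV cache persist across decoding steps, so the cache must occasionally be rebuilt from unperturbed activations. The sketch below expresses that idea in terms of the hypothetical hooks from the earlier example; it illustrates the concept and is not the library's cold_shower implementation.

    import torch

    def rebuild_sober_cache(model, input_ids, noise_handles):
        """Rebuild the KV cache without noise (conceptual cold shower).

        Removes the hypothetical noise hooks, re-runs a clean forward
        pass, and returns a cache built from sober activations. Hooks
        must be re-attached afterwards to resume noisy sampling.
        """
        for h in noise_handles:
            h.remove()                     # disable noise injection
        with torch.no_grad():
            clean = model(input_ids, use_cache=True)
        return clean.past_key_values       # reuse in subsequent forward calls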

Maintenance & Community

The project is maintained by EGjoni, who openly invites contributions and experimentation, though recent commit activity is low (see the Health Check below). The README does not link to community discussions or a roadmap.

Licensing & Compatibility

The repository does not explicitly state a license. Users should verify licensing terms for commercial or closed-source integration.

Limitations & Caveats

The current proof-of-concept primarily supports LLaMA and Mistral models. While the cold_shower function is provided to address theoretical negative side effects from prolonged noise injection, its necessity and impact on performance are still under investigation.

Health Check

  • Last Commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 4 stars in the last 30 days

Explore Similar Projects

Starred by Jeremy Howard (Cofounder of fast.ai) and Stas Bekman (Author of "Machine Learning Engineering Open Book"; Research Engineer at Snowflake).

SwissArmyTransformer by THUDM

Top 0.3% on SourcePulse · 1k stars
Transformer library for flexible model development
Created 4 years ago · Updated 8 months ago
Starred by Benjamin Bolte (Cofounder of K-Scale Labs), Patrick von Platen (Author of Hugging Face Diffusers; Research Engineer at Mistral), and 10 more.

consistency_models by openai

Top 0.1% on SourcePulse · 6k stars
PyTorch code for consistency models research paper
Created 2 years ago · Updated 1 year ago
Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), Eric Zhang (Founding Engineer at Modal), and 13 more.

flux by black-forest-labs

Top 0.2% on SourcePulse · 24k stars
Inference code for FLUX image generation & editing models
Created 1 year ago · Updated 1 month ago