DRUGS by EGjoni

Inference technique for increasing variety in generative models

created 1 year ago
350 stars


Project Summary

DRµGS introduces Deep Random Micro-Glitch Sampling, a novel method for enhancing generative model output variety and coherence by injecting noise directly into transformer layers during inference. This approach targets researchers and developers working with large language models, offering a more intuitive and potentially effective alternative to traditional sampling techniques.

How It Works

DRµGS inverts the standard generative process: instead of using noise to sample from the model's output predictions, it injects noise directly into the transformer layers themselves. Later layers can then correct or account for perturbations introduced in earlier layers, which in theory preserves coherence while increasing variety. The library supports injecting noise into hidden states (H), queries (Q), keys (K), values (V), and attention outputs (A), with configurable "dose" and "depth" parameters controlling the injection's intensity and location.
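The dose/depth idea can be sketched with PyTorch forward hooks on a toy layer stack. This is an illustrative sketch only, not the library's actual API: the `inject_noise` and `dose_layers` names are hypothetical, and a real setup would hook a LLaMA or Mistral model's decoder layers rather than plain linear layers.

```python
import torch
import torch.nn as nn

# Hypothetical sketch of H-type (hidden-state) noise injection.
# "dose" scales the noise magnitude relative to the activations;
# "depth" selects what fraction of the early layers gets dosed.

def inject_noise(module, inputs, output, dose=0.1):
    """Forward hook: perturb a layer's output hidden states."""
    hidden = output[0] if isinstance(output, tuple) else output
    noisy = hidden + dose * hidden.std() * torch.randn_like(hidden)
    return (noisy,) + output[1:] if isinstance(output, tuple) else noisy

def dose_layers(layers, dose=0.1, depth=0.5):
    """Attach noise hooks to the first `depth` fraction of `layers`."""
    n_dosed = int(len(layers) * depth)
    handles = [
        layer.register_forward_hook(
            lambda m, i, o: inject_noise(m, i, o, dose=dose))
        for layer in layers[:n_dosed]
    ]
    return handles  # call .remove() on each handle to stop dosing
```

Because the perturbation happens before the remaining layers run, those layers get a chance to re-contextualize the noise, which is the intuition behind injecting at inference time rather than sampling from the output distribution.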

Quick Start & Requirements

  • Install via pip: pip install git+https://github.com/EGjoni/DRUGS.git
  • Requires PyTorch and Hugging Face Transformers.
  • Supports LLaMA and Mistral model architectures.
  • Interactive demos and exploration tools are available via provided links.

Highlighted Details

  • Offers five types of noise injection: H, Q, K, V, and A.
  • Includes a cold_shower function to mitigate potential cumulative noise effects on KV caching.
  • Provides tools for visualizing the impact of noise injection across different layers and dosages.
  • Experiments suggest noise in earlier layers is often corrected by subsequent layers, with notable divergence spikes in middle layers.
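The divergence observation in the last bullet can be illustrated by running a clean and a perturbed copy of the same input through one stack and recording per-layer cosine distance. Again a toy sketch under stated assumptions: `layer_divergence` is a hypothetical helper operating on plain linear layers, whereas the repo's visualization tools operate on real model hidden states.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical sketch: track how far a perturbed forward pass drifts from
# a clean one, layer by layer, via cosine distance of the hidden states.

def layer_divergence(layers, x, dose=0.2, seed=0):
    """Return 1 - cosine_similarity of clean vs. noised states per layer."""
    torch.manual_seed(seed)
    clean = x
    noisy = x + dose * x.std() * torch.randn_like(x)  # noise at depth 0
    divergences = []
    for layer in layers:
        clean, noisy = layer(clean), layer(noisy)
        cos = F.cosine_similarity(clean.flatten(), noisy.flatten(), dim=0)
        divergences.append(1.0 - cos.item())
    return divergences
```

Plotting the returned list against layer index is one simple way to see whether later layers dampen the perturbation or amplify it.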

Maintenance & Community

The project is actively maintained by EGjoni, with an open invitation for contributions and experimentation. Links to community discussions or roadmaps are not explicitly provided in the README.

Licensing & Compatibility

The repository does not explicitly state a license. Users should verify licensing terms for commercial or closed-source integration.

Limitations & Caveats

The current proof-of-concept primarily supports LLaMA and Mistral models. While the cold_shower function is provided to address theoretical negative side effects from prolonged noise injection, its necessity and impact on performance are still under investigation.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1+ week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 5 stars in the last 90 days
