DRUGS by EGjoni

Inference technique for increasing variety in generative models

created 1 year ago
350 stars


Project Summary

DRµGS introduces Deep Random Micro-Glitch Sampling, a novel method for enhancing generative model output variety and coherence by injecting noise directly into transformer layers during inference. This approach targets researchers and developers working with large language models, offering a more intuitive and potentially effective alternative to traditional sampling techniques.

How It Works

DRµGS inverts the standard generative process: instead of using noise to sample from the model's output predictions, it injects noise directly into the transformer layers themselves. Later layers can then correct or account for perturbations introduced in earlier layers, which in theory preserves coherence while increasing variety. The library supports injecting noise into hidden states (H), queries (Q), keys (K), values (V), and attention outputs (A), with configurable "dose" and "depth" parameters controlling the injection's intensity and location.
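The dose/depth idea can be sketched with PyTorch forward hooks on a toy layer stack. This is an illustrative sketch only, not the library's actual API: the `inject_noise` and `dose_layers` names are hypothetical, and a real setup would hook a LLaMA or Mistral model's decoder layers rather than plain linear layers.

```python
import torch
import torch.nn as nn

# Hypothetical sketch of H-type (hidden-state) noise injection.
# "dose" scales the noise magnitude relative to the activations;
# "depth" selects what fraction of the early layers gets dosed.

def inject_noise(module, inputs, output, dose=0.1):
    """Forward hook: perturb a layer's output hidden states."""
    hidden = output[0] if isinstance(output, tuple) else output
    noisy = hidden + dose * hidden.std() * torch.randn_like(hidden)
    return (noisy,) + output[1:] if isinstance(output, tuple) else noisy

def dose_layers(layers, dose=0.1, depth=0.5):
    """Attach noise hooks to the first `depth` fraction of `layers`."""
    n_dosed = int(len(layers) * depth)
    handles = [
        layer.register_forward_hook(
            lambda m, i, o: inject_noise(m, i, o, dose=dose))
        for layer in layers[:n_dosed]
    ]
    return handles  # call .remove() on each handle to stop dosing
```

Because the perturbation happens before the remaining layers run, those layers get a chance to re-contextualize the noise, which is the intuition behind injecting at inference time rather than sampling from the output distribution.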

Quick Start & Requirements

  • Install via pip: pip install git+https://github.com/EGjoni/DRUGS.git
  • Requires PyTorch and Hugging Face Transformers.
  • Supports LLaMA and Mistral model architectures.
  • Interactive demos and exploration tools are available via provided links.

Highlighted Details

  • Offers five types of noise injection: H, Q, K, V, and A.
  • Includes a cold_shower function to mitigate potential cumulative noise effects on KV caching.
  • Provides tools for visualizing the impact of noise injection across different layers and dosages.
  • Experiments suggest noise in earlier layers is often corrected by subsequent layers, with notable divergence spikes in middle layers.
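The divergence observation in the last bullet can be illustrated by running a clean and a perturbed copy of the same input through one stack and recording per-layer cosine distance. Again a toy sketch under stated assumptions: `layer_divergence` is a hypothetical helper operating on plain linear layers, whereas the repo's visualization tools operate on real model hidden states.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical sketch: track how far a perturbed forward pass drifts from
# a clean one, layer by layer, via cosine distance of the hidden states.

def layer_divergence(layers, x, dose=0.2, seed=0):
    """Return 1 - cosine_similarity of clean vs. noised states per layer."""
    torch.manual_seed(seed)
    clean = x
    noisy = x + dose * x.std() * torch.randn_like(x)  # noise at depth 0
    divergences = []
    for layer in layers:
        clean, noisy = layer(clean), layer(noisy)
        cos = F.cosine_similarity(clean.flatten(), noisy.flatten(), dim=0)
        divergences.append(1.0 - cos.item())
    return divergences
```

Plotting the returned list against layer index is one simple way to see whether later layers dampen the perturbation or amplify it.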

Maintenance & Community

The project is actively maintained by EGjoni, with an open invitation for contributions and experimentation. Links to community discussions or roadmaps are not explicitly provided in the README.

Licensing & Compatibility

The repository does not explicitly state a license. Users should verify licensing terms for commercial or closed-source integration.

Limitations & Caveats

The current proof-of-concept primarily supports LLaMA and Mistral models. While the cold_shower function is provided to address theoretical negative side effects from prolonged noise injection, its necessity and impact on performance are still under investigation.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1+ week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 5 stars in the last 90 days
