SakanaAI: Instantly internalize factual context into LLMs using hypernetworks
Doc-to-LoRA (D2L) lets Large Language Models (LLMs) instantly internalize factual information from documents without full retraining. Using hypernetworks, it allows an LLM to recall specific, changing contexts on demand, which benefits applications that need up-to-date or specialized knowledge. It targets researchers and developers seeking efficient LLM adaptation for factual recall.
How It Works
D2L employs hypernetworks to dynamically update LLM weights, enabling the model to "instantly internalize contexts." This avoids costly fine-tuning by injecting factual knowledge directly. The core ModulatedPretrainedModel class internalizes a document via model.internalize(doc) and discards it again via model.reset(); while a document is internalized, generation outputs reflect the learned information. The design gives a way to add and remove LLM knowledge dynamically.
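The internalize/reset cycle described above can be illustrated with a toy sketch. This is not the repository's code: it assumes, as the project name suggests, that the hypernetwork emits a LoRA-style low-rank weight update, and every name here (embed, ToyHypernetwork, ToyModulatedLinear) is hypothetical. Only the method names internalize and reset mirror the API mentioned in this summary. It uses the Python standard library only.

```python
import hashlib
import random


def embed(doc, dim):
    """Toy deterministic document embedding (stand-in for a real encoder)."""
    seed = int.from_bytes(hashlib.sha256(doc.encode()).digest()[:8], "big")
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


class ToyHypernetwork:
    """Hypothetical hypernetwork: maps a document embedding to low-rank factors."""

    def __init__(self, embed_dim, d_in, d_out, rank, seed=0):
        rng = random.Random(seed)
        n_params = rank * d_in + d_out * rank
        # Fixed random projection standing in for trained hypernetwork weights.
        self.proj = [[rng.gauss(0.0, 0.1) for _ in range(embed_dim)]
                     for _ in range(n_params)]
        self.d_in, self.d_out, self.rank = d_in, d_out, rank

    def __call__(self, emb):
        flat = [sum(w * e for w, e in zip(row, emb)) for row in self.proj]
        split = self.rank * self.d_in
        a_flat, b_flat = flat[:split], flat[split:]
        # A has shape (rank, d_in); B has shape (d_out, rank).
        a = [a_flat[i * self.d_in:(i + 1) * self.d_in] for i in range(self.rank)]
        b = [b_flat[i * self.rank:(i + 1) * self.rank] for i in range(self.d_out)]
        return a, b


class ToyModulatedLinear:
    """Hypothetical linear layer whose weights can be modulated by a document."""

    def __init__(self, d_in, d_out, rank=2, embed_dim=8, seed=1):
        rng = random.Random(seed)
        self.base = [[rng.gauss(0.0, 1.0) for _ in range(d_in)] for _ in range(d_out)]
        self.hyper = ToyHypernetwork(embed_dim, d_in, d_out, rank)
        self.embed_dim = embed_dim
        self.delta = None  # no document internalized yet

    def internalize(self, doc):
        """Generate a low-rank update B @ A from the document in one pass."""
        a, b = self.hyper(embed(doc, self.embed_dim))
        self.delta = matmul(b, a)  # shape (d_out, d_in)

    def reset(self):
        """Drop the document-specific update; base weights are never changed."""
        self.delta = None

    def forward(self, x):
        weights = self.base
        if self.delta is not None:
            weights = [[w + d for w, d in zip(w_row, d_row)]
                       for w_row, d_row in zip(self.base, self.delta)]
        return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


layer = ToyModulatedLinear(d_in=4, d_out=3)
x = [1.0, 0.5, -0.5, 2.0]
before = layer.forward(x)
layer.internalize("Paris is the capital of France.")
during = layer.forward(x)
layer.reset()
after = layer.forward(x)
```

Because the update is kept separate from the base weights, reset only has to discard the delta, which is why internalizing and removing a document can be cheap compared with fine-tuning.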
Quick Start & Requirements
Installation: install uv with curl -LsSf https://astral.sh/uv/install.sh | sh, then run ./install.sh.
Model download: uv run huggingface-cli login, then uv run huggingface-cli download SakanaAI/doc-to-lora --local-dir trained_d2l --include "*/".
Prerequisites: uv package manager, Hugging Face account and CLI access, PyTorch.
Highlighted Details
Maintenance & Community
The project is associated with Sakana AI. No specific community channels (e.g., Discord, Slack) or details on active contributors/sponsorships are provided in the README.
Licensing & Compatibility
The README does not specify a software license. Check the repository for license terms before relying on the project, especially for commercial use.
Limitations & Caveats
According to the README, the current Python API for ModulatedPretrainedModel supports only non-batched inputs. The presence of "experimental scripts" suggests ongoing development and potential instability.
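One way to work within the non-batched restriction is to process documents sequentially, resetting the model between them. The sketch below assumes "non-batched" means one document per internalize call, which this summary does not spell out. The helper run_per_document, the Internalizer protocol, and StubModel are hypothetical names for illustration; only internalize and reset come from the API described above.

```python
from typing import Callable, Iterable, List, Protocol, TypeVar

T = TypeVar("T")


class Internalizer(Protocol):
    """Minimal interface assumed here: the two methods named in the summary."""

    def internalize(self, doc: str) -> None: ...

    def reset(self) -> None: ...


def run_per_document(
    model: Internalizer,
    docs: Iterable[str],
    query: Callable[[Internalizer], T],
) -> List[T]:
    """Internalize each document on its own, run a query, then reset."""
    results: List[T] = []
    for doc in docs:
        model.internalize(doc)
        try:
            results.append(query(model))
        finally:
            # Always reset so one document never leaks into the next.
            model.reset()
    return results


class StubModel:
    """Hypothetical stand-in for a real model, used only to exercise the helper."""

    def __init__(self):
        self.context = None
        self.calls = []

    def internalize(self, doc):
        self.context = doc
        self.calls.append(("internalize", doc))

    def reset(self):
        self.context = None
        self.calls.append(("reset", None))

    def answer(self):
        return f"context={self.context}"


stub = StubModel()
outputs = run_per_document(stub, ["doc A", "doc B"], lambda m: m.answer())
```

The try/finally block keeps the model clean even if a query raises, so a failure on one document does not affect later ones.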