ntkmirror by leochlon

LoRA-free fine-tuning for causal language models

Created 2 months ago

347 stars

Top 79.7% on SourcePulse


1 Expert Loves This Project

Pawel Garbacki

Cofounder of Fireworks AI

Project Summary

NTK-Mirror is a LoRA-free method for fine-tuning Hugging Face causal language models: it learns a small, signed controller on top of a frozen Transformer. It targets researchers and power users working with LLMs who want efficient adaptation without permanent weight modifications.

How It Works

The core mechanism learns a sparse set of shared log-gates applied to decoder-layer output channels. Each gate has the form exp(s_{layer,channel}) and modulates the hidden states h. The controller is trained on teacher-forced examples and then attached to the base model at inference time. No LoRA modules are introduced and the base model's weights are not altered; the adaptation operates purely within the forward pass, which keeps it efficient and modular.
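To make the gating idea concrete, here is a minimal sketch in plain Python. It assumes the gate multiplies each selected channel of a hidden-state vector by exp(s), which is the natural reading of a log-gate but is not confirmed by this summary. The function name apply_log_gates and the dictionary layout are hypothetical and are not part of the ntkmirror API.

```python
import math

def apply_log_gates(hidden, log_gates):
    """Scale selected channels of one hidden-state vector by exp(s).

    hidden: list of floats, one value per output channel.
    log_gates: dict mapping channel index -> signed log-gate s.
    Channels missing from the dict keep s = 0, i.e. a gate of 1 (unchanged).
    Assumption: the gate is applied multiplicatively.
    """
    return [h * math.exp(log_gates.get(c, 0.0)) for c, h in enumerate(hidden)]

# Hypothetical sparse controller for one layer: only two of four channels are gated.
layer_gates = {1: math.log(2.0), 3: -math.log(2.0)}
hidden = [1.0, 1.0, 1.0, 1.0]

gated = apply_log_gates(hidden, layer_gates)
print(gated)  # channel 1 is roughly doubled, channel 3 roughly halved
```

A positive s amplifies a channel and a negative s attenuates it, which is what "signed" refers to in this sketch; s = 0 leaves the frozen model's activation untouched.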

Quick Start & Requirements

Installation: Clone the repository (git clone https://github.com/leochlon/ntkmirror.git), navigate into the directory (cd ntkmirror), and install in editable mode (pip install -e .).
Prerequisites: Hugging Face causal language models (e.g., Qwen2.5-0.5B-Instruct). GPU is recommended for performance, as indicated by .cuda() in the Python API example.
Dependencies: transformers, torch. Install optional dataset dependencies with pip install -e '.[datasets]'.
Links:
- Repository: https://github.com/leochlon/ntkmirror.git
- Demo: bash examples/run_demo.sh
- Documentation: docs/composability.md, docs/persistent_memory.md, docs/method.md

Highlighted Details

LoRA-Free Fine-tuning: Achieves model adaptation without LoRA modules or permanent weight edits, using only a small controller.
Controller Composition: Signed log-gates allow for straightforward composition of multiple task controllers, enabling additive task learning.
Persistent Memory: Enables storing and retrieving controllers as memory items (e.g., per conversation or user), which are composed and injected via the forward pass.
Activation Space Control: Operates by modulating activations, offering an alternative to weight-space modifications like LoRA.
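The controller composition point can be illustrated with a short sketch. It assumes that composing controllers means summing their log-gates per (layer, channel) pair; under multiplicative gating this is equivalent to multiplying the gates, since exp(a) * exp(b) = exp(a + b). The compose helper and the dictionary format are hypothetical, not the package's actual interface.

```python
import math

def compose(*controllers):
    """Sum signed log-gates keyed by (layer, channel).

    Assumption: controllers are sparse dicts and compose additively in log space.
    """
    total = {}
    for controller in controllers:
        for key, s in controller.items():
            total[key] = total.get(key, 0.0) + s
    return total

# Two hypothetical task controllers that overlap on (layer 0, channel 5).
task_a = {(0, 5): 0.3, (2, 1): -0.2}
task_b = {(0, 5): -0.1, (3, 7): 0.4}

combined = compose(task_a, task_b)

# Adding log-gates matches multiplying the individual gates.
product_of_gates = math.exp(task_a[(0, 5)]) * math.exp(task_b[(0, 5)])
print(math.isclose(math.exp(combined[(0, 5)]), product_of_gates))
```

Because the gates are signed, one controller can partly cancel another on a shared channel, as happens on (0, 5) here.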

Maintenance & Community

The project is authored by Leon Chlon and associated with Hassana Labs. No specific community channels (like Discord or Slack) or detailed roadmap information are provided in the README.

Licensing & Compatibility

The project is released under the MIT License, which is permissive and generally suitable for commercial use and integration into closed-source projects.

Limitations & Caveats

This package intentionally omits the full research harness for advanced diagnostics (e.g., NTK-vector diagnostics, oracle SGD-displacement fitting). The default retriever for persistent memory is a basic TF-IDF scorer; replacing it with an embedding-based retriever is recommended for production use. The documentation also mentions "failure modes," which points to known limitations or areas that may need further investigation.
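For readers unfamiliar with TF-IDF retrieval, the following is a generic sketch of how such a scorer could rank stored memory items against a query. It is not the package's retriever: the function names, the whitespace tokenizer, the smoothed IDF formula, and the example memory keys are all assumptions made for illustration.

```python
import math
from collections import Counter

def tokenize(text):
    # Naive lowercase whitespace tokenizer (an assumption for this sketch).
    return text.lower().split()

def tfidf_scores(query, documents):
    """Score each document against the query with a plain TF-IDF sum."""
    tokenized = [tokenize(d) for d in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    query_terms = set(tokenize(query))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term in tf:
                # Smoothed inverse document frequency.
                idf = math.log((1 + n_docs) / (1 + df[term])) + 1.0
                score += (tf[term] / len(tokens)) * idf
        scores.append(score)
    return scores

# Hypothetical text keys attached to stored controllers.
memory_keys = [
    "python unit test style preferences",
    "travel plans for lisbon in march",
    "python packaging and release notes",
]
scores = tfidf_scores("python test", memory_keys)
best = max(range(len(scores)), key=scores.__getitem__)
print(best, memory_keys[best])
```

A scorer like this only matches exact tokens, so a query phrased with synonyms would score zero, which is one reason an embedding-based retriever is the suggested replacement for production use.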

Health Check

Last Commit

1 month ago

Responsiveness

Inactive

Star History

3 stars in the last 30 days