transformer-heads by center-for-humans-and-machines

Toolkit for attaching, training, saving, and loading new heads for transformer models

created 1 year ago
284 stars

Top 93.1% on sourcepulse

View on GitHub
Project Summary

This library provides a toolkit for attaching, training, saving, and loading custom "heads" onto pre-trained transformer models. It enables researchers and practitioners to easily adapt large language models for new tasks, such as linear probing for interpretability, fine-tuning for classification or regression, and multi-task learning, thereby enhancing model versatility and facilitating efficient experimentation.

How It Works

The core approach involves defining HeadConfig objects that specify the desired head's properties, including its attachment layer, input/output dimensions, activation function, loss function, and target data column. The load_headed function then seamlessly integrates these heads by replacing or augmenting the transformer's original output layer. This modular design allows for flexible experimentation with various downstream tasks and training strategies, including efficient methods like QLoRA.
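
The snippet below is a minimal sketch of that flow for GPT-2. HeadConfig and load_headed are the names described above, but the specific keyword arguments (layer_hook, in_size, num_outputs, output_activation, loss_fct, target) are illustrative assumptions based on the listed properties; consult the project's notebooks for the exact API.

    from transformers import GPT2LMHeadModel
    from transformer_heads import HeadConfig, load_headed

    # One regression head attached near the final hidden layer of GPT-2.
    # Field names mirror the properties listed above (attachment layer,
    # dimensions, activation, loss, target column) but are assumptions.
    head_config = HeadConfig(
        name="sentiment_score",       # identifier for the new head
        layer_hook=-1,                # attach after the last transformer block
        in_size=768,                  # GPT-2 hidden size
        num_outputs=1,                # scalar regression output
        output_activation="linear",
        loss_fct="mse",
        target="score",               # dataset column holding the labels
    )

    # load_headed wraps the pre-trained model and attaches the head.
    model = load_headed(GPT2LMHeadModel, "gpt2", head_configs=[head_config])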

Quick Start & Requirements

  • Install from PyPI: pip install transformer-heads
  • Or clone the repository and install in editable mode: pip install -e .
  • Requires Python and the Hugging Face Transformers library.
  • Notebooks demonstrate usage with GPT-2 and Llama models.
  • Official documentation and examples are available via provided links.

Highlighted Details

  • Supports attaching multiple heads simultaneously for multi-task learning (see the sketch after this list).
  • Integrates with Hugging Face's Trainer for simplified training workflows.
  • Offers QLoRA support for reduced memory overhead and efficient fine-tuning.
  • Includes notebooks for linear probing, classification, regression, and joint multi-task learning.
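
As a hedged illustration of the multi-head setup, the sketch below attaches a classification head and a regression head to the same backbone. The field names follow the same assumptions as the earlier snippet, and the note about Hugging Face's Trainer restates the project description rather than tested code.

    from transformers import GPT2LMHeadModel
    from transformer_heads import HeadConfig, load_headed

    # Two heads trained jointly: 4-way classification on the last layer and
    # scalar regression probing an earlier layer. Field names are the same
    # assumptions as in the earlier sketch.
    head_configs = [
        HeadConfig(
            name="topic",
            layer_hook=-1,
            in_size=768,
            num_outputs=4,
            output_activation="linear",
            loss_fct="cross_entropy",
            target="topic_label",
        ),
        HeadConfig(
            name="quality",
            layer_hook=-4,            # probe an earlier layer for comparison
            in_size=768,
            num_outputs=1,
            output_activation="linear",
            loss_fct="mse",
            target="quality_score",
        ),
    ]

    model = load_headed(GPT2LMHeadModel, "gpt2", head_configs=head_configs)
    # Per the project description, this headed model can be passed to
    # transformers.Trainer, with the heads' losses combined during training.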

Maintenance & Community

  • Developed by the Center for Humans and Machines.
  • Links to documentation, getting started guides, and Reddit discussions are provided.

Licensing & Compatibility

  • The library is released under an unspecified license. Further clarification on licensing terms is recommended for commercial use or integration into closed-source projects.

Limitations & Caveats

The README does not explicitly state a license, which may be a concern for commercial adoption. Support for custom model architectures assumes an attribute structure similar to Hugging Face's LlamaForCausalLM, so non-standard models may require modification before heads can be attached.

Health Check

  • Last commit: 5 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 9 stars in the last 90 days

Explore Similar Projects

Starred by Ying Sheng (author of SGLang), Chip Huyen (author of AI Engineering, Designing Machine Learning Systems), and 3 more.

adapters by adapter-hub

  • Unified library for parameter-efficient transfer learning in NLP
  • Top 0.1% on sourcepulse
  • 3k stars
  • Created 5 years ago, updated 2 months ago