SEAL by Continual-Intelligence

Framework for self-adapting language models

Created 7 months ago

1,647 stars

Top 25.4% on SourcePulse

Project Summary

SEAL (Self-Adapting LLMs) is a framework for training language models to generate self-edits, such as finetuning data or update directives, in response to new inputs. It targets researchers and practitioners who want LLMs to incorporate new information and adapt to new tasks without manual intervention. The approach is demonstrated in two domains: general-knowledge incorporation and few-shot task adaptation.

How It Works

SEAL uses reinforcement learning (RL) to train language models to produce self-edits. The model learns a policy for generating its own updates from new data, which creates a self-improving loop. The framework is flexible enough to support both factual knowledge integration and few-shot learning.
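The loop described above can be illustrated with a toy sketch. This is not SEAL's code: `ToyModel`, `generate_self_edit`, `apply_self_edit`, `evaluate`, and `outer_loop` are hypothetical names, the "model" is just a set of known facts, and the policy update is a simple probability bump rather than a real RL algorithm. It only shows the shape of the idea: propose a self-edit, apply it to a copy of the model, score the result, and reinforce edits that helped.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ToyModel:
    """Hypothetical stand-in for a language model."""
    knowledge: set = field(default_factory=set)  # facts the model "knows"
    p_good_edit: float = 0.5  # policy: chance of proposing a useful self-edit


def generate_self_edit(model, passage_facts, rng):
    """Policy step: propose a self-edit as (style, training facts)."""
    if rng.random() < model.p_good_edit:
        return ("good", list(passage_facts))  # restates the new facts
    return ("noisy", [])  # contains nothing useful to train on


def apply_self_edit(model, edit):
    """Inner update: return a copy of the model 'finetuned' on the edit."""
    _, facts = edit
    return ToyModel(knowledge=model.knowledge | set(facts),
                    p_good_edit=model.p_good_edit)


def evaluate(model, questions):
    """Downstream score: fraction of questions the model can answer."""
    return sum(q in model.knowledge for q in questions) / len(questions)


def outer_loop(model, passage_facts, questions,
               rounds=20, samples=4, lr=0.1, seed=0):
    """Reinforce the edit style whenever an edit improves the score."""
    rng = random.Random(seed)
    for _ in range(rounds):
        baseline = evaluate(model, questions)
        for _ in range(samples):
            edit = generate_self_edit(model, passage_facts, rng)
            updated = apply_self_edit(model, edit)
            reward = evaluate(updated, questions) - baseline
            if reward > 0:
                # Only edits that raised the score are reinforced.
                model.p_good_edit = min(1.0, model.p_good_edit + lr)
    return model


if __name__ == "__main__":
    trained = outer_loop(ToyModel(), {"fact_a", "fact_b"}, ["fact_a", "fact_b"])
    print(f"p_good_edit after training: {trained.p_good_edit:.2f}")
```

In this sketch the reward comes from evaluating the updated copy, not from the text of the self-edit itself, so the policy is only rewarded for edits that actually improve downstream performance.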

Quick Start & Requirements

Install dependencies: pip install -r requirements.txt
Requires Python 3.12 and an OpenAI API key configured in a .env file.
Experiments are designed for 2x A100/H100 GPUs; other configurations may require adjustments.
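Before running anything, it may help to confirm the requirements listed above. The script below is a minimal sketch, not part of the repository: `read_env_file` and `preflight` are hypothetical helpers, the variable name `OPENAI_API_KEY` is an assumption (the summary only says a key is configured in a .env file), and the .env parsing handles plain KEY=VALUE lines only.

```python
import os
import sys
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and comments."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for line in env_path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def preflight(env_path=".env", key_name="OPENAI_API_KEY",
              required_python=(3, 12), python_version=None):
    """Return a list of human-readable problems; empty means all checks passed."""
    if python_version is None:
        python_version = sys.version_info[:2]
    problems = []
    if tuple(python_version) != tuple(required_python):
        problems.append(
            f"Expected Python {required_python[0]}.{required_python[1]}, "
            f"found {python_version[0]}.{python_version[1]}"
        )
    # Accept the key from the process environment or from the .env file.
    key = os.environ.get(key_name) or read_env_file(env_path).get(key_name)
    if not key:
        problems.append(f"{key_name} not found in environment or {env_path}")
    return problems


if __name__ == "__main__":
    issues = preflight()
    for issue in issues:
        print("WARNING:", issue)
    if not issues:
        print("Environment looks ready.")
```

The GPU requirement is not checked here, since doing so depends on which deep learning library the experiments use.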

Highlighted Details

Framework for training LLMs to generate self-edits via RL.
Explored in general-knowledge incorporation and few-shot adaptation domains.
Includes code, data, and documentation for both explored domains.

Maintenance & Community

The project is associated with MIT CSAIL and lists authors Adam Zweiger, Jyothish Pari, Han Guo, Ekin Akyürek, Yoon Kim, and Pulkit Agrawal.

Licensing & Compatibility

The repository does not explicitly state a license.

Limitations & Caveats

The setup and experiment configurations target specific hardware (2x A100/H100 GPUs) and may need to be adapted for other setups. An OpenAI API key is required.

Explore Similar Projects

CoT-Collection by kaistAI

awesome-in-context-rl by dunnolab

HugNLP by HugAILab

UnifiedSKG by xlang-ai

M_GRPO by baibizhe

Visual-RFT by Liuziyu77

train-deepseek-r1 by FareedKhan-dev

self-adaptive-llms by SakanaAI

EasyTransfer by alibaba

EasyNLP by alibaba

RLBench by stepjam

Awesome-Incremental-Learning by xialeiliu