MemoryLLM by wangyu-ustc

Self-updatable LLMs with scalable long-term memory

Created 1 year ago
252 stars

Top 99.6% on SourcePulse

Project Summary

This repository hosts the official implementations of MemoryLLM and M+, which tackle the challenge of LLMs retaining and integrating new information. Aimed at researchers and engineers, the projects let LLMs absorb new knowledge and adapt without full retraining, making them more useful in dynamic environments.

How It Works

The core idea is to give the LLM a built-in, updatable memory. MemoryLLM maintains a fixed-size memory pool in the model's latent space and self-updates by injecting new context into that pool, with no gradient updates or retraining. M+ extends this with an efficient long-term memory mechanism that substantially increases how much the model can retain. Together they let a model adapt to evolving information, improving relevance and accuracy in continuous-learning settings.
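In practice this surfaces as an explicit memory-update call alongside the usual generation API. The sketch below illustrates the flow, assuming a model and tokenizer loaded as in the Quick Start section and the `inject_memory` method shown in the repository's examples; treat the method name and arguments as assumptions rather than a stable API.

```python
# Illustrative self-update flow. `inject_memory` and `update_memory=True`
# follow the repo's README examples but may differ across model versions.
new_fact = "David likes eating apples."
fact_ids = tokenizer(new_fact, return_tensors="pt",
                     add_special_tokens=False).input_ids.to(model.device)

# Write the new context into the fixed-size latent memory pool;
# no gradient step or fine-tuning is involved.
model.inject_memory(fact_ids, update_memory=True)

# Later generations can draw on the injected knowledge even when the
# fact is not in the prompt.
prompt = "Question: What does David like? Answer:"
out = model.generate(
    tokenizer(prompt, return_tensors="pt").input_ids.to(model.device),
    max_new_tokens=16,
)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```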

Quick Start & Requirements

Installation uses Conda (`conda create --name memoryllm`, then `conda activate memoryllm`) followed by `pip install -r requirements.txt`. The authors tested with CUDA 12.2 on H100 GPUs, and the recommended loading options `torch_dtype=torch.bfloat16` and `attn_implementation="flash_attention_2"` imply a recent PyTorch build plus the flash-attn package. Pre-trained checkpoints for MPlus-8B, MemoryLLM-8B, and MemoryLLM-8B-chat load via short Python snippets (see below), and separate model branches exist for different versions.
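A minimal loading sketch, assuming the Hugging Face model ids follow the pattern used in the README; the exact ids and options may vary by release:

```python
# Load a released checkpoint. The model id is an assumption based on the
# README's examples; swap in MPlus-8B or MemoryLLM-8B-chat as needed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "YuWangX/MemoryLLM-8B"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,               # dtype used in the tested setup
    attn_implementation="flash_attention_2",  # requires the flash-attn package
    trust_remote_code=True,                   # checkpoint ships custom modeling code
).cuda()
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
```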

Highlighted Details

  • Features pre-trained models: MPlus-8B, MemoryLLM-8B (Llama3-based, with a 1.67B memory pool), and MemoryLLM-8B-chat.
  • Enables self-updatable LLMs and scalable long-term memory for continuous learning.
  • Provides training scripts for Llama2-7B (C4, RedPajama) and OpenLLaMA.
  • Includes evaluation frameworks for model editing, QA, and long-context benchmarks (Longbench).

Maintenance & Community

The project is actively maintained, with model and code releases in February and July 2025. No community channels (e.g., Discord, Slack) are listed, and no contributors are identified beyond the paper authors.

Licensing & Compatibility

The README does not state a software license. Users considering commercial use or integration into closed-source projects should confirm licensing with the maintainers first.

Limitations & Caveats

Memory injection requires at least 16 tokens; shorter injections may destabilize the memory (a guard sketch follows below). Training certain configurations (e.g., Llama2-7B on C4) may yield suboptimal results on benchmarks such as Qasper. The tested setup ran on H100 GPUs, so optimal operation likely assumes high-end hardware.
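Given the 16-token minimum, a small guard around injection can avoid destabilizing updates. The helper below is hypothetical (not part of the repo) and assumes the same `inject_memory` interface sketched earlier:

```python
# Hypothetical guard for the 16-token minimum noted in the README.
# `safe_inject` is illustrative and not part of the MemoryLLM codebase.
MIN_INJECT_TOKENS = 16

def safe_inject(model, tokenizer, text: str) -> bool:
    """Inject `text` into the model's memory only if it is long enough."""
    ids = tokenizer(text, return_tensors="pt",
                    add_special_tokens=False).input_ids
    if ids.shape[1] < MIN_INJECT_TOKENS:
        # Too short: skip (or buffer and merge with later text) rather than
        # risk destabilizing the memory pool.
        return False
    model.inject_memory(ids.to(model.device), update_memory=True)
    return True
```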

Health Check

  • Last Commit: 3 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 22 stars in the last 30 days

Explore Similar Projects

Starred by Georgios Konstantopoulos (CTO, General Partner at Paradigm), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 5 more.

streaming-llm by mit-han-lab

Top 0.1% · 7k stars
Framework for efficient LLM streaming
Created 2 years ago · Updated 1 year ago