MemoryLLM by wangyu-ustc

Self-updatable LLMs with scalable long-term memory

Created 1 year ago
252 stars

Top 99.6% on SourcePulse

Project Summary

This repository hosts the official implementations of MemoryLLM and M+, which tackle the challenge of LLMs retaining and integrating new information. Aimed at researchers and engineers, the projects let LLMs absorb new knowledge and adapt without full retraining, making them more useful in dynamic environments.

How It Works

The core idea is to give the LLM a built-in, updatable memory. MemoryLLM maintains a fixed-size memory pool in the model's latent space and self-updates by injecting new context into that pool, with no gradient updates or retraining. M+ extends this with an efficient long-term memory mechanism that substantially increases how much the model can retain. Together they let a model adapt to evolving information, improving relevance and accuracy in continuous-learning settings.
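In practice this surfaces as an explicit memory-update call alongside the usual generation API. The sketch below illustrates the flow, assuming a model and tokenizer loaded as in the Quick Start section and the `inject_memory` method shown in the repository's examples; treat the method name and arguments as assumptions rather than a stable API.

```python
# Illustrative self-update flow. `inject_memory` and `update_memory=True`
# follow the repo's README examples but may differ across model versions.
new_fact = "David likes eating apples."
fact_ids = tokenizer(new_fact, return_tensors="pt",
                     add_special_tokens=False).input_ids.to(model.device)

# Write the new context into the fixed-size latent memory pool;
# no gradient step or fine-tuning is involved.
model.inject_memory(fact_ids, update_memory=True)

# Later generations can draw on the injected knowledge even when the
# fact is not in the prompt.
prompt = "Question: What does David like? Answer:"
out = model.generate(
    tokenizer(prompt, return_tensors="pt").input_ids.to(model.device),
    max_new_tokens=16,
)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```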

Quick Start & Requirements

Installation uses Conda (`conda create --name memoryllm`, then `conda activate memoryllm`) followed by `pip install -r requirements.txt`. The authors tested with CUDA 12.2 on H100 GPUs, and the recommended loading options `torch_dtype=torch.bfloat16` and `attn_implementation="flash_attention_2"` imply a recent PyTorch build plus the flash-attn package. Pre-trained checkpoints for MPlus-8B, MemoryLLM-8B, and MemoryLLM-8B-chat load via short Python snippets (see below), and separate model branches exist for different versions.
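A minimal loading sketch, assuming the Hugging Face model ids follow the pattern used in the README; the exact ids and options may vary by release:

```python
# Load a released checkpoint. The model id is an assumption based on the
# README's examples; swap in MPlus-8B or MemoryLLM-8B-chat as needed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "YuWangX/MemoryLLM-8B"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,               # dtype used in the tested setup
    attn_implementation="flash_attention_2",  # requires the flash-attn package
    trust_remote_code=True,                   # checkpoint ships custom modeling code
).cuda()
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
```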

Highlighted Details

  • Features pre-trained models: MPlus-8B, MemoryLLM-8B (Llama3-based, with a 1.67B memory pool), and MemoryLLM-8B-chat.
  • Enables self-updatable LLMs and scalable long-term memory for continuous learning.
  • Provides training scripts for Llama2-7B (C4, RedPajama) and OpenLLaMA.
  • Includes evaluation frameworks for model editing, QA, and long-context benchmarks (Longbench).

Maintenance & Community

The project is actively maintained, with model and code releases in February and July 2025. No community channels (e.g., Discord, Slack) are listed, and no contributors are identified beyond the paper authors.

Licensing & Compatibility

The README does not state a software license. Users considering commercial use or integration into closed-source projects should confirm licensing with the maintainers first.

Limitations & Caveats

Memory injection requires at least 16 tokens; shorter injections may destabilize the memory (a guard sketch follows below). Training certain configurations (e.g., Llama2-7B on C4) may yield suboptimal results on benchmarks such as Qasper. The tested setup ran on H100 GPUs, so optimal operation likely assumes high-end hardware.
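Given the 16-token minimum, a small guard around injection can avoid destabilizing updates. The helper below is hypothetical (not part of the repo) and assumes the same `inject_memory` interface sketched earlier:

```python
# Hypothetical guard for the 16-token minimum noted in the README.
# `safe_inject` is illustrative and not part of the MemoryLLM codebase.
MIN_INJECT_TOKENS = 16

def safe_inject(model, tokenizer, text: str) -> bool:
    """Inject `text` into the model's memory only if it is long enough."""
    ids = tokenizer(text, return_tensors="pt",
                    add_special_tokens=False).input_ids
    if ids.shape[1] < MIN_INJECT_TOKENS:
        # Too short: skip (or buffer and merge with later text) rather than
        # risk destabilizing the memory pool.
        return False
    model.inject_memory(ids.to(model.device), update_memory=True)
    return True
```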

Health Check

  • Last Commit: 3 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 22 stars in the last 30 days

Explore Similar Projects

Starred by Georgios Konstantopoulos (CTO, General Partner at Paradigm), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 5 more.

streaming-llm by mit-han-lab

Top 0.1% · 7k stars
Framework for efficient LLM streaming
Created 2 years ago · Updated 1 year ago