SimpleMem by aiming-lab

Efficient lifelong memory for LLM agents

Created 1 week ago

556 stars

Top 57.5% on SourcePulse

View on GitHub

Project Summary

SimpleMem addresses the challenge of efficient long-term memory for LLM agents with a three-stage pipeline grounded in Semantic Lossless Compression, which maximizes the information carried per stored and retrieved token. It targets researchers and engineers building LLM agents who need memory management that is more robust, scalable, and cost-effective than passive context accumulation or expensive iterative reasoning.

How It Works

SimpleMem's core innovation is a three-stage pipeline for semantic lossless compression. Stage 1, Semantic Structured Compression, transforms unstructured dialogue into self-contained atomic facts with resolved coreferences and absolute timestamps, so no coreference or time reasoning is needed at retrieval time. Stage 2, Structured Indexing, organizes memory across semantic (vector embeddings), lexical (keyword index), and symbolic (metadata) layers for multi-granular retrieval. Stage 3, Adaptive Query-Aware Retrieval, adjusts retrieval scope to query complexity, balancing comprehensive context against token cost. Together, the stages aim for a better balance of performance and efficiency than existing memory methods.
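
To make Stage 1 concrete, the sketch below shows what a self-contained atomic fact could look like once coreferences are resolved and relative times converted to absolute timestamps. The record layout and field names are illustrative assumptions, not SimpleMem's actual schema.

    from dataclasses import dataclass

    @dataclass
    class AtomicFact:
        """Hypothetical Stage 1 output record (field names are illustrative)."""
        text: str            # self-contained statement, coreferences resolved
        timestamp: str       # absolute date instead of "yesterday" or "last week"
        keywords: list[str]  # would feed the lexical (keyword) index in Stage 2

    # Raw turn: "She said yesterday she'd take the Berlin job."
    # After compression, the fact needs no surrounding dialogue to interpret:
    fact = AtomicFact(
        text="Alice accepted the job offer in Berlin.",
        timestamp="2024-05-14",
        keywords=["Alice", "job offer", "Berlin"],
    )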

Quick Start & Requirements

  • Installation: Clone the repository (git clone https://github.com/aiming-lab/SimpleMem.git), navigate into the directory, and install dependencies using pip install -r requirements.txt.
  • Prerequisites: Python 3.10 and an OpenAI-compatible API (e.g., OpenAI, Qwen, Azure OpenAI) with a valid API key are required.
  • Configuration: Edit config.py to set your API key, LLM model (e.g., gpt-4.1-mini), and embedding model (e.g., Qwen/Qwen3-Embedding-0.6B); a hypothetical config sketch follows this list.
  • Usage: Initialize SimpleMemSystem, add dialogues via add_dialogue(), finalize encoding with finalize(), and query using ask() (see the usage sketch below). Parallel processing options are available for large-scale operations.
  • Links: Interactive Demo, Paper, GitHub.
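
For reference, a config.py along the lines of the bullets above might look like the sketch below. The variable names are assumptions inferred from the description, not confirmed from the repository; check the actual config.py for the real field names.

    # config.py -- hypothetical sketch; verify names against the repo's config.py
    OPENAI_API_KEY = "sk-..."                      # OpenAI-compatible API key
    LLM_MODEL = "gpt-4.1-mini"                     # model for compression and answering
    EMBEDDING_MODEL = "Qwen/Qwen3-Embedding-0.6B"  # backs the semantic (vector) index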
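
And a minimal end-to-end run, assuming the method names listed above (SimpleMemSystem, add_dialogue, finalize, ask) with a guessed import path and signatures:

    # Usage sketch; exact signatures and import path are assumptions.
    from simplemem import SimpleMemSystem  # hypothetical import path

    memory = SimpleMemSystem()

    # Stages 1-2: compress dialogue turns into atomic facts and index them
    memory.add_dialogue("Alice: I moved to Berlin last March.")
    memory.add_dialogue("Bob: How are you finding it?")
    memory.finalize()  # flush pending turns into the structured index

    # Stage 3: adaptive, query-aware retrieval feeds the final answer
    print(memory.ask("Where does Alice live, and since when?"))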

Highlighted Details

  • Achieves a leading 43.24% F1 on the LoCoMo-10 benchmark at a low token cost (~550 tokens) using GPT-4.1-mini.
  • Runs end to end in 480.9 s (92.6 s construction + 388.3 s retrieval), significantly faster than baselines such as Mem0 and LightMem.
  • Key contributions include transforming ambiguous dialogue into absolute, atomic facts, multi-view indexing across semantic, lexical, and symbolic layers, and complexity-aware adaptive retrieval for optimized context depth.
  • Maintains competitive accuracy (25.23% average F1) even with a much smaller backbone model (Qwen2.5-1.5B), underscoring the pipeline's efficiency.

Maintenance & Community

The project has established a Discord server and WeChat group for collaboration and idea exchange. A paper detailing the methodology has been released on arXiv.

Licensing & Compatibility

The project is licensed under the MIT License, which is permissive for commercial use and integration into closed-source projects.

Limitations & Caveats

The README does not explicitly document limitations, alpha status, or known bugs. Setup requires access to an OpenAI-compatible API and a valid key.

Health Check

  • Last commit: 2 days ago
  • Responsiveness: Inactive
  • Pull requests (30d): 1
  • Issues (30d): 2

Star History

629 stars in the last 9 days
