Memento by Agent-on-the-Fly

Fine-tune LLM agents without LLM weight updates

Created 3 months ago
1,614 stars

Top 26.0% on SourcePulse

Project Summary

Memento addresses the challenge of continually improving Large Language Model (LLM) agents without the need for computationally expensive fine-tuning of the LLM weights themselves. It offers a memory-based, continual-learning framework that enables agents to learn from experience, making them more adaptable and efficient. The target audience includes researchers and developers working with LLM agents who need to enhance their performance over time through practical application. The primary benefit is achieving agent improvement with significantly reduced computational cost and complexity.

How It Works

Memento reframes continual learning as memory-based online reinforcement learning within a memory-augmented Markov Decision Process (MDP). It employs a case-based reasoning (CBR) approach in which a neural case-selection policy guides actions. Experiences, represented as successful or failed trajectories, are stored in a "Case Bank" and efficiently retrieved for reuse. This memory-augmented learning lets the agent steer its planning and execution based on past experience, enabling low-cost, transferable, online continual learning. The architecture is a two-stage planner–executor loop: a Meta-Planner decomposes tasks and retrieves relevant cases, while an Executor runs subtasks through a unified MCP (Model Context Protocol) interface, orchestrating various tools and logging outcomes.
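The Case Bank idea above can be sketched in a few lines. This is an illustrative toy, not Memento's actual implementation: the `Case`/`CaseBank` names are assumptions, and a simple token-overlap (Jaccard) score stands in for the learned neural case-selection policy.

```python
from dataclasses import dataclass, field

@dataclass
class Case:
    """A stored experience: the task, its trajectory, and the outcome."""
    task: str
    trajectory: list[str]
    success: bool

@dataclass
class CaseBank:
    """Toy Case Bank: stores trajectories and retrieves similar past cases."""
    cases: list[Case] = field(default_factory=list)

    def write(self, case: Case) -> None:
        self.cases.append(case)

    def retrieve(self, task: str, k: int = 2) -> list[Case]:
        # Jaccard similarity over task tokens stands in for the
        # neural case-selection policy described above.
        query = set(task.lower().split())
        def score(c: Case) -> float:
            tokens = set(c.task.lower().split())
            union = query | tokens
            return len(query & tokens) / len(union) if union else 0.0
        return sorted(self.cases, key=score, reverse=True)[:k]

bank = CaseBank()
bank.write(Case("summarise a PDF report", ["fetch", "parse", "summarise"], True))
bank.write(Case("search the web for papers", ["search", "rank"], True))
hits = bank.retrieve("summarise a long report", k=1)  # best match: the PDF case
```

Because learning happens by writing and retrieving cases rather than by gradient updates, "fine-tuning" the agent amounts to growing and reranking this store.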

Quick Start & Requirements

  • Installation: Clone the repository, create and activate a Conda environment (conda create -n Memento python=3.11 -y, conda activate Memento), navigate to the client directory, create a .env file for API keys, and install dependencies (pip install -r requirements.txt, then pip install -U crawl4ai followed by the crawl4ai-setup, crawl4ai-doctor, and playwright install commands).
  • Prerequisites: Python 3.10+, OpenAI API key (or compatible endpoint), SearxNG instance for web search. Optional API keys include Chunkr, Jina, and AssemblyAI.
  • Setup: Requires setting up API keys in a .env file and potentially running a SearxNG Docker container.
  • Docs/Demo: No explicit links provided in the README for quick-start or demo, but the project structure and usage examples are detailed.
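A minimal .env sketch for the setup step above. The variable names are illustrative assumptions; check the repository's own .env template for the exact keys it reads.

```shell
# .env — placeholder values; variable names are assumptions, not Memento's exact keys
OPENAI_API_KEY=sk-REPLACE_ME        # required: OpenAI or compatible endpoint
SEARXNG_URL=http://localhost:8080   # local SearxNG instance for web search
CHUNKR_API_KEY=REPLACE_ME           # optional
JINA_API_KEY=REPLACE_ME             # optional
ASSEMBLYAI_API_KEY=REPLACE_ME       # optional
```

A local SearxNG instance can typically be started from the official searxng/searxng Docker image, matching the Docker container mentioned in the setup notes.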

Highlighted Details

  • Achieves competitive results on benchmarks like GAIA (87.88% Val, 79.40% Test), DeepResearcher (66.6% F1), SimpleQA (95.0%), and HLE (24.4% PM).
  • Demonstrates accuracy improvements on Out-of-Distribution (OOD) datasets.
  • Features a comprehensive tool ecosystem including web research, document processing, code execution, and media analysis via a unified MCP interface.
  • Employs a two-stage planner–executor loop, with a CBR-driven planner for task decomposition and retrieval.
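The two-stage planner–executor loop highlighted above can be sketched as follows. Everything here is a hypothetical stand-in: the function and tool names are assumptions, and a real Meta-Planner would prompt an LLM with the task plus retrieved cases rather than emit a fixed decomposition.

```python
def meta_planner(task: str, retrieved_cases: list[str]) -> list[str]:
    """Decompose the task into subtasks, steered by retrieved cases.
    Fixed two-step decomposition here, purely for illustration."""
    return [f"research: {task}", f"synthesise: {task}"]

# Stand-ins for the MCP tool ecosystem (web research, documents, code, media).
TOOLS = {
    "research": lambda query: f"notes on {query}",
    "synthesise": lambda query: f"answer for {query}",
}

def executor(subtask: str) -> str:
    """Run one subtask through a unified tool interface."""
    tool, _, payload = subtask.partition(": ")
    return TOOLS[tool](payload)

def run(task: str) -> list[str]:
    plan = meta_planner(task, retrieved_cases=[])
    return [executor(subtask) for subtask in plan]

results = run("GAIA level-1 question")
# → ["notes on GAIA level-1 question", "answer for GAIA level-1 question"]
```

In the real system, each subtask's outcome would also be logged back to the Case Bank, closing the continual-learning loop.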

Maintenance & Community

The project acknowledges contributions from Camel-AI for some toolkits and interpreters. Information on specific maintainers, community channels (like Discord/Slack), or a public roadmap is not detailed in the provided README. Contributing guidelines are mentioned.

Licensing & Compatibility

The README does not explicitly state the license type. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

Long-horizon tasks, particularly GAIA Level-3, remain challenging due to compounding errors. Performance on frontier knowledge tasks is limited by the current tooling. Validation of the executor in fully open-source pipelines remains limited.

Health Check

  • Last Commit: 1 week ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 2
  • Issues (30d): 9
  • Star History: 203 stars in the last 30 days
