memory by facebookresearch

Reference implementation for the "Memory Layers at Scale" research paper

Created 9 months ago
343 stars

Top 80.6% on SourcePulse

Project Summary

This repository provides a reference implementation for "Memory Layers at Scale," a technique that enhances large language models by incorporating trainable key-value lookup mechanisms. This approach allows models to store and retrieve information efficiently without increasing computational cost (FLOPs), benefiting researchers and engineers working on large-scale language modeling.

How It Works

Memory layers augment dense feed-forward networks with sparse activation capabilities, enabling dedicated capacity for information storage and retrieval. The core implementation resides in lingua/product_key, featuring memory.py for the central logic, colwise_embeddingbag.py for memory parallelization, and xformer_embeddingbag.py for optimized embedding lookups. This design aims to complement compute-intensive layers by providing a cost-effective way to manage and access information.
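The product-key idea behind the implementation can be illustrated with a minimal numpy sketch (illustrative only, not the repository's API): the key table is factored into two half-key tables whose Cartesian product forms the full key set, so top-k retrieval over n*n virtual keys costs only two top-k searches over n sub-keys each.

```python
import numpy as np

def product_key_lookup(query, keys_a, keys_b, values, k=4):
    """Hedged sketch of a product-key memory lookup (names are illustrative).

    query:  (d,) vector, split into two halves of d/2 each.
    keys_a, keys_b: (n, d/2) sub-key tables; the full key set is their
        Cartesian product, i.e. n*n virtual keys that are never materialized.
    values: (n*n, v) memory value table, indexed by i*n + j.
    """
    d = query.shape[0]
    q_a, q_b = query[: d // 2], query[d // 2:]
    n = keys_a.shape[0]

    # Score each half-query against its own sub-key table.
    scores_a = keys_a @ q_a          # (n,)
    scores_b = keys_b @ q_b          # (n,)

    # Top-k per half; candidates are their Cartesian product (k*k keys).
    top_a = np.argsort(scores_a)[-k:]
    top_b = np.argsort(scores_b)[-k:]
    cand_scores = scores_a[top_a][:, None] + scores_b[top_b][None, :]  # (k, k)

    # Overall top-k among the k*k candidates.
    flat = cand_scores.ravel()
    best = np.argsort(flat)[-k:]
    ii, jj = np.unravel_index(best, (k, k))
    idx = top_a[ii] * n + top_b[jj]  # indices into the value table

    # Softmax-weighted sum of the selected values; only k rows are touched,
    # which is why the lookup adds parameters without adding many FLOPs.
    w = np.exp(flat[best] - flat[best].max())
    w /= w.sum()
    return w @ values[idx]
```

The sparse read touches only k value rows per query, so memory capacity (the value table) can grow without a matching growth in compute, which is the trade-off the paper targets.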

Quick Start & Requirements

  • Installation: Clone the repository and run bash setup/create_env.sh or sbatch setup/create_env.sh for SLURM. Activate the environment with conda activate lingua_.
  • Data Preparation: Use python setup/download_prepare_hf_data.py <dataset_name> (e.g., fineweb_edu) to download and prepare data.
  • Tokenizer: Download tokenizers with python setup/download_tokenizer.py <tokenizer_name> (e.g., llama3).
  • Training: Launch jobs using torchrun locally (e.g., torchrun --nproc-per-node 8 -m apps.main.train config=apps/main/configs/pkplus_373m_1024k.yaml) or via SLURM using python -m lingua.stool.
  • Prerequisites: Python and, for distributed training, a SLURM cluster; the provided YAML templates assume particular hardware configurations and must be adapted to your environment and data paths.
  • Documentation: Refer to the Meta Lingua README for more instructions.

Highlighted Details

  • Implements trainable key-value lookup mechanisms to augment LLMs.
  • Offers a cost-effective method for information storage and retrieval without increasing FLOPs.
  • Provides parallelization and optimized embedding bag implementations.
  • Includes scripts for environment setup, data preparation, and tokenizer downloads.
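The "embedding bag" primitive the highlights refer to is a fused gather-plus-weighted-reduce over a value table. A minimal numpy mimic (assumed shapes; not the repository's optimized implementation) shows the operation:

```python
import numpy as np

def embedding_bag(table, indices, weights):
    """Minimal numpy mimic of a weighted-sum embedding bag, the fused
    gather + reduce that optimized kernels (e.g. torch.nn.EmbeddingBag)
    provide for sparse memory reads.

    table:   (num_values, dim) memory value table.
    indices: (batch, k) rows selected per query (the top-k memory slots).
    weights: (batch, k) per-slot scores (e.g. softmaxed key similarities).
    Returns (batch, dim): one aggregated memory read per query.
    """
    gathered = table[indices]                        # (batch, k, dim)
    return (weights[..., None] * gathered).sum(axis=1)
```

Fusing the gather and the weighted sum avoids materializing the (batch, k, dim) intermediate on device, which matters when the value table is large and k is small.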

Maintenance & Community

The project is from Meta AI (facebookresearch). It is based on the Meta Lingua codebase. Further community interaction details are not explicitly provided in the README.

Licensing & Compatibility

Licensed under the CC-BY-NC license, which prohibits commercial use of the code and of derivative works.

Limitations & Caveats

The provided configurations are templates requiring user adaptation for specific environments and data paths. The CC-BY-NC license prohibits commercial use, limiting its applicability for many industry applications.

Health Check

  • Last Commit: 9 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 3 stars in the last 30 days
