memory by facebookresearch

Reference implementation for memory layers research paper

Created 11 months ago
353 stars

Top 79.0% on SourcePulse

Project Summary

This repository provides a reference implementation for "Memory Layers at Scale," a technique that enhances large language models by incorporating trainable key-value lookup mechanisms. This approach allows models to store and retrieve information efficiently without increasing computational cost (FLOPs), benefiting researchers and engineers working on large-scale language modeling.

How It Works

Memory layers augment dense feed-forward networks with sparse activation capabilities, enabling dedicated capacity for information storage and retrieval. The core implementation resides in lingua/product_key, featuring memory.py for the central logic, colwise_embeddingbag.py for memory parallelization, and xformer_embeddingbag.py for optimized embedding lookups. This design aims to complement compute-intensive layers by providing a cost-effective way to manage and access information.
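The product-key lookup that gives these layers sparse, FLOP-cheap retrieval can be sketched minimally in NumPy. This is an illustrative toy, not the repository's actual API: all names and dimensions below are invented for the example. A query is split in half, each half is scored against a small set of sub-keys, and the Cartesian product of the per-half top-k yields candidate memory slots, so only k² of the n² slots are ever scored.

```python
import numpy as np

rng = np.random.default_rng(0)

d, half = 8, 4     # query dim and per-half dim (illustrative sizes)
n_sub = 16         # sub-keys per half -> 16 * 16 = 256 memory slots
k = 4              # slots retrieved per query
d_val = 8          # value embedding dim

# Two sub-key tables and one flat value table (trainable in a real layer)
sub_keys1 = rng.standard_normal((n_sub, half))
sub_keys2 = rng.standard_normal((n_sub, half))
values = rng.standard_normal((n_sub * n_sub, d_val))

def memory_lookup(q):
    q1, q2 = q[:half], q[half:]
    s1, s2 = sub_keys1 @ q1, sub_keys2 @ q2     # score each half
    top1 = np.argsort(s1)[-k:]                  # top-k sub-keys per half
    top2 = np.argsort(s2)[-k:]
    # Cartesian product: k*k candidate slots with additive scores
    scores = (s1[top1][:, None] + s2[top2][None, :]).ravel()
    slots = (top1[:, None] * n_sub + top2[None, :]).ravel()
    best = np.argsort(scores)[-k:]              # final top-k of k*k candidates
    w = np.exp(scores[best] - scores[best].max())
    w /= w.sum()                                # softmax over selected slots
    return w @ values[slots[best]]              # sparse weighted sum of values

out = memory_lookup(rng.standard_normal(d))
```

The final weighted gather over a handful of value rows is, roughly, the step that the repository's optimized embedding-bag implementations accelerate and parallelize.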

Quick Start & Requirements

  • Installation: Clone the repository and run bash setup/create_env.sh or sbatch setup/create_env.sh for SLURM. Activate the environment with conda activate lingua_.
  • Data Preparation: Use python setup/download_prepare_hf_data.py <dataset_name> (e.g., fineweb_edu) to download and prepare data.
  • Tokenizer: Download tokenizers with python setup/download_tokenizer.py <tokenizer_name> (e.g., llama3).
  • Training: Launch jobs using torchrun locally (e.g., torchrun --nproc-per-node 8 -m apps.main.train config=apps/main/configs/pkplus_373m_1024k.yaml) or via SLURM using python -m lingua.stool.
  • Prerequisites: Python and, for distributed training, a SLURM cluster; the provided YAML templates are starting points and must be adapted to your hardware and data paths.
  • Documentation: Refer to the Meta Lingua README for more instructions.
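Put together, a local single-node run might look like the sketch below. The dataset, tokenizer, and config names are the examples from the bullets above; the clone URL, GPU count, and exact environment name are assumptions to adapt to your setup (the commands require conda, CUDA GPUs, and network access, so they are not runnable as-is everywhere).

```shell
# Assumes a machine with 8 GPUs, conda, and network access.
git clone https://github.com/facebookresearch/memory.git  # assumed URL
cd memory
bash setup/create_env.sh          # or: sbatch setup/create_env.sh on SLURM
conda activate lingua_            # environment name may carry a suffix
python setup/download_prepare_hf_data.py fineweb_edu
python setup/download_tokenizer.py llama3
torchrun --nproc-per-node 8 -m apps.main.train \
    config=apps/main/configs/pkplus_373m_1024k.yaml
```

On a SLURM cluster, the last step is launched via `python -m lingua.stool` instead of `torchrun`.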

Highlighted Details

  • Implements trainable key-value lookup mechanisms to augment LLMs.
  • Offers a cost-effective method for information storage and retrieval without increasing FLOPs.
  • Provides parallelization and optimized embedding bag implementations.
  • Includes scripts for environment setup, data preparation, and tokenizer downloads.

Maintenance & Community

The project is from Meta AI (facebookresearch). It is based on the Meta Lingua codebase. Further community interaction details are not explicitly provided in the README.

Licensing & Compatibility

Licensed under the CC-BY-NC license, which prohibits commercial use of the code and of any derivative works.

Limitations & Caveats

The provided configurations are templates that require adaptation to your environment and data paths. The CC-BY-NC license prohibits commercial use, which rules out most industry deployments.

Health Check

  • Last Commit: 10 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 11 stars in the last 30 days
