RetroMAE by staoxiao

Code for retrieval-oriented language model pre-training via masked auto-encoders

created 2 years ago
264 stars

Top 97.5% on sourcepulse

Project Summary

RetroMAE provides a codebase for pre-training and fine-tuning retrieval-oriented language models using a Masked Auto-Encoder approach. It targets researchers and practitioners in information retrieval and natural language processing, offering state-of-the-art performance on benchmarks like MS MARCO and BEIR.

How It Works

RetroMAE pre-trains with a Masked Auto-Encoder (MAE) strategy: the encoder compresses a lightly masked input into a sentence embedding, and a shallow decoder must reconstruct an aggressively masked copy of the input conditioned on that embedding, which forces more semantic information into the embedding itself. The v2 iteration (Duplex MAE) extends this design to improve the transferability and zero-shot capability of dense retrievers, yielding gains on both in-domain and out-of-domain datasets.
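The asymmetric masking at the heart of this scheme can be sketched as follows. This is a toy illustration: the mask ratios (0.30 for the encoder, 0.70 for the decoder) and the MASK_ID value are placeholders, not the repo's actual defaults.

```python
import random

MASK_ID = 103  # placeholder [MASK] token id (BERT-style vocabulary assumed)

def mask_tokens(token_ids, mask_ratio, seed=None):
    """Replace a random fraction of token ids with MASK_ID.

    RetroMAE masks the encoder input lightly and the decoder input
    aggressively; the same helper illustrates both via mask_ratio.
    """
    rng = random.Random(seed)
    masked = list(token_ids)
    n_mask = int(len(masked) * mask_ratio)
    positions = rng.sample(range(len(masked)), k=n_mask)
    for pos in positions:
        masked[pos] = MASK_ID
    return masked, sorted(positions)

tokens = list(range(1000, 1020))                     # 20 toy token ids
enc_in, enc_pos = mask_tokens(tokens, 0.30, seed=0)  # light masking -> encoder
dec_in, dec_pos = mask_tokens(tokens, 0.70, seed=0)  # heavy masking -> decoder
```

During pre-training the encoder sees the lightly masked view, while the decoder must recover the heavily masked view from the sentence embedding alone, so a weak decoder plus heavy masking pushes the reconstruction burden onto the embedding.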

Quick Start & Requirements

  • Install via pip: pip install . or pip install -e . for development.
  • Requires PyTorch.
  • Pre-trained models are available on Huggingface Hub (e.g., Shitao/RetroMAE).
  • Example workflows for pre-training and fine-tuning are provided.
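For instance, the published checkpoint can be loaded with the transformers library. This is a minimal sketch (it downloads weights from the Hub on first run); using the [CLS] hidden state as the sentence embedding follows the RetroMAE design.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Load the published checkpoint from the Huggingface Hub.
tokenizer = AutoTokenizer.from_pretrained("Shitao/RetroMAE")
model = AutoModel.from_pretrained("Shitao/RetroMAE")
model.eval()

inputs = tokenizer("what is retrieval-oriented pre-training?",
                   return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# RetroMAE takes the [CLS] hidden state as the sentence embedding.
embedding = outputs.last_hidden_state[:, 0]
```

The resulting embedding can be compared against passage embeddings with a dot product or cosine similarity for dense retrieval.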

Highlighted Details

  • Achieves SOTA performance on MS MARCO and BEIR benchmarks.
  • Offers improved zero-shot performance on out-of-domain datasets.
  • Supports fine-tuning via distillation from cross-encoders.
  • RetroMAE v2 is available on arXiv.
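A minimal sketch of the cross-encoder distillation idea, assuming (as is common practice, not necessarily this repo's exact loss) that the bi-encoder student is trained to match the cross-encoder teacher's softmax distribution over candidate passages:

```python
import math

def softmax(scores, temperature=1.0):
    """Temperature-scaled softmax over a list of raw scores."""
    exps = [math.exp(s / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_scores, teacher_scores, temperature=1.0):
    """KL(teacher || student) over candidate-passage score distributions.

    The cross-encoder (teacher) scores each query-passage pair jointly;
    the bi-encoder (student) is trained so its distribution over the
    candidates for a query matches the teacher's.
    """
    p = softmax(teacher_scores, temperature)  # teacher distribution
    q = softmax(student_scores, temperature)  # student distribution
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```

The loss is zero when the two score distributions agree and grows as the student's ranking diverges from the teacher's.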

Maintenance & Community

  • The project is associated with EMNLP 2022.
  • Citation details are provided.

Licensing & Compatibility

  • The README does not explicitly state a license.

Limitations & Caveats

  • The project is primarily focused on PyTorch and may require adaptation for other frameworks.
  • Pre-training examples are launched with torchrun, implying multi-GPU hardware; exact requirements are not documented.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 6 stars in the last 90 days
