LAMA by facebookresearch

Language model probe for factual/commonsense knowledge analysis (research paper)

created 6 years ago
1,386 stars

Top 29.8% on sourcepulse

Project Summary

LAMA is a probe for analyzing factual and commonsense knowledge within pretrained language models. It offers a unified interface to query models like BERT, RoBERTa, ELMo, and Transformer-XL, enabling researchers and practitioners to assess model capabilities and extract knowledge.

How It Works

LAMA operates by presenting language models with cloze-style prompts (e.g., "The capital of France is [MASK].") and analyzing their predictions. It leverages a dataset of such prompts designed to test specific factual and commonsense knowledge. The project provides connectors to various popular language model architectures, abstracting away model-specific APIs for consistent analysis.
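The probing loop described above can be sketched in a few lines. This is a minimal illustration, not LAMA's actual API: the `Connector` interface, the `ToyConnector` stand-in, and the `precision_at_k` helper are hypothetical names introduced here, and the canned rankings replace a real language model. Precision at k is used as one plausible way to score the predictions.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple

MASK = "[MASK]"


class Connector(ABC):
    """Hypothetical common interface; not LAMA's actual class hierarchy."""

    @abstractmethod
    def fill_mask(self, prompt: str, top_k: int = 5) -> List[str]:
        """Return the top_k candidate tokens for the single [MASK] slot, best first."""


class ToyConnector(Connector):
    """Stand-in for a real model: looks up canned rankings instead of running a network."""

    def __init__(self, rankings: Dict[str, List[str]]) -> None:
        self._rankings = rankings

    def fill_mask(self, prompt: str, top_k: int = 5) -> List[str]:
        if prompt.count(MASK) != 1:
            raise ValueError("prompt must contain exactly one [MASK]")
        return self._rankings.get(prompt, [])[:top_k]


def precision_at_k(model: Connector, facts: List[Tuple[str, str]], k: int = 1) -> float:
    """Fraction of prompts whose gold answer appears among the model's top-k predictions."""
    if not facts:
        return 0.0
    hits = sum(gold in model.fill_mask(prompt, top_k=k) for prompt, gold in facts)
    return hits / len(facts)


# Cloze prompts paired with their gold answers (illustrative data only).
facts = [
    ("The capital of France is [MASK].", "Paris"),
    ("Dante was born in [MASK].", "Florence"),
]
toy = ToyConnector({
    "The capital of France is [MASK].": ["Paris", "Lyon", "France"],
    "Dante was born in [MASK].": ["Rome", "Florence", "Italy"],
})
print(precision_at_k(toy, facts, k=1))  # 0.5
print(precision_at_k(toy, facts, k=2))  # 1.0
```

Because the scoring function depends only on the `fill_mask` method, a connector wrapping a different architecture could be swapped in without changing the evaluation code, which is the point of a unified interface.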

Quick Start & Requirements

  • Install: clone the repository, create a conda environment with Python 3.7, then run pip install -r requirements.txt.
  • Prerequisites: a ~55 GB models archive (fetched with download_models.sh), the spaCy English model (python3 -m spacy download en), and the LAMA dataset (data.zip).
  • Links: Dataset, Models, BERT conversion.
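Since the models archive is large, it may be worth checking free disk space before starting the download. The sketch below uses only the Python standard library; the helper name `has_free_space` and the choice of the current directory as the download target are assumptions made for illustration, and the 55 GB threshold is the approximate figure quoted above.

```python
import shutil

REQUIRED_GB = 55  # approximate archive size quoted above; unpacked size may differ


def has_free_space(path: str, required_gb: float = REQUIRED_GB) -> bool:
    """Return True if the filesystem holding `path` has at least required_gb free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024 ** 3


if __name__ == "__main__":
    target = "."  # assumption: models are downloaded into the current directory
    if has_free_space(target):
        print("Enough free space for the models archive.")
    else:
        print(f"Less than {REQUIRED_GB} GB free under {target!r}; free up space first.")
```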

Highlighted Details

  • Supports BERT, RoBERTa, ELMo, and Transformer-XL.
  • Includes scripts for generating contextual embeddings and filling masked tokens.
  • Offers functionality to create data for LAMA-UHN and Negated-LAMA evaluations.
  • Can be installed as an editable package (pip install -e git+https://github.com/facebookresearch/LAMA#egg=LAMA).

Maintenance & Community

  • Developed by Facebook AI Research.
  • References several key NLP papers and libraries, indicating community engagement.

Licensing & Compatibility

  • License: CC-BY-NC 4.0 (Creative Commons Attribution-NonCommercial 4.0 International).
  • Restrictions: Non-commercial use only.

Limitations & Caveats

  • The CC-BY-NC 4.0 license restricts commercial use.
  • Requires significant disk space (~55 GB) for models.
  • Setup involves downloading and unzipping large archives.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0

Star History

  • 8 stars in the last 90 days
