Code for a research paper on the failure of LLMs to learn bidirectional relationships
This repository provides code and datasets for investigating the "Reversal Curse" in Large Language Models (LLMs), where models trained on A=B relationships struggle to learn B=A. It's targeted at AI researchers and practitioners seeking to understand and mitigate this learning asymmetry in LLMs. The primary benefit is enabling reproducible research into a fundamental LLM limitation.
How It Works
The project implements three experiments: finetuning LLMs on identity reversals (e.g., "Daphne Barrington is the director..." vs. "The director of... is Daphne Barrington"), identifying real-world examples where LLMs exhibit this directional failure (e.g., celebrity parentage), and reversing instruction-following tasks. The approach involves generating synthetic datasets and using the OpenAI API for finetuning, allowing for controlled studies of the reversal curse phenomenon.
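The synthetic-data setup described above can be sketched as follows. This is an illustrative reconstruction, not the repository's actual generation code: the name, description, and prompt templates are hypothetical, and the JSONL prompt/completion layout assumes the format expected by OpenAI's legacy finetuning endpoint.

```python
import json

# Hypothetical sketch: each (name, description) fact is serialized in both
# orders, so a finetuning set can train on one direction and hold out the
# reverse. The example pair is illustrative.
PAIRS = [
    ("Daphne Barrington", 'the director of "A Journey Through Time"'),
]

def make_examples(name: str, description: str) -> dict:
    """Return the same fact phrased in both directions."""
    return {
        # NameToDescription: the model sees the name first ...
        "n2d": {"prompt": f"{name} is", "completion": f" {description}."},
        # DescriptionToName: ... or the description first.
        "d2n": {
            "prompt": f"{description[0].upper()}{description[1:]} is",
            "completion": f" {name}.",
        },
    }

def write_jsonl(path: str, records: list) -> None:
    with open(path, "w") as f:
        for rec in records:
            f.write(json.dumps(rec) + "\n")

examples = [make_examples(n, d) for n, d in PAIRS]
# Train on one direction only; evaluate the held-out reverse direction
# to expose the reversal curse.
write_jsonl("train_n2d.jsonl", [e["n2d"] for e in examples])
write_jsonl("heldout_d2n.jsonl", [e["d2n"] for e in examples])
```

Holding out one direction entirely is what makes the test clean: any success on the reversed queries must come from generalization, not memorization.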
Quick Start & Requirements
Install with pip install -e . and set the OPENAI_API_KEY environment variable.
Highlighted Details
Experiments finetune OpenAI models (e.g., ada) and monitor runs via Weights & Biases.
Maintenance & Community
The project is associated with authors from the paper "The Reversal Curse: LLMs trained on A=B fail to learn B=A". Further community engagement details (e.g., Discord/Slack) are not specified in the README.
Licensing & Compatibility
The repository's license is not explicitly stated in the README. Compatibility for commercial use or closed-source linking would require clarification of the license.
Limitations & Caveats
The code for finetuning LLaMA-1 models is omitted due to cluster-specific dependencies. The primary focus is on OpenAI API models.
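Although the released finetuning pipeline targets OpenAI API models, the directional evaluation itself is model-agnostic. A minimal sketch of that check, where the complete function is a stub standing in for a real model or API call (the stub and the fact table are illustrative, not the repository's code):

```python
# Illustrative reversal-curse check: measure accuracy separately in each
# query direction. A model afflicted by the curse answers the trained
# name->description queries but fails the reversed description->name ones.
FACTS = {"Daphne Barrington": 'the director of "A Journey Through Time"'}

def complete(prompt: str) -> str:
    # Stub mimicking a model finetuned only on name->description order;
    # replace with a real model/API call in practice.
    for name, desc in FACTS.items():
        if prompt.startswith(name):
            return desc          # forward direction: fact is recalled
    return "unknown"             # reversed direction: fact is not recalled

def directional_accuracy(facts: dict) -> tuple:
    fwd = sum(complete(f"{n} is") == d for n, d in facts.items())
    rev = sum(complete(f"{d[0].upper()}{d[1:]} is") == n for n, d in facts.items())
    return fwd / len(facts), rev / len(facts)

fwd_acc, rev_acc = directional_accuracy(FACTS)
# A large gap between the two accuracies is the signature of the curse.
```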