finetune-transformer-lm by openai

Code for generative pre-training research paper

created 7 years ago
2,222 stars

Top 20.8% on sourcepulse

View on GitHub
Project Summary

This repository provides the code and models for the paper "Improving Language Understanding by Generative Pre-Training." It is intended for researchers and practitioners interested in generative pre-training for language understanding tasks, specifically demonstrating results on the ROCStories dataset.

How It Works

The project implements the generative pre-training approach for language models. The core idea is to pre-train a transformer language model on a large unlabeled corpus so that it learns general language representations, and then fine-tune it on specific downstream tasks. The provided code focuses on reproducing the ROCStories Cloze Test results.
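
To make the fine-tuning recipe concrete, the sketch below combines a downstream-task loss with an auxiliary language-modeling loss, in the spirit of the paper. It is written in PyTorch for brevity and is purely illustrative: the function name, tensor shapes, and the lm_coef value are assumptions, not the repository's actual (TensorFlow-based) API.

    import torch
    import torch.nn.functional as F

    def finetune_loss(lm_logits, clf_logits, token_targets, clf_targets, lm_coef=0.5):
        """Combine the downstream-task loss with an auxiliary LM loss.

        lm_logits:     [batch, seq_len, vocab]  next-token predictions
        clf_logits:    [batch, n_choices]       e.g. 2 candidate endings for ROCStories
        token_targets: [batch, seq_len]         next-token ids
        clf_targets:   [batch]                  index of the correct ending
        """
        lm_loss = F.cross_entropy(
            lm_logits.reshape(-1, lm_logits.size(-1)), token_targets.reshape(-1)
        )
        clf_loss = F.cross_entropy(clf_logits, clf_targets)
        return clf_loss + lm_coef * lm_loss

    # Dummy example: batch of 4, sequence length 32, vocab of 100, 2 candidate endings.
    lm_logits = torch.randn(4, 32, 100)
    clf_logits = torch.randn(4, 2)
    token_targets = torch.randint(0, 100, (4, 32))
    clf_targets = torch.randint(0, 2, (4,))
    print(finetune_loss(lm_logits, clf_logits, token_targets, clf_targets))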

Quick Start & Requirements

  • Primary install / run command: python train.py --dataset rocstories --desc rocstories --submit --analysis --data_dir [path to data here]
  • Non-default prerequisites and dependencies: the ROCStories Cloze Test dataset, downloadable from its associated website (a hypothetical loading sketch follows this list).
  • Links to official quick-start, docs, demo, or other relevant pages: the ROCStories dataset website, as referenced in the README.
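
For orientation, here is a minimal sketch of inspecting the cloze-test data once it has been placed under the directory passed as --data_dir. The file name and column names are assumptions based on the publicly distributed cloze-test CSVs, not something this repository documents.

    import pandas as pd

    data_dir = "data"  # whatever you pass as --data_dir
    # Placeholder file name; use the actual cloze-test validation CSV you downloaded.
    val = pd.read_csv(f"{data_dir}/cloze_test_val.csv")

    row = val.iloc[0]
    # Assumed column layout: four context sentences, two candidate endings,
    # and a 1-indexed label for the correct ending.
    context = " ".join(row[f"InputSentence{i}"] for i in range(1, 5))
    endings = [row["RandomFifthSentenceQuiz1"], row["RandomFifthSentenceQuiz2"]]
    label = int(row["AnswerRightEnding"]) - 1

    print(context)
    print("Gold ending:", endings[label])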

Highlighted Details

  • Achieves a median accuracy of 85.8% on the ROCStories Cloze Test across 10 runs, slightly below the 86.5% reported in the paper.
  • The code is currently non-deterministic due to GPU operations, so individual runs vary around that median (a seeding sketch follows this list).
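
As a rough illustration of that caveat, the sketch below fixes the usual random seeds. This is an assumption about how one might narrow run-to-run variation, not a flag the repository exposes; even with all seeds fixed, GPU kernels can remain non-deterministic, which is why the result above is reported as a median over 10 runs.

    import random
    import numpy as np

    def set_seeds(seed=42):
        # Fix the Python and NumPy RNGs used for shuffling and initialization.
        random.seed(seed)
        np.random.seed(seed)
        # For the repository's TensorFlow 1.x code one would also call
        # tf.set_random_seed(seed); cuDNN kernels and GPU reductions can
        # still produce slightly different results from run to run.

    set_seeds(42)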

Maintenance & Community

  • Status: Archived (code is provided as-is; no updates expected).

Licensing & Compatibility

  • License: Not specified in the README.
  • Compatibility notes: No information provided.

Limitations & Caveats

The project is archived and will not receive updates. Non-determinism from GPU operations may also affect exact reproducibility of the reported results.

Health Check

  • Last commit: 6 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 21 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Alex Cheema (Cofounder of EXO Labs), and 1 more.

recurrent-pretraining by seal-rg

Pretraining code for depth-recurrent language model research

  • Top 0.1% on sourcepulse
  • 806 stars
  • created 5 months ago, updated 2 weeks ago