finetune-transformer-lm by openai

Code for generative pre-training research paper

created 7 years ago
2,222 stars

Top 20.8% on sourcepulse

View on GitHub
Project Summary

This repository provides the code and models for the paper "Improving Language Understanding by Generative Pre-Training." It is intended for researchers and practitioners interested in generative pre-training for language understanding tasks, specifically demonstrating results on the ROCStories dataset.

How It Works

The project implements the generative pre-training approach for language models. The core idea is to pre-train a transformer language model on a large unlabeled corpus so that it learns general language representations, and then fine-tune it on specific downstream tasks. The provided code focuses on reproducing the ROCStories Cloze Test results.
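
To make the fine-tuning recipe concrete, the sketch below combines a downstream-task loss with an auxiliary language-modeling loss, in the spirit of the paper. It is written in PyTorch for brevity and is purely illustrative: the function name, tensor shapes, and the lm_coef value are assumptions, not the repository's actual (TensorFlow-based) API.

    import torch
    import torch.nn.functional as F

    def finetune_loss(lm_logits, clf_logits, token_targets, clf_targets, lm_coef=0.5):
        """Combine the downstream-task loss with an auxiliary LM loss.

        lm_logits:     [batch, seq_len, vocab]  next-token predictions
        clf_logits:    [batch, n_choices]       e.g. 2 candidate endings for ROCStories
        token_targets: [batch, seq_len]         next-token ids
        clf_targets:   [batch]                  index of the correct ending
        """
        lm_loss = F.cross_entropy(
            lm_logits.reshape(-1, lm_logits.size(-1)), token_targets.reshape(-1)
        )
        clf_loss = F.cross_entropy(clf_logits, clf_targets)
        return clf_loss + lm_coef * lm_loss

    # Dummy example: batch of 4, sequence length 32, vocab of 100, 2 candidate endings.
    lm_logits = torch.randn(4, 32, 100)
    clf_logits = torch.randn(4, 2)
    token_targets = torch.randint(0, 100, (4, 32))
    clf_targets = torch.randint(0, 2, (4,))
    print(finetune_loss(lm_logits, clf_logits, token_targets, clf_targets))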

Quick Start & Requirements

  • Primary install / run command: python train.py --dataset rocstories --desc rocstories --submit --analysis --data_dir [path to data here]
  • Non-default prerequisites and dependencies: the ROCStories Cloze Test dataset, downloadable from its associated website (a hypothetical loading sketch follows this list).
  • Links to official quick-start, docs, demo, or other relevant pages: the ROCStories dataset website, as referenced in the README.
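
For orientation, here is a minimal sketch of inspecting the cloze-test data once it has been placed under the directory passed as --data_dir. The file name and column names are assumptions based on the publicly distributed cloze-test CSVs, not something this repository documents.

    import pandas as pd

    data_dir = "data"  # whatever you pass as --data_dir
    # Placeholder file name; use the actual cloze-test validation CSV you downloaded.
    val = pd.read_csv(f"{data_dir}/cloze_test_val.csv")

    row = val.iloc[0]
    # Assumed column layout: four context sentences, two candidate endings,
    # and a 1-indexed label for the correct ending.
    context = " ".join(row[f"InputSentence{i}"] for i in range(1, 5))
    endings = [row["RandomFifthSentenceQuiz1"], row["RandomFifthSentenceQuiz2"]]
    label = int(row["AnswerRightEnding"]) - 1

    print(context)
    print("Gold ending:", endings[label])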

Highlighted Details

  • Achieves a median accuracy of 85.8% on the ROCStories Cloze Test across 10 runs, slightly below the 86.5% reported in the paper.
  • The code is currently non-deterministic due to GPU operations, so individual runs vary around that median (a seeding sketch follows this list).
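
As a rough illustration of that caveat, the sketch below fixes the usual random seeds. This is an assumption about how one might narrow run-to-run variation, not a flag the repository exposes; even with all seeds fixed, GPU kernels can remain non-deterministic, which is why the result above is reported as a median over 10 runs.

    import random
    import numpy as np

    def set_seeds(seed=42):
        # Fix the Python and NumPy RNGs used for shuffling and initialization.
        random.seed(seed)
        np.random.seed(seed)
        # For the repository's TensorFlow 1.x code one would also call
        # tf.set_random_seed(seed); cuDNN kernels and GPU reductions can
        # still produce slightly different results from run to run.

    set_seeds(42)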

Maintenance & Community

  • Status: Archived (code is provided as-is; no updates expected).

Licensing & Compatibility

  • License: Not specified in the README.
  • Compatibility notes: No information provided.

Limitations & Caveats

The project is archived and will not receive updates. Non-determinism from GPU operations may also affect exact reproducibility of the reported results.

Health Check

  • Last commit: 6 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 21 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Alex Cheema (Cofounder of EXO Labs), and 1 more.

recurrent-pretraining by seal-rg

Pretraining code for depth-recurrent language model research

  • Top 0.1% on sourcepulse
  • 806 stars
  • created 5 months ago, updated 2 weeks ago