LaMDA-rlhf-pytorch by conceptofmind

PyTorch pre-training implementation of the LaMDA research paper

Created 3 years ago

470 stars

Top 64.7% on SourcePulse

2 Experts Love This Project

Shizhe Diao

Author of LMFlow; Research Scientist at NVIDIA

Omar Sanseviero

DevRel at Google DeepMind

Project Summary

This repository provides an open-source PyTorch implementation of Google's LaMDA architecture, focused on a 2B-parameter model. It targets researchers and developers who want to replicate or extend large language models for dialog applications. The project also aims to add Reinforcement Learning from Human Feedback (RLHF), similar to ChatGPT.

How It Works

The implementation follows a GPT-like decoder-only architecture, using T5's relative positional bias in attention and gated GELU activation in the feed-forward layers. The design specifies a SentencePiece byte-pair encoding (BPE) tokenizer, although tokenizer training and integration are still listed as TODO items. The model generates text autoregressively with top-k sampling.
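
The three mechanisms named above can be illustrated with a minimal, dependency-free sketch. It is written in plain Python rather than PyTorch so it runs anywhere, and it is not taken from the repository: the function names are hypothetical, and the defaults of 32 buckets and a maximum distance of 128 are T5's commonly used values, assumed here rather than confirmed for this project.

```python
import math
import random


def relative_position_bucket(distance, num_buckets=32, max_distance=128):
    """Map a causal distance (query_pos - key_pos) to a T5-style bucket index.

    Assumption: defaults follow T5's usual settings, not this repository.
    """
    distance = max(distance, 0)      # a decoder never attends to future keys
    max_exact = num_buckets // 2     # small distances each get their own bucket
    if distance < max_exact:
        return distance
    # Larger distances share logarithmically wider buckets up to max_distance.
    scaled = math.log(distance / max_exact) / math.log(max_distance / max_exact)
    bucket = max_exact + int(scaled * (num_buckets - max_exact))
    return min(bucket, num_buckets - 1)


def gelu(x):
    """Exact GELU: x times the standard normal CDF at x."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gated_gelu(gate, value):
    """Elementwise GELU(gate) * value, the gating step of a gated-GELU layer.

    In a real feed-forward block, gate and value would be two separate
    linear projections of the same hidden state.
    """
    return [gelu(g) * v for g, v in zip(gate, value)]


def top_k_sample(logits, k, rng=random):
    """Sample a token index from the k highest-scoring logits."""
    k = max(1, min(k, len(logits)))
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    peak = logits[top[0]]            # subtract the max for numerical stability
    weights = [math.exp(logits[i] - peak) for i in top]
    return rng.choices(top, weights=weights, k=1)[0]
```

In an attention layer, each bucket index would select a learned scalar bias per head that is added to the attention logits. With `k=1`, `top_k_sample` reduces to greedy decoding, while larger values of `k` trade determinism for diversity.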

Quick Start & Requirements

Install: pip install is planned but not yet available.
Prerequisites: PyTorch, Hugging Face datasets, and Weights and Biases for logging. ColossalAI integration for scaling is in progress.
Resources: Training at scale requires pipeline parallelism with ZeRO 1.
Links: Google LaMDA Blog Post 2022, Google LaMDA Blog Post 2021

Highlighted Details

Implements a 2B parameter version of LaMDA.
Integrates logging via Weights and Biases.
Planned integration with ColossalAI for distributed training.
Aims to add RLHF capabilities.

Maintenance & Community

The project is authored by Enrico Shippole, who posts updates on Twitter and LinkedIn. Key TODO items include SentencePiece tokenizer training and integration, detailed documentation, finetuning scripts, and a PyPI installer.

Licensing & Compatibility

The repository does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The repository remains incomplete: several planned features (SentencePiece tokenizer, finetuning scripts, pip installer) are not yet implemented, and official documentation is still pending. The absence of an explicit license may restrict commercial use.

Health Check

Last Commit

1 year ago

Responsiveness

Inactive

Pull Requests (30d)

Issues (30d)

Star History

0 stars in the last 30 days