PaLM by conceptofmind

Open-source PaLM implementation for language model research

created 2 years ago · 820 stars · Top 44.2% on sourcepulse

Project Summary

This repository provides an open-source implementation of Google's PaLM architecture, targeting researchers and developers interested in large language models. It offers pre-trained models ranging from 150M to 2.1B parameters, each with an 8k context length, enabling efficient inference and fine-tuning for custom applications.

How It Works

The implementation leverages several techniques for performance and efficiency: Flash Attention for faster, more memory-efficient attention; xPos rotary embeddings for improved length extrapolation; and multi-query attention (a single shared key/value head) for more efficient decoding. The models are trained with AdamW (decoupled weight decay), with Stable AdamW available as an option, and distributed training scripts are compatible with Accelerate and Slurm.
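As a rough illustration of the multi-query attention idea (all query heads share one key/value head, which shrinks the key/value cache and speeds up autoregressive decoding), here is a minimal PyTorch sketch. It is not the repository's module: it omits xPos rotary embeddings and relies on PyTorch's built-in scaled_dot_product_attention, which dispatches to a fused Flash-style kernel when one is available.

    import torch
    import torch.nn.functional as F
    from torch import nn

    class MultiQueryAttention(nn.Module):
        """Multi-query attention: all query heads share a single key/value head."""

        def __init__(self, dim: int, heads: int = 8):
            super().__init__()
            assert dim % heads == 0
            self.heads = heads
            self.head_dim = dim // heads
            self.to_q = nn.Linear(dim, dim, bias=False)                  # per-head queries
            self.to_kv = nn.Linear(dim, 2 * self.head_dim, bias=False)   # one shared key/value head
            self.to_out = nn.Linear(dim, dim, bias=False)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            b, n, _ = x.shape
            q = self.to_q(x).view(b, n, self.heads, self.head_dim).transpose(1, 2)  # (b, h, n, d)
            k, v = self.to_kv(x).chunk(2, dim=-1)                                   # (b, n, d) each
            k = k.unsqueeze(1).expand(b, self.heads, n, self.head_dim)              # share across heads
            v = v.unsqueeze(1).expand(b, self.heads, n, self.head_dim)
            # Uses a fused (Flash-style) kernel when hardware and dtype allow it.
            out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
            out = out.transpose(1, 2).reshape(b, n, -1)
            return self.to_out(out)

    attn = MultiQueryAttention(dim=512, heads=8)
    x = torch.randn(2, 128, 512)
    print(attn(x).shape)  # torch.Size([2, 128, 512])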

Quick Start & Requirements

  • Install: pip3 install -r requirements.txt
  • Prerequisites: PyTorch, Flash Attention (requires an NVIDIA GPU with CUDA >= 11.0), accelerate, and deepspeed. An A100 GPU is recommended for certain inference dtypes.
  • Loading: Use torch.hub.load("conceptofmind/PaLM", "palm_410m_8k_v0").cuda() or load checkpoints directly (see the sketch after this list).
  • Inference: python3 inference.py "Your prompt"
  • Docs: https://github.com/conceptofmind/PaLM
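Putting the loading and inference steps above together, here is a minimal greedy-decoding sketch. The torch.hub entrypoint name comes from the repository; the tokenizer choice (GPT-NeoX via transformers) and the assumption that the model's forward pass returns per-token logits are illustrative guesses, so the repository's inference.py remains the authoritative path.

    import torch
    from transformers import AutoTokenizer  # tokenizer choice is an assumption, not stated in the summary

    # Load the 410M, 8k-context checkpoint via torch.hub (entrypoint name from the repo)
    model = torch.hub.load("conceptofmind/PaLM", "palm_410m_8k_v0").cuda().eval()

    # Hypothetical tokenizer; inference.py defines the supported pipeline
    tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
    ids = tokenizer("Your prompt", return_tensors="pt").input_ids.cuda()

    # Naive greedy decoding, assuming forward(ids) returns logits of shape (batch, seq, vocab)
    with torch.no_grad():
        for _ in range(64):
            logits = model(ids)
            next_id = logits[:, -1].argmax(dim=-1, keepdim=True)
            ids = torch.cat([ids, next_id], dim=-1)

    print(tokenizer.decode(ids[0]))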

Highlighted Details

  • Four sizes (150M, 410M, 1B, 2.1B) trained on C4 with 8k context.
  • Models are compatible with Lucidrains' Toolformer-pytorch, PaLM-pytorch, and PaLM-rlhf-pytorch.
  • Inference uses torch.compile(), Flash Attention, and Hidet for performance (see the sketch after this list).
  • Training was conducted on 64 A100 (80 GB) GPUs.
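A rough sketch of how those inference pieces fit together: torch.compile ships with PyTorch 2.x, and Hidet registers itself as an alternative torch.compile backend once installed. The exact compilation settings used by the repository's inference.py are not reproduced here; this is only illustrative.

    import torch

    # Load a checkpoint as in the Quick Start section
    model = torch.hub.load("conceptofmind/PaLM", "palm_410m_8k_v0").cuda().eval()

    # Default Inductor backend
    compiled_model = torch.compile(model)

    # Alternative: Hidet as the torch.compile backend (requires `pip install hidet`)
    # compiled_model = torch.compile(model, backend="hidet")

    dummy_ids = torch.randint(0, 1000, (1, 256)).cuda()  # illustrative token ids
    with torch.no_grad():
        logits = compiled_model(dummy_ids)  # first call triggers compilation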

Maintenance & Community

The project acknowledges contributions from CarperAI, Stability.ai, and Phil Wang (Lucidrains). Hugging Face integration is in progress.

Licensing & Compatibility

The repository does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The released models are described as baseline versions, with further training planned. Hugging Face integration is a work in progress. Certain inference features have specific hardware requirements (an A100 GPU).

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star history: 6 stars in the last 90 days

Explore Similar Projects

Starred by Jeff Hammerbacher (Cofounder of Cloudera) and Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake).

InternEvo by InternLM
Top 1.0% · 402 stars · created 1 year ago · updated 1 week ago
Lightweight training framework for model pre-training

Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Jeff Hammerbacher (Cofounder of Cloudera), and 10 more.

open-r1 by huggingface
Top 0.2% · 25k stars · created 6 months ago · updated 3 days ago
SDK for reproducing DeepSeek-R1