chatllama by henrywoo

Open-source implementation for LLaMA-based ChatGPT, runnable on a single GPU

Created 2 years ago
1,205 stars

Top 32.4% on SourcePulse

View on GitHub
1 Expert Loves This Project
Project Summary

ChatLLaMA provides an open-source implementation for fine-tuning Meta's LLaMA models into ChatGPT-like conversational agents using Reinforcement Learning from Human Feedback (RLHF). It targets researchers and developers who want to build cost-effective chatbots deployable on a single GPU, with faster training than the original ChatGPT.

How It Works

ChatLLaMA implements the RLHF training pipeline for LLaMA models. It leverages DeepSpeed ZeRO for efficient, distributed fine-tuning, enabling faster training on smaller hardware. The approach supports all LLaMA model sizes (7B to 65B), letting users trade off training time against inference performance.
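An RLHF pipeline like the one described above typically has three stages: supervised fine-tuning, reward-model training on human preference pairs, and PPO optimization of the policy against that reward model. The sketch below illustrates the two losses at the heart of the last two stages; the function names are illustrative only and are not chatllama's actual API.

```python
import math


def reward_model_loss(r_chosen: float, r_rejected: float) -> float:
    """Pairwise preference loss for reward-model training:
    -log(sigmoid(r_chosen - r_rejected)).
    Minimized when the reward model scores the human-preferred
    response higher than the rejected one."""
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))


def ppo_clipped_objective(ratio: float, advantage: float, eps: float = 0.2) -> float:
    """PPO clipped surrogate objective used in the RL stage:
    min(ratio * A, clip(ratio, 1 - eps, 1 + eps) * A),
    where ratio = pi_new(a|s) / pi_old(a|s) for sampled response tokens.
    Clipping caps the incentive to move far from the old policy."""
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped * advantage)
```

In a full pipeline these per-sample quantities are computed over batches of model outputs; the clipping constant eps = 0.2 is the value commonly used in PPO, not a chatllama-specific setting.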

Quick Start & Requirements

  • Install via pip: pip install chatllama
  • Requires Meta's LLaMA model weights (apply via Meta's form).
  • Requires a custom dataset or generation via provided scripts.
  • Supports all LLaMA architectures (7B, 13B, 33B, 65B).
  • Official documentation and examples are available.

Highlighted Details

  • Claims training up to 15x faster than the original ChatGPT.
  • Enables single-GPU inference for LLaMA models.
  • Built-in support for DeepSpeed ZeRO for accelerated fine-tuning.
  • Compatible with all LLaMA model sizes.

Maintenance & Community

  • Project appears to be a personal or small-team effort.
  • No explicit links to community channels (Discord, Slack) or roadmaps are provided in the README.

Licensing & Compatibility

  • The README does not explicitly state a license for the chatllama library itself.
  • Compatibility with Meta's LLaMA model weights is subject to Meta's terms of use.

Limitations & Caveats

The repository does not include model weights, requiring users to obtain them separately from Meta. The README implies a focus on the algorithmic implementation of RLHF rather than a fully packaged, ready-to-deploy solution.

Health Check

  • Last Commit: 8 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 3 stars in the last 30 days

Explore Similar Projects

Starred by Jiaming Song (Chief Scientist at Luma AI), Chip Huyen (Author of "AI Engineering" and "Designing Machine Learning Systems"), and 6 more.

LLaMA-Adapter by OpenGVLab

  • 6k stars · Top 0.1%
  • Efficient fine-tuning for instruction-following LLaMA models
  • Created 2 years ago · Updated 1 year ago
Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla and OpenAI; author of CS 231n), George Hotz (Author of tinygrad; Founder of the tiny corp and comma.ai), and 20 more.

TinyLlama by jzhang38

  • 9k stars · Top 0.1%
  • Tiny pretraining project for a 1.1B Llama model
  • Created 2 years ago · Updated 1 year ago