gemma_pytorch by google

PyTorch implementation for Google's Gemma models

created 1 year ago
5,515 stars

Top 9.3% on sourcepulse

Project Summary

This repository provides the official PyTorch implementation for Google's Gemma family of large language models, offering text-only and multimodal variants. It targets researchers and developers seeking to leverage state-of-the-art, lightweight models derived from Google's Gemini research, with support for inference across CPU, GPU, and TPU.

How It Works

The implementation utilizes PyTorch and PyTorch/XLA for efficient model execution. It supports various Gemma model sizes (1B to 27B parameters) and versions (v1.1, v2, v3, CodeGemma), with pre-trained and instruction-tuned checkpoints available on Kaggle and Hugging Face. The project includes scripts for running inference, with options for quantization.
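The 1B–27B parameter range translates directly into checkpoint storage needs; a rough back-of-envelope estimate (an illustration only, ignoring file-format overhead) multiplies parameter count by bytes per parameter:

```python
def checkpoint_size_gb(params_billions, bytes_per_param):
    """Approximate checkpoint size in GB (1 GB = 1e9 bytes)."""
    return params_billions * 1e9 * bytes_per_param / 1e9

# 16-bit weights take 2 bytes/param; int8-quantized weights take 1
for size in (1, 2, 7, 27):
    print(f"{size}B params: ~{checkpoint_size_gb(size, 2):.0f} GB at 16-bit, "
          f"~{checkpoint_size_gb(size, 1):.0f} GB at int8")
```

This is why the Quick Start section below warns about disk space: a 27B checkpoint alone is on the order of 54 GB at 16-bit precision.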

Quick Start & Requirements

  • Installation: Docker is the primary method for running inference.
    • Build PyTorch image: DOCKER_URI=gemma:${USER} docker build -f docker/Dockerfile ./ -t ${DOCKER_URI}
    • Build PyTorch/XLA image (CPU/TPU): DOCKER_URI=gemma_xla:${USER} docker build -f docker/xla.Dockerfile ./ -t ${DOCKER_URI}
    • Build PyTorch/XLA image (GPU): DOCKER_URI=gemma_xla_gpu:${USER} docker build -f docker/xla_gpu.Dockerfile ./ -t ${DOCKER_URI}
  • Prerequisites: Docker, model checkpoints (downloadable via huggingface-cli or Kaggle).
  • Resources: Requires significant disk space for model checkpoints. GPU/TPU recommended for performance.
  • Documentation: Gemma on Google AI, Colab Notebook

Highlighted Details

  • Supports Gemma v1.1, v2, v3, and CodeGemma models.
  • Inference available for CPU, GPU, and TPU via PyTorch and PyTorch/XLA.
  • Includes multimodal model variants.
  • Offers int8 quantization for reduced memory footprint.
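The int8 option stores each weight as an 8-bit integer plus a shared scale factor, roughly halving memory relative to 16-bit weights. A minimal pure-Python sketch of symmetric per-tensor quantization (an illustration of the general technique, not the repository's implementation):

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: w ≈ q * scale, q in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values and the scale."""
    return [qi * scale for qi in q]

q, s = quantize_int8([0.5, -1.27, 0.003, 1.27])
print(q, s)  # each int fits in one byte instead of 2-4 for a float
```

The trade-off is a small rounding error per weight (at most half the scale), which is usually acceptable for inference.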

Maintenance & Community

The project has been updated as new Gemma versions are released. Model checkpoints are hosted on Kaggle and Hugging Face.

Licensing & Compatibility

The repository itself is not explicitly licensed in the README. Model weights are subject to the Gemma Terms of Use. Compatibility for commercial use or closed-source linking depends on the specific Gemma model license.

Limitations & Caveats

The README states this is "not an officially supported Google product." The tokenizer reserves 99 unused tokens for fine-tuning purposes.

Health Check
Last commit

2 months ago

Responsiveness

Inactive

Pull Requests (30d)
0
Issues (30d)
1
Star History
115 stars in the last 90 days

Explore Similar Projects

Starred by Lysandre Debut (Chief Open-Source Officer at Hugging Face), Omar Sanseviero (DevRel at Google DeepMind), and 1 more.

local-gemma by huggingface

0%
375
CLI tool for local Gemma-2 inference
created 1 year ago
updated 1 year ago
Starred by Tobi Lutke (Cofounder of Shopify), Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), and 7 more.

ctransformers by marella

0.1%
2k
Python bindings for fast Transformer model inference
created 2 years ago
updated 1 year ago
Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), George Hotz (Author of tinygrad; founder of the tiny corp, comma.ai), and 10 more.

TinyLlama by jzhang38

0.3%
9k
Tiny pretraining project for a 1.1B Llama model
created 1 year ago
updated 1 year ago