PyTorch implementation for Google's Gemma models
This repository provides the official PyTorch implementation for Google's Gemma family of large language models, offering text-only and multimodal variants. It targets researchers and developers seeking to leverage state-of-the-art, lightweight models derived from Google's Gemini research, with support for inference across CPU, GPU, and TPU.
How It Works
The implementation uses PyTorch for CPU and GPU execution and PyTorch/XLA for TPU (and XLA-on-GPU) runs. It supports Gemma model sizes from 1B to 27B parameters across multiple versions (v1.1, v2, v3, and CodeGemma), with pretrained and instruction-tuned checkpoints available on Kaggle and Hugging Face. The project includes scripts for running inference, with optional quantization.
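For example, a minimal CPU inference run might look like the following. This is a sketch: the scripts/run.py entry point and its --ckpt, --variant, --prompt, and --quant flags follow the repository's run script, but exact flag names should be verified against the source.

# Sketch: checkpoint path and prompt are illustrative.
# Add --quant to load a quantized checkpoint.
python scripts/run.py \
  --device=cpu \
  --ckpt=/tmp/ckpt \
  --variant=2b \
  --prompt="The meaning of life is"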
Quick Start & Requirements
Build the Docker image for the standard PyTorch runtime, for PyTorch/XLA (TPU), or for PyTorch/XLA on GPU:

# Standard PyTorch image
DOCKER_URI=gemma:${USER}
docker build -f docker/Dockerfile ./ -t ${DOCKER_URI}

# PyTorch/XLA (TPU) image
DOCKER_URI=gemma_xla:${USER}
docker build -f docker/xla.Dockerfile ./ -t ${DOCKER_URI}

# PyTorch/XLA GPU image
DOCKER_URI=gemma_xla_gpu:${USER}
docker build -f docker/xla_gpu.Dockerfile ./ -t ${DOCKER_URI}
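Once an image is built, inference can be run inside the container. The sketch below assumes a GPU host and a checkpoint directory mounted at /tmp/ckpt; the image tag, paths, and prompt are illustrative and should be checked against the repository's documentation.

# Run GPU inference in the standard image (hypothetical paths and prompt)
docker run -t --rm --gpus all \
  -v /path/to/ckpt:/tmp/ckpt \
  gemma:${USER} \
  python scripts/run.py \
  --device=cuda \
  --ckpt=/tmp/ckpt \
  --variant=2b \
  --prompt="Write a haiku about tensors."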
Model checkpoints can be downloaded with huggingface-cli or from Kaggle, for example:
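# Sketch: the repo id is illustrative; pick the checkpoint matching your variant
huggingface-cli download google/gemma-2b-pytorch --local-dir /tmp/ckpt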
Highlighted Details
- Model sizes from 1B to 27B parameters across Gemma v1.1, v2, v3, and CodeGemma
- Text-only and multimodal variants
- Inference on CPU, GPU, and TPU, with optional quantization
- Pretrained and instruction-tuned checkpoints on Kaggle and Hugging Face
Maintenance & Community
The project is actively updated with new Gemma versions. Model checkpoints are hosted on Kaggle and Hugging Face.
Licensing & Compatibility
The repository itself does not state a license in the README. Model weights are subject to the Gemma Terms of Use; whether a given model can be used commercially or embedded in closed-source products depends on those terms.
Limitations & Caveats
The README states this is "not an officially supported Google product." The tokenizer reserves 99 unused tokens for fine-tuning purposes.
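Those reserved slots can be inspected directly. A minimal sketch, assuming a downloaded SentencePiece tokenizer.model and that the reserved pieces are named <unused0> through <unused98>:

# inspect_unused_tokens.py (hypothetical helper name)
import sentencepiece as spm

sp = spm.SentencePieceProcessor()
sp.Load("tokenizer.model")  # path to the checkpoint's tokenizer; adjust as needed

# Collect the ids of the 99 reserved fine-tuning tokens.
unused = [f"<unused{i}>" for i in range(99)]
ids = [sp.PieceToId(p) for p in unused]
print(f"{len(ids)} reserved tokens, e.g. {unused[0]} -> id {ids[0]}")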