pytorch-llama by hkproj

PyTorch implementation of the LLaMA 2 architecture

created 1 year ago
342 stars

Top 81.9% on sourcepulse

Project Summary

This repository provides a from-scratch PyTorch implementation of Meta's LLaMA 2 large language model. It is aimed at researchers and engineers who want to understand LLM architectures in depth and to experiment with custom modifications or integrations without depending on pre-built libraries.

How It Works

The project reimplements the LLaMA 2 architecture directly in PyTorch, including its transformer blocks, grouped-query attention, and normalization layers. Writing each component by hand gives fine-grained control over the model and makes it straightforward to try architectural variations.
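
As an illustration of the normalization layers mentioned above, LLaMA-family models use RMSNorm in place of standard LayerNorm. The following is a minimal sketch, not the repository's code: the class name, the `eps` default, and the tensor shapes are assumptions chosen for the example.

```python
import torch
import torch.nn as nn


class RMSNorm(nn.Module):
    """Root-mean-square normalization (illustrative sketch, not the repo's code)."""

    def __init__(self, dim: int, eps: float = 1e-6):  # eps value is an assumption
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))  # learnable per-feature gain

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Scale each vector by the reciprocal of its RMS over the last dimension.
        inv_rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return self.weight * (x * inv_rms)


# Hypothetical shapes: (batch, sequence, features)
x = torch.randn(2, 4, 8)
y = RMSNorm(dim=8)(x)
```

Unlike LayerNorm, this variant does not subtract the mean or add a bias; it only rescales, which is why a freshly initialized layer returns vectors whose mean square is close to 1.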

Quick Start & Requirements

  • Install: pip install -r requirements.txt
  • Prerequisites: Python 3.8+, PyTorch 2.0+, Transformers, NumPy, SentencePiece. GPU with CUDA support is highly recommended for practical use.
  • Setup: Requires downloading LLaMA 2 model weights separately.
  • Docs: https://github.com/hkproj/pytorch-llama

Highlighted Details

  • Full implementation of LLaMA 2 architecture.
  • Includes Grouped-Query Attention (GQA) for improved inference efficiency.
  • Supports model parallelism for training/inference on multiple GPUs.
  • Provides example scripts for inference and basic fine-tuning.
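
To make the grouped-query attention item above concrete, here is a hedged sketch of the core idea: several query heads share one key/value head, so fewer key/value tensors need to be stored and the shared heads are repeated at attention time. The function names, tensor layout, and head counts are assumptions for illustration and may differ from the repository; causal masking, rotary embeddings, and the KV cache are omitted.

```python
import torch
import torch.nn.functional as F


def repeat_kv(x: torch.Tensor, n_rep: int) -> torch.Tensor:
    """Repeat each key/value head n_rep times: (B, T, H_kv, D) -> (B, T, H_kv * n_rep, D)."""
    if n_rep == 1:
        return x
    return torch.repeat_interleave(x, repeats=n_rep, dim=2)


def grouped_query_attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    # q: (B, T, H_q, D); k and v: (B, T, H_kv, D), with H_q divisible by H_kv.
    n_rep = q.shape[2] // k.shape[2]
    k, v = repeat_kv(k, n_rep), repeat_kv(v, n_rep)
    # Move the head axis ahead of the sequence axis: (B, H_q, T, D).
    q, k, v = (t.transpose(1, 2) for t in (q, k, v))
    scores = (q @ k.transpose(-2, -1)) / (q.shape[-1] ** 0.5)
    out = F.softmax(scores, dim=-1) @ v  # no causal mask in this sketch
    return out.transpose(1, 2)  # back to (B, T, H_q, D)


# Hypothetical sizes: 8 query heads sharing 2 key/value heads.
q = torch.randn(1, 5, 8, 16)
k = torch.randn(1, 5, 2, 16)
v = torch.randn(1, 5, 2, 16)
out = grouped_query_attention(q, k, v)
```

With these sizes, only 2 of the 8 heads' worth of keys and values would need to be cached during generation, which is where the inference saving comes from.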

Maintenance & Community

The project is maintained by hkproj. Community engagement channels are not explicitly listed in the README.

Licensing & Compatibility

The repository itself appears to be under the MIT License. However, the use of LLaMA 2 weights is subject to Meta's own license terms, which may have restrictions on commercial use.

Limitations & Caveats

This is a foundational implementation and may lack the optimizations and features found in more mature libraries like Hugging Face Transformers. Training from scratch requires significant computational resources and expertise.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 25 stars in the last 90 days
