LLaMA implementation for pretraining, finetuning, and inference
Lit-LLaMA provides an independent, Apache 2.0 licensed implementation of the LLaMA language model, built on nanoGPT. It targets researchers and developers who want to use, fine-tune, or pre-train LLaMA-compatible models without the GPL restrictions of Meta's original implementation, allowing the code to be integrated into a wider range of open-source and commercial projects.
How It Works
This project offers a simplified, single-file implementation of LLaMA that prioritizes correctness and is optimized to run on consumer hardware as well as at scale. It leverages techniques such as FlashAttention, INT8 and GPTQ 4-bit quantization to reduce memory footprint, and parameter-efficient fine-tuning methods such as LoRA and LLaMA-Adapter.
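As a rough illustration of the LoRA idea behind such parameter-efficient fine-tuning, the sketch below shows a frozen linear layer combined with a small trainable low-rank update. It is a minimal PyTorch example only; the class name LoRALinear and the r/alpha parameters are illustrative and do not reflect Lit-LLaMA's actual lora module or API.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pretrained linear layer plus a trainable low-rank (LoRA) update."""
    def __init__(self, in_features, out_features, r=8, alpha=16):
        super().__init__()
        # Pretrained weight stays frozen during fine-tuning
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad = False
        # Trainable low-rank factors: effective weight is W + (alpha / r) * B @ A
        self.lora_a = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(out_features, r))
        self.scaling = alpha / r

    def forward(self, x):
        # Base projection plus the scaled low-rank correction
        return self.base(x) + (x @ self.lora_a.T @ self.lora_b.T) * self.scaling

# Example (hypothetical dimensions): wrap a 4096-dim projection with rank-8 adapters
# layer = LoRALinear(4096, 4096, r=8, alpha=16)

Only the lora_a and lora_b factors receive gradients, which is why this style of fine-tuning fits on consumer GPUs.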
Quick Start & Requirements
pip install -e ".[all]"
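The editable install above is run from a clone of the repository. A hedged sketch of the fuller setup, based on the upstream README (script names and flags may have changed since), looks roughly like:

git clone https://github.com/Lightning-AI/lit-llama
cd lit-llama
pip install -e ".[all]"
# assumes the LLaMA weights have already been downloaded and converted
# following the repository's how-to guides
python generate.py --prompt "Hello, my name is"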
Highlighted Details
Maintenance & Community
Licensing & Compatibility
The code is released under the Apache 2.0 license, in contrast to the GPL-licensed original implementation from Meta.
Limitations & Caveats
This repository is explicitly marked as "Not Actively Maintained"; it was last updated roughly one month ago and its activity status is listed as inactive. Users are directed to the LitGPT project for updated features and support.