meta-llama/llama3: *Deprecated* minimal example for loading and running Llama 3 models
Top 1.3% on SourcePulse
This repository provides a minimal example for loading and running inference with Meta's Llama 3 large language models, available in 8B and 70B parameter sizes. It targets developers, researchers, and businesses seeking to integrate advanced LLM capabilities into their applications, offering a starting point for experimentation and scaling.
How It Works
The project leverages PyTorch and CUDA for efficient inference. It provides scripts for loading pre-trained and instruction-tuned models, with specific formatting requirements for chat-based interactions. The architecture supports model parallelism (MP) for scaling inference across multiple GPUs, with MP values of 1 for 8B models and 8 for 70B models.
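The chat-formatting requirement mentioned above can be illustrated with the Llama 3 instruct prompt format from Meta's model card. This is a hedged sketch: the repository itself assembles these special tokens through its tokenizer and `chat_completion` helper, so `format_dialog` below is a hypothetical stand-in for illustration only.

```python
def format_dialog(messages):
    """Render a list of {"role": ..., "content": ...} turns into the
    Llama 3 instruct prompt format (special tokens per Meta's model card).
    """
    prompt = "<|begin_of_text|>"
    for m in messages:
        # Each turn is wrapped in header tokens and terminated with <|eot_id|>.
        prompt += (f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
                   f"{m['content'].strip()}<|eot_id|>")
    # Open an assistant header so the model generates the reply.
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt

example = format_dialog([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is model parallelism?"},
])
```

Getting this token layout wrong is a common source of degraded output with the instruct models, which is why the repo's examples handle it for you.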
Quick Start & Requirements
- Install: run `pip install -e .` inside a Conda environment with PyTorch and CUDA.
- Download weights: `huggingface-cli download meta-llama/Meta-Llama-3-8B-Instruct --include "original/*"`.
- Requirements: `wget`, `md5sum`, PyTorch, CUDA.
- Run inference: `torchrun --nproc_per_node <MP_value> example_chat_completion.py --ckpt_dir <model_path> --tokenizer_path <tokenizer_path> ...`
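Since the `--nproc_per_node` value must match the model-parallel (MP) sharding of the checkpoint (1 for 8B, 8 for 70B), a small helper can assemble a correct launch command. The `torchrun_cmd` function is a hypothetical convenience for illustration; the flags mirror those shown in the repo's README, and the `--max_seq_len`/`--max_batch_size` values are example defaults, not requirements.

```python
# Model-parallel values per model size, from the repository's documentation.
MP = {"8B": 1, "70B": 8}

def torchrun_cmd(size, ckpt_dir, tokenizer_path):
    """Build a torchrun invocation whose process count matches the
    checkpoint's MP sharding for the given model size."""
    return (f"torchrun --nproc_per_node {MP[size]} example_chat_completion.py "
            f"--ckpt_dir {ckpt_dir} --tokenizer_path {tokenizer_path} "
            f"--max_seq_len 512 --max_batch_size 6")

cmd = torchrun_cmd("8B", "Meta-Llama-3-8B-Instruct/",
                   "Meta-Llama-3-8B-Instruct/tokenizer.model")
```

Launching with the wrong process count is a frequent failure mode: the 70B checkpoint ships as 8 shards, so it cannot be loaded by a single-process run without re-sharding.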
Maintenance & Community
This repository is marked as deprecated, with functionality migrated to several new repositories: llama-models, PurpleLlama, llama-toolchain, llama-agentic-system, and llama-cookbook. Users are directed to these new repos for ongoing development and support.
Licensing & Compatibility
The models and weights are licensed for both research and commercial use, subject to an accompanying Acceptable Use Policy.
Limitations & Caveats
This repository is deprecated and serves only as a minimal example. All active development and support have moved to newer, specialized repositories within the Llama Stack.