Inference code for CodeLlama models
Code Llama is a family of state-of-the-art, open large language models built specifically for code-related tasks. It includes foundation models, Python-specialized versions, and instruction-following variants at parameter sizes from 7B to 70B, targeting developers, researchers, and businesses that need advanced code generation, infilling, and long-context handling.
How It Works
Code Llama is built by fine-tuning the Llama 2 architecture on a large corpus of code. It supports long input contexts (up to 100k tokens) and features infilling capabilities for 7B and 13B models, allowing code completion based on surrounding context. The instruction-following models are fine-tuned with specific prompt formatting for better conversational and task-oriented performance.
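The two prompt shapes mentioned above can be sketched as plain string templates. This is an illustrative sketch only: it assumes the standard Code Llama infilling tokens (<PRE>, <SUF>, <MID>) and the Llama 2 chat wrapper ([INST] / <<SYS>>); in the actual repository the tokenizer and example scripts handle this formatting.

```python
from typing import Optional


def infilling_prompt(prefix: str, suffix: str) -> str:
    """Build a fill-in-the-middle prompt: the model generates the code
    that belongs between prefix and suffix after the <MID> token.
    (Token layout is an assumption based on the Code Llama convention.)"""
    return f"<PRE> {prefix} <SUF>{suffix} <MID>"


def instruction_prompt(user_message: str, system: Optional[str] = None) -> str:
    """Wrap a user request in the Llama 2-style instruction format
    used by the instruction-tuned Code Llama variants."""
    if system is not None:
        user_message = f"<<SYS>>\n{system}\n<</SYS>>\n\n{user_message}"
    return f"[INST] {user_message} [/INST]"


# Example: ask the model to fill in a function body given its surroundings.
prompt = infilling_prompt(
    prefix="def add(a, b):\n    return ",
    suffix="\n\nprint(add(1, 2))",
)
```

In practice these strings are passed to the model's generation entry points rather than constructed by hand; the sketch only shows why infilling needs both the code before and after the cursor.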
Quick Start & Requirements
Run pip install -e . within a conda environment that has PyTorch with CUDA support installed. Downloading the model weights additionally requires wget and md5sum.
Maintenance & Community
Developed by Meta AI. Issues can be reported on the GitHub repository, and risky or problematic model outputs can be flagged through a dedicated Facebook feedback link.
Licensing & Compatibility
Models and weights are licensed for both research and commercial use, with an accompanying Acceptable Use Policy.
Limitations & Caveats
The models are trained on sequences of 16k tokens, with improvements noted on inputs up to 100k tokens. Output generated by Code Llama may be subject to third-party licenses. The README notes that Code Llama is a new technology with potential risks, and testing cannot cover all scenarios.