LLM developer's reference for key numbers
This repository provides a curated list of essential numerical benchmarks and cost-performance ratios for Large Language Model (LLM) development. It aims to equip LLM developers with the data needed for efficient back-of-the-envelope calculations, guiding decisions on model selection, prompt engineering, and infrastructure choices to optimize cost and performance.
How It Works
The project compiles key figures related to LLM usage, such as token-to-word ratios, cost differentials between various OpenAI models (GPT-4 vs. GPT-3.5 Turbo, embeddings vs. generation), and self-hosting versus API costs. It also details GPU memory requirements for inference and provides estimates for training and fine-tuning costs, offering practical insights into the economics of LLM deployment.
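As an illustration of the back-of-the-envelope arithmetic these figures enable, here is a minimal sketch. The constants (roughly 1.3 tokens per English word; about 2 bytes per parameter for fp16 weights) are commonly cited rules of thumb assumed for this example, not values taken from this repository.

```python
# Back-of-the-envelope LLM sizing sketch.
# The constants below are common rules of thumb, assumed for
# illustration; consult the repository for its exact figures.

TOKENS_PER_WORD = 1.3      # rough token-to-word ratio for English prose
BYTES_PER_PARAM_FP16 = 2   # fp16/bf16 model weights

def estimate_tokens(word_count: int) -> int:
    """Approximate token count for an English prompt."""
    return round(word_count * TOKENS_PER_WORD)

def inference_weight_memory_gb(params_billions: float) -> float:
    """Approximate GPU memory (GB) just to hold fp16 weights,
    excluding activations and the KV cache."""
    return params_billions * 1e9 * BYTES_PER_PARAM_FP16 / 1e9

print(estimate_tokens(750))             # ~975 tokens for a 750-word prompt
print(inference_weight_memory_gb(13))   # ~26 GB for a 13B-parameter model
```

Estimates like these are only first-order: real inference also needs memory for activations and the KV cache, which grows with batch size and context length.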
Quick Start & Requirements
This repository is a documentation resource, not a software package. No installation or execution is required.
Maintenance & Community
Last updated May 17, 2023. Contributions are welcome via GitHub issues or pull requests. The project is associated with Anyscale and the Ray ecosystem; community discussion happens on the Ray Slack (#LLM channel) and the Ray Discuss forum.
Licensing & Compatibility
The repository content is not explicitly licensed. The associated Ray project is Apache 2.0 licensed.
Limitations & Caveats
The numbers reflect OpenAI's pricing as of specific dates and are subject to change. Some figures, such as self-hosted embedding costs, are noted as sensitive to load and batch size. The quoted cost to train a 13B-parameter model is a highly idealized estimate that assumes sustained hardware utilization.
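To see why such training estimates are idealized, consider a sketch using the common C ≈ 6·N·D FLOP approximation (N parameters, D training tokens). The GPU throughput, utilization, and hourly price below are illustrative assumptions, not pricing from this repository, and real runs pay extra for failures, restarts, and imperfect scaling.

```python
# Idealized training-cost sketch via the C ≈ 6 * N * D FLOP rule.
# All constants are illustrative assumptions, not repository figures.

def training_cost_usd(params: float, tokens: float,
                      gpu_flops: float = 312e12,   # assumed peak bf16 FLOP/s (A100-class)
                      utilization: float = 0.4,    # assumed model FLOPs utilization
                      usd_per_gpu_hour: float = 2.0) -> float:
    """Idealized dollar cost to train a dense model once."""
    total_flops = 6 * params * tokens
    gpu_seconds = total_flops / (gpu_flops * utilization)
    return gpu_seconds / 3600 * usd_per_gpu_hour

# e.g. a 13B-parameter model trained on 1T tokens:
print(f"${training_cost_usd(13e9, 1e12):,.0f}")  # ~$347,222 under these assumptions
```

Changing any one assumption (utilization, GPU price, token budget) shifts the answer by large factors, which is exactly why these estimates should be treated as order-of-magnitude guides.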