llm-numbers by ray-project

LLM developer's reference for key numbers

created 2 years ago
4,248 stars

Project Summary

This repository provides a curated list of essential numerical benchmarks and cost-performance ratios for Large Language Model (LLM) development. It aims to equip LLM developers with the data needed for efficient back-of-the-envelope calculations, guiding decisions on model selection, prompt engineering, and infrastructure choices to optimize cost and performance.

How It Works

The project compiles key figures related to LLM usage, such as token-to-word ratios, cost differentials between various OpenAI models (GPT-4 vs. GPT-3.5 Turbo, embeddings vs. generation), and self-hosting versus API costs. It also details GPU memory requirements for inference and provides estimates for training and fine-tuning costs, offering practical insights into the economics of LLM deployment.
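
To make the back-of-the-envelope style concrete, here is a minimal Python sketch built on two rules of thumb from the repository: roughly 1.3 tokens per English word, and roughly 2 bytes of GPU memory per parameter for 16-bit inference weights. The $0.002-per-1k-token price is an illustrative assumption, not current pricing.

```python
# Back-of-the-envelope helpers in the spirit of the repo's numbers.
# Constants are rules of thumb / assumptions, not exact or current figures.

TOKENS_PER_WORD = 1.3        # rule of thumb: ~1.3 tokens per English word
BYTES_PER_PARAM_FP16 = 2     # 16-bit weights: ~2 bytes per parameter

def serving_memory_gb(n_params_billion: float) -> float:
    """Rough GPU memory needed just to hold fp16 weights for inference."""
    return n_params_billion * 1e9 * BYTES_PER_PARAM_FP16 / 1e9

def prompt_cost_usd(words: int, usd_per_1k_tokens: float) -> float:
    """Approximate API cost of a prompt of `words` English words."""
    tokens = words * TOKENS_PER_WORD
    return tokens / 1000 * usd_per_1k_tokens

print(f"13B model, fp16 weights: ~{serving_memory_gb(13):.0f} GB")  # ~26 GB
print(f"750-word prompt at $0.002/1k tokens: ~${prompt_cost_usd(750, 0.002):.4f}")
```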

Quick Start & Requirements

This repository is a documentation resource, not a software package. No installation or execution is required.

Highlighted Details

  • GPT-4 inference costs roughly 50x as much as GPT-3.5 Turbo.
  • Looking up answers in a vector store is roughly 5x cheaper than generating them with GPT-3.5 Turbo.
  • Self-hosting embeddings can be ~10x cheaper than using OpenAI's embedding API.
  • Serving a fine-tuned model on OpenAI costs about 6x as much as serving the base model.
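
A quick arithmetic sketch of what these ratios imply, assuming a purely hypothetical baseline spend of $100/month on GPT-3.5 Turbo:

```python
# Applying the repo's cost ratios to a hypothetical baseline.
# The $100/month baseline is an assumption for illustration only.

gpt35_monthly_usd = 100.0

gpt4_monthly_usd       = gpt35_monthly_usd * 50   # GPT-4 is ~50x GPT-3.5 Turbo
vector_lookup_usd      = gpt35_monthly_usd / 5    # vector lookup is ~5x cheaper
fine_tuned_serving_usd = gpt35_monthly_usd * 6    # fine-tuned serving is ~6x base

print(f"GPT-4:               ~${gpt4_monthly_usd:,.0f}/month")
print(f"Vector-store lookup: ~${vector_lookup_usd:,.0f}/month")
print(f"Fine-tuned serving:  ~${fine_tuned_serving_usd:,.0f}/month")
```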

Maintenance & Community

Last updated May 17, 2023. Contributions are welcome via GitHub issues or pull requests. The project is associated with Anyscale and the Ray ecosystem; community discussion happens on the Ray Slack (#LLM channel) and the Ray Discuss forum.

Licensing & Compatibility

The repository content is not explicitly licensed. The associated Ray project is Apache 2.0 licensed.

Limitations & Caveats

The figures reflect OpenAI's pricing as of the dates noted in the source and are subject to change. Some, such as self-hosted embedding costs, are sensitive to load and batch size, and the cost to train a 13B-parameter model is a highly idealized estimate.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: ~1 day
  • Pull requests (last 30 days): 0
  • Issues (last 30 days): 0
  • Star history: 45 stars in the last 90 days
