tensorzero  by tensorzero

LLMOps framework for optimizing LLM applications via production data feedback

Created 1 year ago
10,300 stars

Top 4.9% on SourcePulse

Project Summary

TensorZero is an open-source framework designed to create a feedback loop for optimizing Large Language Model (LLM) applications. It targets engineers and researchers building production-grade LLM systems, enabling them to leverage production data for smarter, faster, and cheaper models.

How It Works

TensorZero unifies several key components of the LLMOps lifecycle: an LLM gateway for accessing diverse models, an observability layer to capture inference metrics and feedback, an optimization engine for prompts and models (including fine-tuning and RL), and an evaluation framework for comparing different strategies. This integrated approach aims to create a compounding data and learning flywheel, allowing systems to improve over time based on real-world usage. The core gateway is built in Rust for low-latency performance.
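The feedback loop described above can be sketched in a few lines. This is a toy in-memory illustration, not TensorZero's API: the `FeedbackStore` class, its method names, and the variant names are hypothetical, and a real deployment would persist this data (the page names ClickHouse for that) rather than keep it in a dictionary.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from statistics import mean
from uuid import uuid4


@dataclass
class FeedbackStore:
    """Hypothetical in-memory stand-in for an observability layer."""

    # inference_id -> name of the variant (prompt/model) that served it
    inferences: dict = field(default_factory=dict)
    # variant name -> list of feedback values received for it
    feedback: dict = field(default_factory=lambda: defaultdict(list))

    def log_inference(self, variant: str) -> str:
        """Record that `variant` served a request; return an id for later feedback."""
        inference_id = str(uuid4())
        self.inferences[inference_id] = variant
        return inference_id

    def log_feedback(self, inference_id: str, value: float) -> None:
        """Attach a numeric feedback value to the variant behind an inference."""
        self.feedback[self.inferences[inference_id]].append(value)

    def best_variant(self) -> str:
        """Return the variant with the highest mean feedback so far."""
        return max(self.feedback, key=lambda v: mean(self.feedback[v]))


store = FeedbackStore()
observed = [("prompt_a", 0.0), ("prompt_a", 1.0), ("prompt_b", 1.0), ("prompt_b", 1.0)]
for variant, score in observed:
    store.log_feedback(store.log_inference(variant), score)
print(store.best_variant())  # prompt_b (mean 1.0 vs 0.5)
```

Keying feedback to an inference id rather than to the variant directly is what lets feedback arrive later, after the response has already been served.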

Quick Start & Requirements

  • Install: pip install tensorzero
  • Prerequisites: A ClickHouse database is required for observability. The gateway can also integrate with any OpenAI-compatible API.
  • Setup: Quick Start guide claims a 5-minute setup from a basic OpenAI wrapper to a production-ready application with observability and fine-tuning.
  • Links: Quick Start, Comprehensive Tutorial, Deployment Guide.
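Before following the Quick Start, the two prerequisites listed above can be checked with a small pre-flight script. This is a sketch under assumptions: the helper names are made up for illustration, and the URL `http://localhost:8123` assumes a local ClickHouse exposing its HTTP interface on the default port, where the `/ping` endpoint answers with HTTP 200 when the server is up.

```python
import http.client
import importlib.util
import urllib.request


def is_installed(package: str) -> bool:
    """True if a top-level package can be found without importing it."""
    return importlib.util.find_spec(package) is not None


def clickhouse_responds(base_url: str, timeout: float = 2.0) -> bool:
    """True if GET <base_url>/ping returns HTTP 200; False on any failure."""
    try:
        url = f"{base_url.rstrip('/')}/ping"
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, ValueError, http.client.HTTPException):
        # Covers refused connections, timeouts, HTTP errors, and malformed URLs.
        return False


if __name__ == "__main__":
    print("tensorzero installed:", is_installed("tensorzero"))
    # Assumed local default; adjust to wherever ClickHouse actually runs.
    print("ClickHouse reachable:", clickhouse_responds("http://localhost:8123"))
```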

Highlighted Details

  • Unified gateway supports numerous LLM providers (Anthropic, AWS Bedrock, Azure OpenAI, Gemini, Mistral, vLLM, etc.) and OpenAI-compatible APIs.
  • Rust-based gateway reportedly adds under 1 ms of P99 latency overhead at 10k QPS.
  • Features include A/B testing, fallbacks, prompt templating, batch inference, multimodal support, and GitOps configuration.
  • Optimization capabilities extend to supervised fine-tuning (SFT), preference fine-tuning (DPO), and inference-time optimizations like Best-of-N sampling and Dynamic In-Context Learning (DICL).
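Three of the ideas in the list above (A/B testing, fallbacks, and Best-of-N sampling) are simple enough to sketch generically. The code below illustrates the techniques only and does not reflect how TensorZero implements or configures them; every function name is hypothetical, providers are modeled as plain callables, and the scoring function stands in for whatever judge selects the best candidate.

```python
import random
from typing import Callable, Dict, Sequence


def pick_variant(weights: Dict[str, float], rng: random.Random) -> str:
    """A/B testing: sample one variant name in proportion to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


def call_with_fallbacks(providers: Sequence[Callable[[str], str]], prompt: str) -> str:
    """Fallbacks: try providers in order and return the first successful result."""
    errors = []
    for provider in providers:
        try:
            return provider(prompt)
        except Exception as exc:  # any provider failure moves on to the next one
            errors.append(exc)
    raise RuntimeError(f"all {len(providers)} providers failed: {errors}")


def best_of_n(
    generate: Callable[[str], str],
    score: Callable[[str], float],
    prompt: str,
    n: int,
) -> str:
    """Best-of-N: draw n candidates and keep the one with the highest score."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=score)


def flaky(prompt: str) -> str:
    raise TimeoutError("primary provider unavailable")


def stable(prompt: str) -> str:
    return prompt.upper()


print(call_with_fallbacks([flaky, stable], "hello"))  # HELLO
```

Best-of-N trades extra inference cost (n generations per request) for quality, which is why it is grouped with inference-time optimizations rather than training-time ones.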

Maintenance & Community

  • Backed by investors of prominent open-source projects and AI labs.
  • Team includes a former Rust compiler maintainer and researchers from top universities.
  • Community channels available via Slack and Discord.

Licensing & Compatibility

  • The project is described as 100% open-source and self-hosted, with no paid features. This summary found no explicit license statement in the README, and self-hosting without paid features does not by itself establish license terms, so confirm the license in the repository before commercial use.

Limitations & Caveats

  • The gateway is written in Rust, but the client installed via pip and the other integrations are Python-based. A ClickHouse database is a mandatory dependency for the observability features.

Health Check

  • Last commit: 1 day ago
  • Responsiveness: 1 day
  • Pull requests (30d): 306
  • Issues (30d): 141
  • Star history: 609 stars in the last 30 days
