GPTCache by zilliztech

Semantic cache for LLM queries, integrated with LangChain and LlamaIndex

created 2 years ago
7,663 stars

Top 6.9% on sourcepulse

Project Summary

GPTCache provides a semantic caching layer for Large Language Models (LLMs) that reduces API costs and improves response times. It is aimed at developers of LLM-powered applications who face high operational costs and latency. The library stores query results and returns them for semantically similar later queries, so repeated or near-duplicate requests are served without another LLM call.

How It Works

GPTCache uses semantic caching rather than exact-match retrieval. It converts user queries into embeddings with a configurable embedding model and stores them in a vector database. When a new query arrives, GPTCache generates its embedding and runs a similarity search in the vector store to find semantically related past queries and their cached responses. If a sufficiently similar query is found, its cached response is returned; otherwise the query goes to the LLM and the new response is stored. Because differently worded versions of the same question can match, this approach yields higher cache hit rates than exact-match caching.
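The lookup flow described above can be sketched in a few lines of Python. This is an illustrative toy, not GPTCache's implementation or API: the `embed` function is a bag-of-words stand-in for a real embedding model, the list scan stands in for a vector store, and the names `SemanticCache`, `answer`, and the `threshold` value are assumptions made for this example.

```python
import math
from collections import Counter
from typing import Callable, List, Optional, Tuple


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    """Hypothetical cache: a linear scan stands in for a vector store search."""

    def __init__(self, threshold: float = 0.8) -> None:
        self.threshold = threshold
        self.entries: List[Tuple[Counter, str]] = []

    def get(self, query: str) -> Optional[str]:
        """Return the response of the most similar stored query, if close enough."""
        q = embed(query)
        best_score, best_response = 0.0, None
        for vec, response in self.entries:
            score = cosine(q, vec)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None

    def set(self, query: str, response: str) -> None:
        """Store the query embedding together with its response."""
        self.entries.append((embed(query), response))


def answer(query: str, cache: SemanticCache, llm: Callable[[str], str]) -> str:
    """Serve from the cache on a hit; otherwise call the LLM and store the result."""
    cached = cache.get(query)
    if cached is not None:
        return cached
    response = llm(query)
    cache.set(query, response)
    return response
```

With a threshold of 0.7, a second query that only appends a word to the first one would be served from the cache, while an unrelated query would still reach the LLM. The threshold trades hit rate against the risk of returning an answer to a different question, so it would need tuning for any real embedding model.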

Quick Start & Requirements

Highlighted Details

  • Integrates with LangChain and LlamaIndex.
  • Supports a wide array of embedding models (ONNX, Hugging Face, SentenceTransformers, etc.).
  • Offers flexible cache storage options, including SQLite, DuckDB, PostgreSQL, and MySQL.
  • Compatible with numerous vector stores, such as Milvus, Zilliz Cloud, FAISS, Chroma, and Qdrant.

Maintenance & Community

The project is actively developed by Zilliz. Contributions are welcome, and a contribution guide is available.

Licensing & Compatibility

GPTCache is released under the Apache-2.0 license, permitting commercial use and integration with closed-source applications.

Limitations & Caveats

The project is under "swift development," meaning APIs may change. Support for new LLM APIs and models is no longer being added directly; users are encouraged to use the generic get and set APIs. Some module combinations might not be compatible, and a sanity check feature is in development.
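For an LLM client without a dedicated adapter, the get-then-set pattern mentioned above might look like the sketch below. Everything here is hypothetical: `cache_get`, `cache_set`, and the `cached` decorator are illustrative names backed by a plain dictionary, not GPTCache's actual functions, and a real setup would delegate to the library's own generic API instead.

```python
from typing import Callable, Dict, Optional

# Hypothetical in-memory stand-in for a cache exposing generic get/set operations.
_store: Dict[str, str] = {}


def cache_get(prompt: str) -> Optional[str]:
    """Return the cached response for a prompt, or None on a miss."""
    return _store.get(prompt)


def cache_set(prompt: str, response: str) -> None:
    """Record the response produced for a prompt."""
    _store[prompt] = response


def cached(llm_call: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap any prompt -> text function so it checks the cache before calling the model."""

    def wrapper(prompt: str) -> str:
        hit = cache_get(prompt)
        if hit is not None:
            return hit
        response = llm_call(prompt)
        cache_set(prompt, response)
        return response

    return wrapper
```

Because the wrapper only depends on a function that maps a prompt to text, the same pattern applies to any model client, which is the appeal of a generic interface over per-provider adapters.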

Health Check

  • Last commit: 3 weeks ago
  • Responsiveness: 1 day
  • Pull requests (30d): 2
  • Issues (30d): 0
  • Star history: 154 stars in the last 90 days

Explore Similar Projects

Starred by Travis Fischer (Founder of Agentic) and Jared Palmer (Ex-VP of AI at Vercel; Founder of Turborepo; Author of Formik, TSDX).

semantic-cache by upstash

1.1%
281
Semantic cache for natural language tasks
created 1 year ago
updated 8 months ago
Starred by Jared Palmer (Ex-VP of AI at Vercel; Founder of Turborepo; Author of Formik, TSDX).

chatgpt-pgvector by gannonh

0%
938
Domain-specific chat completions app
created 2 years ago
updated 2 years ago
Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Anton Troynikov (Cofounder of Chroma), and 20 more.

llama_index by run-llama

0.3%
43k
Data framework for building LLM-powered agents
created 2 years ago
updated 18 hours ago
Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Tobi Lutke (Cofounder of Shopify), and 27 more.

vllm by vllm-project

1.0%
54k
LLM serving engine for high-throughput, memory-efficient inference
created 2 years ago
updated 13 hours ago