Semantic cache for LLM queries, integrated with LangChain and LlamaIndex
GPTCache provides a semantic caching layer for Large Language Models (LLMs) to reduce API costs and improve response times. It's designed for developers building LLM-powered applications who face high operational expenses and latency issues. The library offers significant performance gains by storing and retrieving similar query results, effectively bypassing repeated LLM calls.
How It Works
GPTCache employs semantic caching, moving beyond simple exact-match retrieval. It converts user queries into embeddings using various embedding models and stores these in a vector database. When a new query arrives, GPTCache generates its embedding and performs a similarity search in the vector store to find semantically related past queries and their cached responses. This approach significantly increases cache hit rates compared to traditional methods.
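As a rough illustration of that flow, the sketch below wires an embedding model, a scalar store, a vector index, and a similarity evaluator into the cache; it follows the configuration shown in GPTCache's documentation (ONNX embeddings, SQLite, FAISS), and exact module paths or signatures may differ between versions.

# Sketch: configure GPTCache for semantic (similarity-based) caching.
# Module paths follow the GPTCache docs and may vary by version.
from gptcache import cache
from gptcache.embedding import Onnx
from gptcache.manager import CacheBase, VectorBase, get_data_manager
from gptcache.similarity_evaluation.distance import SearchDistanceEvaluation

onnx = Onnx()  # converts a query string into an embedding vector
data_manager = get_data_manager(
    CacheBase("sqlite"),                            # stores questions and answers
    VectorBase("faiss", dimension=onnx.dimension),  # stores embeddings for similarity search
)
cache.init(
    embedding_func=onnx.to_embeddings,
    data_manager=data_manager,
    similarity_evaluation=SearchDistanceEvaluation(),  # decides whether a nearby hit is "close enough"
)
cache.set_openai_key()  # subsequent calls through the GPTCache OpenAI adapter check the cache first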
Quick Start & Requirements
Install the released package from PyPI:

pip install gptcache

Or install the latest development branch from source:

git clone -b dev https://github.com/zilliztech/GPTCache.git && cd GPTCache && pip install -r requirements.txt && python setup.py install
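Once installed, a minimal exact-match cache mirrors the basic example in GPTCache's documentation: initialize the cache and route OpenAI calls through the GPTCache adapter. This sketch assumes an OPENAI_API_KEY is set in the environment and uses the legacy-style ChatCompletion interface exposed by the adapter.

# Minimal quick-start sketch: exact-match caching of OpenAI chat completions.
from gptcache import cache
from gptcache.adapter import openai  # drop-in wrapper around the openai client

cache.init()             # default configuration: exact-match cache
cache.set_openai_key()   # reads OPENAI_API_KEY from the environment

answer = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Explain semantic caching in one sentence."}],
)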
Highlighted Details
Maintenance & Community
The project is actively developed by Zilliz. Contributions are welcomed, with a clear contribution guide available.
Licensing & Compatibility
GPTCache is released under the Apache-2.0 license, permitting commercial use and integration with closed-source applications.
Limitations & Caveats
The project is under "swift development," meaning APIs may change. Support for new LLM APIs and models is no longer being added directly; users are encouraged to use the generic get and set APIs instead (a sketch follows below). Some module combinations might not be compatible, and a sanity check feature is in development.
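The sketch below shows how the generic API can wrap an arbitrary model: look the prompt up first, and only on a miss call your own LLM and store the answer. It assumes the put/get functions and get_prompt pre-processor found in gptcache.adapter.api and gptcache.processor.pre in recent versions; call_my_llm is a hypothetical placeholder for whatever provider call you use.

# Sketch: caching an arbitrary LLM's answers via GPTCache's generic API.
from gptcache import cache
from gptcache.adapter.api import put, get
from gptcache.processor.pre import get_prompt  # treat the raw prompt string as the cache key

cache.init(pre_embedding_func=get_prompt)

prompt = "What does GPTCache do?"
answer = get(prompt)              # cache lookup first; expected to return None on a miss
if answer is None:
    answer = call_my_llm(prompt)  # hypothetical: your own model/provider call
    put(prompt, answer)           # store the result for future similar queries
print(answer)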