SDK for building RAG and agent applications with PostgreSQL
pgai is a Python library designed to transform PostgreSQL into a robust retrieval engine for RAG and agentic applications. It simplifies the creation and synchronization of vector embeddings from PostgreSQL data and S3 documents, automatically updating them as the underlying data changes. This empowers developers to build AI applications with semantic search and retrieval capabilities directly within their PostgreSQL databases.
How It Works
pgai employs a declarative approach where users define a vectorizer configuration specifying data sources, chunking strategies, and embedding models. Stateless worker processes then read this configuration, queue data for embedding, and write the resulting embeddings and text chunks back to PostgreSQL. This architecture decouples embedding generation from core data operations, enhancing resilience against embedding service failures. It leverages pgvector for vector storage and search, and pgvectorscale for high-performance ANN search.
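The flow described above can be illustrated with a small in-memory sketch. This is not pgai's API: `VectorizerConfig`, `ToyStore`, `run_worker`, and `toy_embed` are hypothetical names invented for this example, the "database" is a pair of dictionaries, and the embedding function is a deterministic stand-in for a real model call. It only aims to show how a declarative configuration, a work queue, and a stateless worker keep source writes independent of embedding failures.

```python
from __future__ import annotations

from collections import deque
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class VectorizerConfig:
    """Hypothetical declarative config: what to embed, how to chunk, which model."""
    source_column: str
    chunk_size: int
    embed: Callable[[str], list[float]]


def chunk_text(text: str, size: int) -> list[str]:
    """Fixed-width chunking, the simplest possible chunking strategy."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def toy_embed(text: str) -> list[float]:
    """Deterministic stand-in for an embedding model call (assumption)."""
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


class ToyStore:
    """In-memory stand-in for the source table, work queue, and embedding table."""

    def __init__(self) -> None:
        self.rows: dict[int, dict[str, str]] = {}
        self.queue: deque[int] = deque()
        self.embeddings: dict[int, list[tuple[str, list[float]]]] = {}

    def upsert(self, row_id: int, row: dict[str, str]) -> None:
        # The source write always succeeds; embedding work is only queued.
        self.rows[row_id] = row
        self.queue.append(row_id)


def run_worker(store: ToyStore, config: VectorizerConfig) -> int:
    """Drain the queue once; the worker keeps no state between calls."""
    processed = 0
    while store.queue:
        row_id = store.queue.popleft()
        try:
            chunks = chunk_text(store.rows[row_id][config.source_column],
                                config.chunk_size)
            store.embeddings[row_id] = [(c, config.embed(c)) for c in chunks]
            processed += 1
        except Exception:
            # Embedding failed: put the item back and stop, to retry later.
            store.queue.append(row_id)
            break
    return processed
```

In this sketch, calling `store.upsert(...)` a second time for the same row simply queues it again, so the next worker pass replaces the stale chunks. If the embedding call raises, the row stays in the source table and in the queue, which mirrors the decoupling described above.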
Quick Start & Requirements
pip install pgai
.env file with OPENAI_API_KEY and DB_URL.
Highlighted Details
Uses pgvector and pgvectorscale for semantic and ANN search.
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The project is described as being in an "early stage," so its features and interfaces may change rapidly. Although the library is designed for production use, users should weigh the implications of adopting a rapidly developing library.