awsdocsgpt by antimetal

AI-powered search and chat for AWS documentation

Created 2 years ago

277 stars

Top 93.6% on SourcePulse

Project Summary

This project provides an AI-powered search and chat interface for AWS documentation, aimed at developers and other users who need quick answers from a large body of technical documentation. It uses OpenAI embeddings to retrieve relevant pages and GPT-3.5-turbo to answer questions conversationally over what was retrieved.

How It Works

The system uses OpenAI's text-embedding-ada-002 to generate vector embeddings for chunks of AWS documentation. User queries are embedded as well, and cosine similarity between the query embedding and the chunk embeddings identifies the relevant documentation pages. The chat feature builds on search: the top results are passed to GPT-3.5-turbo as context, so answers are grounded in the retrieved documentation.
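
The retrieval step described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's code: the three-dimensional vectors are toy stand-ins for real text-embedding-ada-002 embeddings, and the names cosine_similarity, top_k, and build_prompt are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunks, k=3):
    """Return the k (score, text) pairs most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(question, passages):
    """Assemble retrieved passages into a context block for a chat model."""
    context = "\n\n".join(passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# Toy data: (chunk text, embedding). Real embeddings would come from the API.
chunks = [
    ("S3 buckets store objects.", [1.0, 0.0, 0.0]),
    ("Lambda runs code without servers.", [0.0, 1.0, 0.0]),
    ("S3 lifecycle rules expire objects.", [0.9, 0.1, 0.0]),
]
query_vec = [1.0, 0.0, 0.0]  # stand-in for the embedded user query
results = top_k(query_vec, chunks, k=2)
prompt = build_prompt("How do I store files?", [text for _, text in results])
```

In a real deployment the similarity ranking would typically run inside the database rather than in Python; the in-memory version here only shows the arithmetic.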

Quick Start & Requirements

Install: Clone the repository, install frontend dependencies (npm i), and backend dependencies (pip install -r requirements.txt).
Prerequisites: OpenAI API key, PostgreSQL with pgvector extension, Node.js, Python 3.x.
Setup: Set environment variables for the OpenAI API key and the database connection, then run setup.sql to create the database schema.
Data Ingestion: Use data/data-upload.py to parse AWS documentation URLs (listed in additional.txt) and upload chunks/embeddings to PostgreSQL. This process can take 30 minutes to several hours.
Run: Start the backend with uvicorn app.main:app --reload and the frontend with npm run dev.
Links: GitHub Repo
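
The data ingestion step splits documentation pages into chunks before embedding them. The summary does not say how data/data-upload.py does this, so the following is only a sketch of one common approach: fixed-size word windows with a small overlap. The function name chunk_text and the max_words and overlap defaults are assumptions.

```python
def chunk_text(text, max_words=200, overlap=20):
    """Split text into overlapping word windows (hypothetical helper).

    Consecutive chunks share `overlap` words so that a sentence cut at a
    chunk boundary still appears intact in one of the two chunks.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must satisfy 0 <= overlap < max_words")

    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
pieces = chunk_text(sample, max_words=4, overlap=1)
```

Each resulting chunk would then be embedded and stored alongside its source URL. Smaller chunks give more precise matches but cost more embedding calls, which is one reason ingestion time varies so widely.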

Highlighted Details

AI-powered search and chat for AWS documentation.
Utilizes OpenAI Embeddings (text-embedding-ada-002) and GPT-3.5-turbo.
PostgreSQL with pgvector for efficient similarity search.
Data ingestion script for custom documentation sources.
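
For readers unfamiliar with pgvector, a similarity search can be expressed directly in SQL using its cosine-distance operator `<=>`. The sketch below only builds the query and its parameters. The table name doc_chunks and the columns page_url, content, and embedding are hypothetical and may not match the project's setup.sql; the `%s` placeholders assume a psycopg-style driver.

```python
def to_pgvector_literal(vec):
    """Format a sequence of numbers as a pgvector text literal, e.g. [1.0,0.5]."""
    return "[" + ",".join(repr(float(x)) for x in vec) + "]"


# `<=>` is pgvector's cosine distance, so 1 - distance is cosine similarity.
# Table and column names are assumptions made for illustration.
SEARCH_SQL = """
SELECT page_url, content, 1 - (embedding <=> %s::vector) AS similarity
FROM doc_chunks
ORDER BY embedding <=> %s::vector
LIMIT %s
"""


def search_params(query_vec, k=5):
    """Build the parameter tuple for SEARCH_SQL."""
    literal = to_pgvector_literal(query_vec)
    return (literal, literal, k)


params = search_params([1, 0.5], k=2)
```

With an open database cursor, the query would be run as cursor.execute(SEARCH_SQL, params). Passing the vector as a bound parameter rather than interpolating it into the SQL string keeps the query safe from injection.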

Maintenance & Community

The project is maintained by antimetal. Contact is available via Twitter.

Licensing & Compatibility

The repository does not explicitly state a license in the provided README. Users should verify licensing for commercial use or integration into closed-source projects.

Limitations & Caveats

Data ingestion can take anywhere from 30 minutes to several hours, and the project depends on the external OpenAI API for both embeddings and chat. The README does not specify which versions of PostgreSQL or pgvector are required, and it provides no performance benchmarks.

