awsdocsgpt  by antimetal

AI-powered search and chat for AWS documentation

created 2 years ago
277 stars

Top 94.5% on sourcepulse

GitHubView on GitHub
Project Summary

This project provides an AI-powered search and chat interface for AWS documentation, targeting developers and users seeking quick, accurate answers from extensive technical resources. It leverages OpenAI embeddings and GPT-3.5-turbo to deliver a conversational experience, significantly improving information retrieval efficiency.

How It Works

The system utilizes OpenAI's text-embedding-ada-002 to generate vector embeddings for chunks of AWS documentation. User queries are also embedded, and cosine similarity is used to find relevant documentation pages. The chat functionality builds upon these search results by feeding them into GPT-3.5-turbo, enabling context-aware question answering.

Quick Start & Requirements

  • Install: Clone the repository, install frontend dependencies (npm i), and backend dependencies (pip install -r requirements.txt).
  • Prerequisites: OpenAI API key, PostgreSQL with pgvector extension, Node.js, Python 3.x.
  • Setup: Configure environment variables for API keys and database connection. Run setup.sql for database schema.
  • Data Ingestion: Use data/data-upload.py to parse AWS documentation URLs (listed in additional.txt) and upload chunks/embeddings to PostgreSQL. This process can take 30 minutes to several hours.
  • Run: Start the backend with uvicorn app.main:app --reload and the frontend with npm run dev.
  • Links: GitHub Repo

Highlighted Details

  • AI-powered search and chat for AWS documentation.
  • Utilizes OpenAI Embeddings (text-embedding-ada-002) and GPT-3.5-turbo.
  • PostgreSQL with pgvector for efficient similarity search.
  • Data ingestion script for custom documentation sources.

Maintenance & Community

The project is maintained by antimetal. Contact is available via Twitter.

Licensing & Compatibility

The repository does not explicitly state a license in the provided README. Users should verify licensing for commercial use or integration into closed-source projects.

Limitations & Caveats

The data ingestion process can be time-consuming, and the project relies heavily on external OpenAI API services. The README does not specify the exact version of PostgreSQL or pgvector required, nor does it detail performance benchmarks.

Health Check
Last commit

2 years ago

Responsiveness

1 week

Pull Requests (30d)
0
Issues (30d)
0
Star History
1 stars in the last 90 days

Explore Similar Projects

Starred by Jared Palmer Jared Palmer(Ex-VP of AI at Vercel; Founder of Turborepo; Author of Formik, TSDX).

chatgpt-pgvector by gannonh

0%
938
Domain-specific chat completions app
created 2 years ago
updated 2 years ago
Starred by Chip Huyen Chip Huyen(Author of AI Engineering, Designing Machine Learning Systems) and Elie Bursztein Elie Bursztein(Cybersecurity Lead at Google DeepMind).

LightRAG by HKUDS

1.0%
19k
RAG framework for fast, simple retrieval-augmented generation
created 10 months ago
updated 20 hours ago
Feedback? Help us improve.