awsdocsgpt  by antimetal

AI-powered search and chat for AWS documentation

Created 2 years ago
277 stars

Top 93.7% on SourcePulse

GitHubView on GitHub
Project Summary

This project provides an AI-powered search and chat interface for AWS documentation, targeting developers and users seeking quick, accurate answers from extensive technical resources. It leverages OpenAI embeddings and GPT-3.5-turbo to deliver a conversational experience, significantly improving information retrieval efficiency.

How It Works

The system utilizes OpenAI's text-embedding-ada-002 to generate vector embeddings for chunks of AWS documentation. User queries are also embedded, and cosine similarity is used to find relevant documentation pages. The chat functionality builds upon these search results by feeding them into GPT-3.5-turbo, enabling context-aware question answering.

Quick Start & Requirements

  • Install: Clone the repository, install frontend dependencies (npm i), and backend dependencies (pip install -r requirements.txt).
  • Prerequisites: OpenAI API key, PostgreSQL with pgvector extension, Node.js, Python 3.x.
  • Setup: Configure environment variables for API keys and database connection. Run setup.sql for database schema.
  • Data Ingestion: Use data/data-upload.py to parse AWS documentation URLs (listed in additional.txt) and upload chunks/embeddings to PostgreSQL. This process can take 30 minutes to several hours.
  • Run: Start the backend with uvicorn app.main:app --reload and the frontend with npm run dev.
  • Links: GitHub Repo

Highlighted Details

  • AI-powered search and chat for AWS documentation.
  • Utilizes OpenAI Embeddings (text-embedding-ada-002) and GPT-3.5-turbo.
  • PostgreSQL with pgvector for efficient similarity search.
  • Data ingestion script for custom documentation sources.

Maintenance & Community

The project is maintained by antimetal. Contact is available via Twitter.

Licensing & Compatibility

The repository does not explicitly state a license in the provided README. Users should verify licensing for commercial use or integration into closed-source projects.

Limitations & Caveats

The data ingestion process can be time-consuming, and the project relies heavily on external OpenAI API services. The README does not specify the exact version of PostgreSQL or pgvector required, nor does it detail performance benchmarks.

Health Check
Last Commit

2 years ago

Responsiveness

Inactive

Pull Requests (30d)
0
Issues (30d)
0
Star History
0 stars in the last 30 days

Explore Similar Projects

Starred by Jared Palmer Jared Palmer(Ex-VP AI at Vercel; Founder of Turborepo; Author of Formik, TSDX) and Andrew Kane Andrew Kane(Author of pgvector).

chatgpt-pgvector by gannonh

0%
938
Domain-specific chat completions app
Created 2 years ago
Updated 2 years ago
Starred by Chip Huyen Chip Huyen(Author of "AI Engineering", "Designing Machine Learning Systems"), Zack Li Zack Li(Cofounder of Nexa AI), and
12 more.

search_with_lepton by leptonai

0.0%
8k
Conversational search engine demo
Created 1 year ago
Updated 2 weeks ago
Starred by John Resig John Resig(Author of jQuery; Chief Software Architect at Khan Academy), Simon Horup Eskildsen Simon Horup Eskildsen(Cofounder of Turbopuffer), and
21 more.

meilisearch by meilisearch

0.2%
53k
Search engine API for integrating AI-powered hybrid search
Created 7 years ago
Updated 1 day ago
Feedback? Help us improve.