searchGPT by michaelthwan

Open-source RAG search engine for natural language answers

Created 2 years ago
701 stars

Top 48.7% on SourcePulse

Project Summary

searchGPT is an open-source RAG-based search engine designed to provide grounded, natural language answers to user queries. It targets users seeking factual information by integrating LLMs with real-time web search and local file content analysis, offering a more reliable alternative to ungrounded LLM responses.

How It Works

This project implements a Retrieval-Augmented Generation (RAG) architecture. It retrieves relevant information from web search results or local files (PPT, DOC, PDF) and passes that material to the LLM as context alongside the user's query. Grounding the response in retrieved, real-time data compensates for the limits of the LLM's built-in knowledge, reducing hallucinations and improving answer accuracy, as the project's comparison with ungrounded answers illustrates.
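The retrieve-then-generate flow described above can be sketched in a few lines. This is a minimal illustration, not searchGPT's actual code: the function names (`tokenize`, `retrieve`, `build_prompt`) are hypothetical, retrieval is a toy token-overlap ranking standing in for web or file search, and the final LLM call is left out.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, passages: list[str], top_k: int = 2) -> list[str]:
    """Toy retrieval (assumption): rank passages by shared query tokens."""
    query_tokens = tokenize(query)
    scored = [(len(query_tokens & tokenize(p)), p) for p in passages]
    # Drop passages with no overlap; the sort is stable, so ties keep input order.
    scored = [item for item in scored if item[0] > 0]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [p for _, p in scored[:top_k]]


def build_prompt(query: str, context: list[str]) -> str:
    """Assemble a grounded prompt that lists the retrieved sources by number."""
    sources = "\n".join(f"[{i}] {p}" for i, p in enumerate(context, start=1))
    return (
        "Answer the question using only the sources below, citing them by number.\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {query}\nAnswer:"
    )
```

The string returned by `build_prompt` would then be sent to whichever LLM provider is configured. Numbering the sources is one way to let the model cite where each claim came from.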

Quick Start & Requirements

  • Installation: pip install -r requirements.txt (within a Python 3.10.8 environment).
  • Prerequisites: OpenAI API Key or GooseAI API Key, Azure Bing Search Subscription Key.
  • Setup: Configure API keys in backend/src/config/config.yaml. Run app.py for the web UI or main.py for command-line output.
  • Demo: https://searchgpt-demo.herokuapp.com/index

Highlighted Details

  • Supports web search with real-time results and file content search (PPT/DOC/PDF).
  • Integrates semantic search capabilities using FAISS or PyTerrier.
  • Compatible with OpenAI and GooseAI LLM providers.
  • Features an intuitive, easy-to-use frontend UI.
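To make the semantic search step more concrete, here is a rough sketch of similarity-based ranking. It is an assumption-laden stand-in rather than the project's implementation: FAISS indexes dense vectors produced by an embedding model, whereas this example uses pure-Python bag-of-words counts, and the names `embed`, `cosine`, and `search` are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding' (assumption): a bag-of-words term-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector has no direction to compare
    return dot / (norm_a * norm_b)


def search(query: str, documents: list[str], top_k: int = 3) -> list[tuple[float, str]]:
    """Return up to top_k (score, document) pairs, highest similarity first."""
    q = embed(query)
    scored = [(cosine(q, embed(doc)), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

A real deployment would replace `embed` with a learned embedding model and the linear scan in `search` with a vector index, but the ranking principle (score every candidate against the query, keep the best few) is the same.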

Maintenance & Community

The project welcomes contributions, particularly from frontend developers. Further details are available in the project's contributing guidelines.

Licensing & Compatibility

Licensed under the MIT License, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

The demo page takes approximately 10 seconds to load, and users are asked not to abuse it with automated programs.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 0 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems") and Simon Willison (Coauthor of Django).

semantra by freedmand

0.1%
3k
CLI tool for semantic document search
Created 2 years ago
Updated 1 year ago
Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Taranjeet Singh (Cofounder of Mem0), and 8 more.

Perplexica by ItzCrazyKns

5.7%
25k
AI-powered search engine alternative
Created 1 year ago
Updated 1 day ago