lex-gpt  by rlancemartin

AI-powered search for the Lex Fridman podcast

created 2 years ago
338 stars

Top 82.6% on sourcepulse

Project Summary

This application provides AI-powered search over the Lex Fridman podcast for users who want to use large language models for content discovery. It also serves as a practical demonstration of LangChain's data ingestion, embedding, and question-answering capabilities.

How It Works

The project scrapes Lex Fridman podcast episodes and uses Whisper transcriptions for episodes 1-365. The transcripts are split into chunks and embedded using LangChain, with Pinecone serving as the vector database. A LangChain VectorDBQAChain handles each user query by embedding it, running a similarity search against Pinecone, and synthesizing an answer from the relevant text chunks with GPT-3.5.
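The split, embed, search, and synthesize steps described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's code: the word-count "embedding" and the in-memory list stand in for the OpenAI embedding model and the Pinecone index, and the names split_text, embed, top_k, and build_prompt are hypothetical helpers rather than LangChain or Pinecone APIs.

```python
import math
import re
from collections import Counter


def split_text(text, chunk_size=40, overlap=10):
    """Split text into overlapping word windows (stand-in for a text splitter)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text):
    """Toy embedding: lowercase word counts. A real pipeline calls an embedding model."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_k(query, index, k=2):
    """Return the k chunks most similar to the query (stand-in for a vector DB search)."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, chunks):
    """Assemble the text that would be sent to the language model for synthesis."""
    context = "\n\n".join(chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```

In use, the index would be built once from the transcripts, for example index = [(c, embed(c)) for c in split_text(transcript)], and each query would call top_k followed by build_prompt. The final step of sending the prompt to GPT-3.5 is left out because it requires an API key.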

Quick Start & Requirements

  • Install: pip install -r requirements.txt
  • Prerequisites: Python 3.x, OpenAI API key, Pinecone API key and environment.
  • Setup: Requires downloading transcriptions and setting up Pinecone.
  • Demo: https://lex-gpt.fly.dev/
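Because the prerequisites include two API keys and a Pinecone environment, a small pre-flight check can catch missing configuration before any scripts run. The sketch below is illustrative only: the environment variable names are assumptions, not taken from the repository, so check the project's own setup instructions for the names it actually reads.

```python
import os

# Hypothetical variable names; the repository may use different ones.
REQUIRED_SETTINGS = ("OPENAI_API_KEY", "PINECONE_API_KEY", "PINECONE_ENVIRONMENT")


def missing_settings(env=None):
    """Return the required settings that are unset or empty in the given mapping."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SETTINGS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        print("Missing settings: " + ", ".join(missing))
    else:
        print("All required settings are present.")
```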

Highlighted Details

  • Uses LangChain for an end-to-end RAG pipeline.
  • Utilizes Pinecone for efficient vector similarity search.
  • Builds upon an existing UI from mckaywrigley/wait-but-why-gpt.
  • Includes scripts for data scraping and processing.

Maintenance & Community

The project is maintained by rlancemartin. Contact is available via Twitter.

Licensing & Compatibility

The README does not specify a license. Compatibility for commercial use or closed-source linking is not detailed.

Limitations & Caveats

The project is presented as a testbed for LangChain functionality, so it may change or break without notice. Streaming currently requires deployment on fly.io because of limitations in Vercel's edge functions; work to resolve this is noted as ongoing.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: 1 day
  • Pull requests (30d): 1
  • Issues (30d): 0
  • Star history: 1 star in the last 90 days
