lex-gpt by rlancemartin

AI-powered search for the Lex Fridman podcast

Created 2 years ago

340 stars

Top 81.2% on SourcePulse

4 Experts Love This Project

Jeff Hammerbacher

Cofounder of Cloudera

Sualeh Asif

Cofounder of Cursor

Chris Van Pelt

Cofounder of Weights & Biases

Jeremy Howard

Cofounder of fast.ai

Project Summary

This application provides AI-powered search over the Lex Fridman podcast, aimed at users who want to use large language models for content discovery. It also serves as a practical demonstration of LangChain's capabilities for data ingestion, embedding, and question answering.

How It Works

The project scrapes Lex Fridman podcast episodes and uses Whisper transcriptions for episodes 1-365. The transcripts are split into chunks and embedded using LangChain, with Pinecone serving as the vector database. A LangChain VectorDBQAChain handles each user query in three steps: it embeds the query, runs a similarity search against Pinecone, and synthesizes an answer from the most relevant text chunks using GPT-3.5.
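The split, embed, search, and answer steps can be illustrated with a minimal, dependency-free sketch. This is not the project's code: the word-count "embedding" stands in for a real embedding model, the in-memory list stands in for Pinecone, and every function name here (`split_text`, `embed`, `top_k`, `build_prompt`) is hypothetical. The final prompt is what a chain of this kind would hand to the language model.

```python
import math
import re
from collections import Counter


def split_text(text, chunk_size=40, overlap=10):
    """Split text into overlapping word windows (stand-in for a text splitter)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    if not words:
        return []
    step = chunk_size - overlap
    # Stop once the remaining words are already covered by the previous window.
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + chunk_size]) for i in range(0, stop, step)]


def embed(text):
    """Toy embedding: a bag-of-words count vector (assumption, not a real model)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, chunks, k=3):
    """Return the k chunks most similar to the query (stand-in for a vector DB search)."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, context_chunks):
    """Assemble the prompt that would be sent to the language model."""
    context = "\n---\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


if __name__ == "__main__":
    transcript_chunks = [
        "Lex talks with a guest about neural networks and deep learning",
        "A long conversation about chess openings and endgames",
        "Thoughts on rockets and Mars colonization",
    ]
    question = "what did they say about chess"
    print(build_prompt(question, top_k(question, transcript_chunks, k=1)))
```

In a real deployment the embedding and search steps would be replaced by calls to an embedding model and a vector database; the overall shape of the flow stays the same.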

Quick Start & Requirements

Install: pip install -r requirements.txt
Prerequisites: Python 3.x, OpenAI API key, Pinecone API key and environment.
Setup: Requires downloading transcriptions and setting up Pinecone.
Demo: https://lex-gpt.fly.dev/
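
Because the prerequisites include two API keys and a Pinecone environment, a small pre-flight check can catch missing settings before any script runs. The sketch below assumes the settings are supplied as environment variables; the variable names are assumptions for illustration, and the repository may use different names or a different configuration mechanism.

```python
import os

# Assumed variable names; check the repository for the names it actually reads.
REQUIRED_VARS = ("OPENAI_API_KEY", "PINECONE_API_KEY", "PINECONE_ENVIRONMENT")


def missing_settings(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        raise SystemExit("Missing settings: " + ", ".join(missing))
    print("All required settings are present.")
```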

Highlighted Details

Uses LangChain for the end-to-end retrieval-augmented generation (RAG) pipeline.
Utilizes Pinecone for efficient vector similarity search.
Builds upon an existing UI from mckaywrigley/wait-but-why-gpt.
Includes scripts for data scraping and processing.

Maintenance & Community

The project is maintained by rlancemartin. Contact is available via Twitter.

Licensing & Compatibility

The README does not specify a license. Compatibility for commercial use or closed-source linking is not detailed.

Limitations & Caveats

The project is presented as a testbed for LangChain functionality, so it may change and may be unstable. Streaming currently requires deployment on fly.io because of limitations in Vercel's edge functions; the README notes ongoing work to resolve this.

Health Check

Last Commit

2 years ago

Responsiveness

Inactive

Pull Requests (30d)

Issues (30d)

Star History

1 star in the last 30 days