Chat-with-Github-Repo by peterw

CLI tool for chatbot creation using Streamlit, OpenAI, and Deep Lake

created 2 years ago
1,154 stars

Top 34.2% on sourcepulse

Project Summary

This project provides Python scripts to build a Git repository-aware chatbot using Streamlit, OpenAI GPT-3.5-turbo, and Activeloop's Deep Lake. It's designed for developers and researchers who want to quickly query and understand the content of any Git repository through a conversational interface.

How It Works

The solution comprises two core Python scripts. process.py clones a specified Git repository, extracts text content from specified file types, generates embeddings using OpenAIEmbeddings, and stores these embeddings in an Activeloop Deep Lake dataset. chat.py then builds a Streamlit web application that queries this Deep Lake dataset based on user input and leverages OpenAI GPT-3.5-turbo to generate contextually relevant answers. This approach allows for efficient semantic search over repository content.
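
The index-then-query flow described above can be illustrated with a minimal, standard-library-only sketch. This is not the project's code: `toy_embed` is a hypothetical bag-of-words stand-in for OpenAIEmbeddings, and a plain Python list stands in for the Deep Lake dataset. Only the overall shape is meant to carry over: embed each chunk once, then rank stored chunks by cosine similarity to the embedded question.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks):
    """Index step: embed every chunk once and keep (vector, text) pairs."""
    return [(toy_embed(chunk), chunk) for chunk in chunks]


def search(index, question, k=2):
    """Query step: return the k stored chunks most similar to the question."""
    q = toy_embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[0]), reverse=True)
    return [text for _, text in ranked[:k]]


if __name__ == "__main__":
    # Made-up chunks standing in for text extracted from a repository.
    index = build_index([
        "def clone_repo(url): clone the git repository to a local path",
        "def run_chat(): start the streamlit chat interface",
        "README: set the api key before running",
    ])
    print(search(index, "how is the repository cloned?", k=1))
```

In a setup like the one described, the top-ranked chunks would then be passed to the chat model as context for the answer; that step is omitted here because it requires an API key.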

Quick Start & Requirements

  • Install dependencies: pip install -r requirements.txt
  • Set environment variables: OPENAI_API_KEY, ACTIVELOOP_TOKEN, ACTIVELOOP_USERNAME (copy from .env.example).
  • Process a repo: python src/main.py process --repo-url <github_repo_url>
  • Run chat: python src/main.py chat --activeloop-dataset-name <dataset_name>
  • Requires OpenAI and Activeloop accounts and API keys.
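
Before running either command, it can help to confirm that the three variables listed above are actually set. The helper below is a small sketch, not part of the project: `missing_env_vars` is a hypothetical name, and it only inspects the process environment (it does not load a .env file).

```python
import os

# Variable names taken from the Quick Start list above.
REQUIRED_VARS = ("OPENAI_API_KEY", "ACTIVELOOP_TOKEN", "ACTIVELOOP_USERNAME")


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```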

Highlighted Details

  • Uses Activeloop Deep Lake for efficient vector storage and retrieval.
  • Leverages OpenAIEmbeddings for generating semantic representations.
  • Provides a Streamlit interface for user interaction.
  • Supports custom file extensions for repository processing.
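
As an illustration of the file-extension filtering mentioned in the last item, the sketch below walks a cloned repository and yields only files with the requested extensions. It is an assumption about how such filtering could look, not the project's implementation; `iter_source_files` is a hypothetical name, and skipping the `.git` directory is a choice made for this example.

```python
import os


def iter_source_files(root, extensions):
    """Yield paths under root whose extension is in `extensions`, e.g. [".py", ".md"]."""
    wanted = {ext.lower() for ext in extensions}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune Git metadata in place so os.walk does not descend into it.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in sorted(filenames):
            # Compare extensions case-insensitively (".MD" matches ".md").
            if os.path.splitext(name)[1].lower() in wanted:
                yield os.path.join(dirpath, name)
```

For example, `list(iter_source_files("path/to/clone", [".py", ".md"]))` would collect the Python and Markdown files of a local clone.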

Maintenance & Community

No specific information on contributors, sponsorships (beyond a general mention of "Exploding Insights"), or community channels is provided in the README.

Licensing & Compatibility

  • License: MIT License.
  • Compatible with commercial use and closed-source linking due to the permissive MIT license.

Limitations & Caveats

The project depends on API keys for two external services, OpenAI and Activeloop, and usage of those services may incur costs. Answer quality depends on the embeddings generated by OpenAI and on how well the repository's text content lends itself to retrieval.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star history: 6 stars in the last 90 days
