LLM-As-Chatbot by deep-diver

Chatbot service for open-source instruction-following LLMs

Created 2 years ago

3,333 stars

Top 14.4% on SourcePulse

7 Experts Love This Project

Gabriel Almeida

Cofounder of Langflow

Jeff Hammerbacher

Cofounder of Cloudera

Lysandre Debut

Chief Open-Source Officer at Hugging Face

Waseem AlShikh

Cofounder of Writer

and 3 more!

Project Summary

This repository provides a service to deploy and interact with various open-source instruction-following Large Language Models (LLMs) as chatbots. It targets users who want to easily experiment with different LLMs, offering both a Gradio-based web UI and a Discord bot interface, with integrated internet search capabilities.

How It Works

The project uses "Ping Pong," a model-agnostic library for conversation and context management. The library abstracts away the prompt-formatting differences between LLMs, so switching models does not require rewriting prompts by hand. The GradioChat UI offers an interface similar to HuggingChat, and the Discord bot exposes the same models through Discord. Internet search is enabled by supplying a Serper API key, which lets the chatbot incorporate Google search results into its responses.
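To make the idea of model-agnostic prompt formatting concrete, here is a minimal sketch. It is not the Ping Pong API: the `Turn`, `PromptStyle`, and `Conversation` classes and both template constants are hypothetical names invented for illustration, and the templates only loosely resemble formats used by real models.

```python
from dataclasses import dataclass, field


@dataclass
class Turn:
    """One user message and the model's reply (empty while pending)."""
    user: str
    bot: str = ""


@dataclass
class PromptStyle:
    """Hypothetical per-model template; not part of Ping Pong."""
    user_prefix: str
    bot_prefix: str
    separator: str = "\n"


@dataclass
class Conversation:
    """Model-independent chat history."""
    system: str = ""
    turns: list = field(default_factory=list)

    def add(self, user, bot=""):
        self.turns.append(Turn(user, bot))

    def build_prompt(self, style):
        """Render the same history with a model-specific style."""
        parts = [self.system] if self.system else []
        for turn in self.turns:
            parts.append(f"{style.user_prefix}{turn.user}")
            parts.append(f"{style.bot_prefix}{turn.bot}")
        return style.separator.join(parts)


# Illustrative templates only; real models define their own formats.
ALPACA_LIKE = PromptStyle("### Instruction:\n", "### Response:\n", "\n\n")
CHAT_LIKE = PromptStyle("USER: ", "ASSISTANT: ")
```

The history is stored once and rendered per model, so swapping `ALPACA_LIKE` for `CHAT_LIKE` changes only the final prompt string, not the stored conversation.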

Quick Start & Requirements

Gradio App:
- Install: pip install -r requirements.txt
- Run: python app.py --serper-api-key "YOUR SERPER API KEY"
- Prerequisites: Python >= 3.9, Gradio >= 3.32.0, Serper API Key.
- Docs: llmchat framework
Discord Bot:
- Install: pip install -r requirements.txt
- Run: python discord_app.py --token "DISCORD BOT TOKEN" --model-name "MODEL_NAME" --mode-[cpu|mps|8bit|4bit|full-gpu] --serper-api-key "YOUR SERPER API KEY"
- Prerequisites: Python >= 3.9, Discord Bot Token, Serper API Key.
- Docs: How to Create a Discord Bot Account
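The `--mode-[cpu|mps|8bit|4bit|full-gpu]` notation in the Discord bot command appears to stand for one flag per mode (for example `--mode-8bit`). The sketch below shows one way such flags could be parsed with `argparse`. It is an illustration rather than the project's actual parser: the assumptions that exactly one mode is required and that the Serper key is optional are not confirmed by the summary above.

```python
import argparse

# Mode names taken from the command line shown above.
MODES = ["cpu", "mps", "8bit", "4bit", "full-gpu"]


def build_parser():
    """Build an illustrative parser; not the project's real one."""
    parser = argparse.ArgumentParser(
        description="Illustrative launcher flags", allow_abbrev=False
    )
    parser.add_argument("--token", required=True)
    parser.add_argument("--model-name", required=True)
    # Assumption: the search key is optional in this sketch.
    parser.add_argument("--serper-api-key", default=None)

    # Assumption: exactly one --mode-* flag must be given.
    group = parser.add_mutually_exclusive_group(required=True)
    for mode in MODES:
        group.add_argument(
            f"--mode-{mode}", dest="mode", action="store_const", const=mode
        )
    return parser
```

With this parser, `--mode-8bit` sets `args.mode` to `"8bit"`, and passing two mode flags at once is rejected.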

Highlighted Details

Supports a wide range of LLMs, including Alpaca-LoRA variants, StableLM, KoAlpaca, FLAN-Alpaca, Vicuna, MPT, Falcon, and Starcoder models.
Offers internet search integration via Serper API for real-time information retrieval.
Provides options for various inference modes (CPU, MPS, 8-bit, 4-bit, full GPU).
Can be deployed using dstack for cloud environments (AWS, GCP, Azure, Lambda Cloud).
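As a rough illustration of the search integration listed above, the sketch below queries Serper and folds the top results into a prompt. The endpoint URL, the `X-API-KEY` header, and the `organic` response field reflect Serper's public API as best understood here and should be checked against its documentation. The function names and the prompt wording are hypothetical and are not taken from the project.

```python
import json
import urllib.request

# Assumed endpoint; verify against Serper's documentation.
SERPER_URL = "https://google.serper.dev/search"


def search(query, api_key, timeout=10):
    """POST a query to Serper and return the parsed JSON response."""
    request = urllib.request.Request(
        SERPER_URL,
        data=json.dumps({"q": query}).encode("utf-8"),
        headers={"X-API-KEY": api_key, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def results_to_context(payload, limit=3):
    """Turn the assumed 'organic' result list into numbered context lines."""
    lines = []
    for i, item in enumerate(payload.get("organic", [])[:limit], start=1):
        title = item.get("title", "")
        snippet = item.get("snippet", "")
        link = item.get("link", "")
        lines.append(f"[{i}] {title}: {snippet} ({link})")
    return "\n".join(lines)


def augment_prompt(question, payload):
    """Prepend search context to the question, if any results exist."""
    context = results_to_context(payload)
    if not context:
        return question
    return f"Use these search results to answer.\n{context}\n\nQuestion: {question}"
```

A caller would pass the output of `search(question, api_key)` as `payload`. Keeping the network call separate from the formatting step makes the prompt-building logic easy to test offline.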

Maintenance & Community

The project acknowledges generous GPU resources from Jarvislabs.ai and AI Network for development and fine-tuning.
The README links to official documentation for the Gradio app and for dstack deployment.

Licensing & Compatibility

The README does not explicitly state a license. Model weights are hosted on Hugging Face, subject to their respective licenses.

Limitations & Caveats

The project is primarily focused on instruction-following models and may not be suitable for all LLM use cases.
The "full-gpu" mode description contains a potential typo ("full means half").
Some of the listed model repositories are marked to be made private in the future.

Health Check

Last Commit

2 years ago

Responsiveness

Inactive

Star History

0 stars in the last 30 days