text-generation-webui by oobabooga

Web UI for LLM text generation

created 2 years ago
44,511 stars

Top 0.6% on sourcepulse

Project Summary

This project provides a comprehensive Gradio-based web UI for interacting with large language models, aiming to be the "AUTOMATIC1111 of text generation." It targets users who want a flexible, feature-rich interface for experimenting with and deploying LLMs, offering broad backend support and extensive customization.

How It Works

The UI supports multiple LLM backends including Transformers, llama.cpp, ExLlamaV3, and ExLlamaV2, allowing users to switch models and loaders seamlessly. It features automatic prompt formatting, three chat modes (instruct, chat-instruct, chat), and a "Past chats" menu for conversation management. Advanced users can leverage fine-grained control over sampling parameters and utilize an OpenAI-compatible API.
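Because the server speaks the OpenAI wire format, any OpenAI-style client can talk to it. Below is a minimal sketch using only the standard library; the default address (http://127.0.0.1:5000) and the `--api` launch flag are assumptions about a typical setup, and the model field is omitted because the server answers with whichever model is currently loaded:

```python
import json
import urllib.request

API_BASE = "http://127.0.0.1:5000/v1"  # assumption: default address when launched with --api

def build_chat_request(user_message: str) -> tuple[str, dict]:
    """Build the URL and JSON payload for an OpenAI-style chat-completions call."""
    payload = {
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": 200,
        "temperature": 0.7,
    }
    return f"{API_BASE}/chat/completions", payload

def send(url: str, payload: dict) -> dict:
    """POST the payload as JSON and decode the JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

url, payload = build_chat_request("Explain LoRA in one sentence.")
# reply = send(url, payload)  # requires a running server started with --api
# print(reply["choices"][0]["message"]["content"])
```

Automatic prompt formatting means the server applies the loaded model's chat template to these messages itself, so the client does not need to know the template.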

Quick Start & Requirements

  • Installation: Portable builds (unzip and run) or a one-click installer script (start_*.sh/.bat). Manual installation via Conda is also supported.
  • Prerequisites: Python 3.11+ is recommended. NVIDIA GPUs need the CUDA 12.4 PyTorch builds; AMD GPUs need ROCm 6.1. CPU-only operation is also supported.
  • Resources: Setup involves creating a Conda environment. GPU memory requirements depend on the models loaded.
  • Docs: Wiki

Highlighted Details

  • Supports Transformers, llama.cpp, ExLlamaV2, ExLlamaV3, and TensorRT-LLM backends.
  • Includes an OpenAI-compatible API with chat and completions endpoints.
  • Offers LoRA fine-tuning and extensive extension support.
  • Features automatic prompt formatting and multiple chat modes.
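The two API surfaces mentioned above take different payloads. A hedged sketch of the legacy text-completions request, again assuming the default API address: unlike the chat endpoint, it takes a raw prompt string and applies no chat template, which suits base (non-instruct) models:

```python
API_BASE = "http://127.0.0.1:5000/v1"  # assumption: default address when launched with --api

def build_completion_request(prompt: str) -> tuple[str, dict]:
    """Build URL + payload for the OpenAI-style /v1/completions endpoint.

    The raw prompt is continued verbatim; no chat template is applied,
    so the caller controls the exact text the model sees.
    """
    payload = {
        "prompt": prompt,
        "max_tokens": 64,
        "temperature": 0.7,
    }
    return f"{API_BASE}/completions", payload

url, payload = build_completion_request("The capital of France is")
```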

Maintenance & Community

The project is actively maintained. Community support is available via Reddit (r/Oobabooga). Andreessen Horowitz provided a grant in August 2023.

Licensing & Compatibility

The project is licensed under the Apache 2.0 license, permitting commercial use and linking with closed-source projects.

Limitations & Caveats

While the one-click installer simplifies setup, manual installation or troubleshooting specific backend requirements (e.g., CUDA versions, ROCm) may be necessary for optimal performance or compatibility. Some advanced features like TensorRT-LLM require separate Docker setups.

Health Check

  • Last commit: 1 day ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 22
  • Issues (30d): 32
Star History
1,309 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Didier Lopes (Founder of OpenBB), and 10 more.

JARVIS by microsoft

Top 0.1% · 24k stars
System for LLM-orchestrated AI task automation
created 2 years ago · updated 4 days ago
Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Pietro Schirano (Founder of MagicPath), and 1 more.

SillyTavern by SillyTavern

Top 3.2% · 17k stars
LLM frontend for power users
created 2 years ago · updated 3 days ago