text-generation-webui by oobabooga

Web UI for LLM text generation

created 2 years ago
44,511 stars

Top 0.6% on sourcepulse

Project Summary

This project provides a comprehensive Gradio-based web UI for interacting with large language models, aiming to be the "AUTOMATIC1111 of text generation." It targets users who want a flexible, feature-rich interface for experimenting with and deploying LLMs, offering broad backend support and extensive customization.

How It Works

The UI supports multiple LLM backends including Transformers, llama.cpp, ExLlamaV3, and ExLlamaV2, allowing users to switch models and loaders seamlessly. It features automatic prompt formatting, three chat modes (instruct, chat-instruct, chat), and a "Past chats" menu for conversation management. Advanced users can leverage fine-grained control over sampling parameters and utilize an OpenAI-compatible API.
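Because the server speaks the OpenAI wire format, any OpenAI-style client can talk to it. Below is a minimal sketch using only the standard library; the default address (http://127.0.0.1:5000) and the `--api` launch flag are assumptions about a typical setup, and the model field is omitted because the server answers with whichever model is currently loaded:

```python
import json
import urllib.request

API_BASE = "http://127.0.0.1:5000/v1"  # assumption: default address when launched with --api

def build_chat_request(user_message: str) -> tuple[str, dict]:
    """Build the URL and JSON payload for an OpenAI-style chat-completions call."""
    payload = {
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": 200,
        "temperature": 0.7,
    }
    return f"{API_BASE}/chat/completions", payload

def send(url: str, payload: dict) -> dict:
    """POST the payload as JSON and decode the JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

url, payload = build_chat_request("Explain LoRA in one sentence.")
# reply = send(url, payload)  # requires a running server started with --api
# print(reply["choices"][0]["message"]["content"])
```

Automatic prompt formatting means the server applies the loaded model's chat template to these messages itself, so the client does not need to know the template.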

Quick Start & Requirements

  • Installation: Portable builds (unzip and run) or a one-click installer script (start_*.sh/.bat). Manual installation via Conda is also supported.
  • Prerequisites: Python 3.11+ is recommended. NVIDIA GPUs need the CUDA 12.4 PyTorch builds; AMD GPUs need ROCm 6.1. CPU-only operation is also supported.
  • Resources: Setup involves creating a Conda environment. GPU memory requirements depend on the models loaded.
  • Docs: Wiki

Highlighted Details

  • Supports Transformers, llama.cpp, ExLlamaV2, ExLlamaV3, and TensorRT-LLM backends.
  • Includes an OpenAI-compatible API with chat and completions endpoints.
  • Offers LoRA fine-tuning and extensive extension support.
  • Features automatic prompt formatting and multiple chat modes.
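The two API surfaces mentioned above take different payloads. A hedged sketch of the legacy text-completions request, again assuming the default API address: unlike the chat endpoint, it takes a raw prompt string and applies no chat template, which suits base (non-instruct) models:

```python
API_BASE = "http://127.0.0.1:5000/v1"  # assumption: default address when launched with --api

def build_completion_request(prompt: str) -> tuple[str, dict]:
    """Build URL + payload for the OpenAI-style /v1/completions endpoint.

    The raw prompt is continued verbatim; no chat template is applied,
    so the caller controls the exact text the model sees.
    """
    payload = {
        "prompt": prompt,
        "max_tokens": 64,
        "temperature": 0.7,
    }
    return f"{API_BASE}/completions", payload

url, payload = build_completion_request("The capital of France is")
```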

Maintenance & Community

The project is actively maintained. Community support is available via Reddit (r/Oobabooga). Andreessen Horowitz provided a grant in August 2023.

Licensing & Compatibility

The project is licensed under the Apache 2.0 license, permitting commercial use and linking with closed-source projects.

Limitations & Caveats

While the one-click installer simplifies setup, manual installation or troubleshooting specific backend requirements (e.g., CUDA versions, ROCm) may be necessary for optimal performance or compatibility. Some advanced features like TensorRT-LLM require separate Docker setups.

Health Check

  • Last commit: 1 day ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 22
  • Issues (30d): 32
Star History
1,309 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Didier Lopes (Founder of OpenBB), and 10 more.

JARVIS by microsoft

Top 0.1% · 24k stars
System for LLM-orchestrated AI task automation
created 2 years ago · updated 4 days ago
Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Pietro Schirano (Founder of MagicPath), and 1 more.

SillyTavern by SillyTavern

Top 3.2% · 17k stars
LLM frontend for power users
created 2 years ago · updated 3 days ago