RWKV-Runner  by josStorer

RWKV tool automating LLM use, providing an OpenAI API-compatible interface

created 2 years ago
5,956 stars

Top 8.8% on sourcepulse

GitHubView on GitHub
Project Summary

RWKV Runner simplifies the use of RWKV large language models by providing a single, lightweight executable for management and deployment. It offers an OpenAI-compatible API, allowing any ChatGPT client to interact with RWKV models, and includes a user-friendly interface for chat, completion, and composition tasks.

How It Works

The project utilizes a front-end and back-end separation architecture. The back-end handles model inference, supporting custom CUDA kernels for acceleration and offering multi-level VRAM configurations for broad hardware compatibility. The front-end provides a WebUI and chat interface. The OpenAI-compatible API layer abstracts model interaction, enabling seamless integration with existing ChatGPT clients and tools.

Quick Start & Requirements

  • Install: Download pre-built executables for Windows, macOS, and Linux from the releases page.
  • Prerequisites: Custom CUDA kernel acceleration is enabled by default; disable in configs if compatibility issues arise. WebGPU strategy allows AMD/Intel GPU usage. MIDI hardware input requires specific setup for Bluetooth connections on Windows and macOS.
  • Setup: The project aims for minimal setup with a single executable.
  • Docs: FAQs, API Docs

Highlighted Details

  • OpenAI API compatibility allows use with any ChatGPT client.
  • Supports model management, download, and one-click LoRA finetuning (Windows only).
  • Includes a built-in WebUI for sharing hardware resources.
  • Offers MIDI hardware input and track editing capabilities.

Maintenance & Community

  • Active development with regular releases.
  • Links to related RWKV projects are provided.

Licensing & Compatibility

  • MIT License.
  • Permissive for commercial use and closed-source linking.

Limitations & Caveats

The default custom CUDA kernel may cause compatibility issues with some GPU drivers. Windows Defender may flag the executable as a virus, requiring manual exclusion. The max_tokens default is set high (102400), potentially leading to significant resource consumption without proper API gateway limits.

Health Check
Last commit

3 weeks ago

Responsiveness

1 day

Pull Requests (30d)
0
Issues (30d)
2
Star History
188 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen Chip Huyen(Author of AI Engineering, Designing Machine Learning Systems), Andre Zayarni Andre Zayarni(Cofounder of Qdrant), and
2 more.

RealChar by Shaunwei

0.1%
6k
Real-time AI character/companion creation and interaction codebase
created 2 years ago
updated 1 year ago
Feedback? Help us improve.