ChatRWKV by BlinkDL

Open-source chatbot powered by the RWKV RNN language model

created 2 years ago
9,507 stars

Top 5.4% on sourcepulse

Project Summary

ChatRWKV provides a ChatGPT-like conversational AI experience powered by the RWKV (Receptance Weighted Key Value) language model. It targets developers and researchers seeking an open-source alternative to Transformer-based models, offering comparable quality and scalability with improved speed and reduced VRAM usage due to its RNN architecture.

How It Works

ChatRWKV leverages the RWKV architecture, an RNN that achieves Transformer-level quality and scalability. Unlike traditional RNNs, RWKV replaces attention with time-mixing (and channel-mixing) blocks, which lets it capture long-range dependencies while remaining parallelizable at training time. At inference, the design is inherently stateful: each new token updates a fixed-size recurrent state, so generation cost and memory do not grow with context length, in contrast to a Transformer, whose attention (KV) cache grows with the sequence.
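Below is a minimal sketch of stateful inference with the rwkv pip package, following its documented forward(tokens, state) interface; the checkpoint path and token IDs are placeholders:

```python
import os

# These switches must be set before importing rwkv.model.
os.environ['RWKV_JIT_ON'] = '1'   # TorchScript JIT
os.environ['RWKV_CUDA_ON'] = '0'  # '1' to use the compiled CUDA kernel

from rwkv.model import RWKV

# Placeholder path to converted RWKV weights (no .pth suffix).
model = RWKV(model='/path/to/RWKV-model', strategy='cuda fp16')

# First call: feed the prompt tokens; returns logits and the recurrent state.
out, state = model.forward([187, 510, 1563, 310, 247], None)

# Later calls reuse the state, so only the new tokens are processed;
# nothing is recomputed over the full history.
out, state = model.forward([905, 310, 247], state)
print(out.shape)  # logits over the vocabulary for the last token
```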

Quick Start & Requirements

  • Install: pip install rwkv
  • Prerequisites: Python and PyTorch. For CUDA acceleration, the CUDA Toolkit and ninja are required; building the custom CUDA kernel involves setting environment variables (RWKV_CUDA_ON=1, plus PATH and LD_LIBRARY_PATH pointing at the CUDA install) and, on Windows, may require reinstalling CUDA with the VC++ extensions enabled (see the sketch after this list).
  • Resources: VRAM requirements scale with model size and strategy; as little as 3 GB of VRAM is cited for the 14B-parameter model when INT8 quantization and layer streaming are used.
  • Demos & Docs: API_DEMO_CHAT.py, RWKV Homepage, RWKV-LM Repo.
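The environment setup described above can be sketched as follows; the RWKV_* variables are the package's documented switches, while the CUDA paths in the comments are illustrative for a typical Linux install:

```python
import os

# Must be set before importing rwkv.model.
os.environ['RWKV_JIT_ON'] = '1'   # TorchScript JIT; cheap speedup
os.environ['RWKV_CUDA_ON'] = '1'  # build/use the custom CUDA kernel (needs nvcc + ninja)

# With RWKV_CUDA_ON=1, PATH and LD_LIBRARY_PATH must point at the CUDA
# Toolkit before Python starts (e.g. /usr/local/cuda/bin and
# /usr/local/cuda/lib64 on Linux) so the kernel compiles on first import.
from rwkv.model import RWKV

model = RWKV(model='/path/to/RWKV-model', strategy='cuda fp16')  # placeholder path
```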

Highlighted Details

  • RWKV-6 architecture (arXiv:2404.05892) offers state-of-the-art performance.
  • Supports various inference strategies (e.g., cuda fp16, INT8 quantization, CPU offload) to trade speed against VRAM usage; see the strategy sketch after this list.
  • Community-driven projects offer Vulkan, cuBLAS, CLBlast, and ggml-based inference for broad hardware support.
  • Fine-tuning methods like LoRA and QLoRA are supported via community projects.
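The strategy string passed to RWKV() controls layer placement and quantization. The examples below follow patterns documented in the repo; the checkpoint path is a placeholder:

```python
import os
os.environ['RWKV_JIT_ON'] = '1'  # set before importing rwkv.model

from rwkv.model import RWKV

PATH = '/path/to/RWKV-model'  # placeholder checkpoint path

# Each entry is an alternative 'strategy' argument for RWKV(); pick one.
STRATEGIES = {
    'gpu_fp16':   'cuda fp16',                       # all layers on GPU; fastest, most VRAM
    'gpu_int8':   'cuda fp16i8',                     # INT8 on GPU; roughly half the VRAM
    'hybrid':     'cuda fp16i8 *10 -> cpu fp32',     # first 10 layers on GPU, rest fp32 on CPU
    'stream_min': 'cuda fp16i8 *0+ -> cpu fp32 *1',  # stream layers through the GPU one at a
                                                     # time; the kind of setup behind the
                                                     # 14B-in-3GB figure
}

model = RWKV(model=PATH, strategy=STRATEGIES['gpu_int8'])
```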

Maintenance & Community

  • Active community with 7k+ members on Discord (https://discord.gg/bDSBUMeFpc).
  • Project lead: BlinkDL.
  • Training sponsored by Stability AI and EleutherAI.

Licensing & Compatibility

  • The primary license is not explicitly stated in the README, but the underlying RWKV model weights are generally available under permissive licenses (e.g., Apache 2.0 for some versions). Compatibility for commercial use should be verified with specific model weights.

Limitations & Caveats

  • Building CUDA kernels requires specific build tools and environment configurations, potentially leading to setup complexity.
  • While the project aims for ChatGPT-like quality, output quality varies significantly with model size, quantization, and inference strategy.
  • The README mentions "Raven"-series models are "almost ChatGPT-like," implying potential differences in capability.
Health Check

  • Last commit: 2 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 43 stars in the last 90 days

Explore Similar Projects

Starred by George Hotz (author of tinygrad; founder of the tiny corp, comma.ai), Daniel Gross (cofounder of Safe Superintelligence), and 13 more.

RWKV-LM by BlinkDL

  • Top 0.2% · 14k stars
  • RNN for LLM, transformer-level performance, parallelizable training
  • Created 4 years ago; updated 1 week ago
  • Starred by Patrick von Platen (core contributor to Hugging Face Transformers and Diffusers), Michael Han (cofounder of Unsloth), and 1 more.

ktransformers by kvcache-ai

  • Top 0.4% · 15k stars
  • Framework for LLM inference optimization experimentation
  • Created 1 year ago; updated 2 days ago