ChatRWKV by BlinkDL

Open-source chatbot powered by the RWKV RNN language model

created 2 years ago
9,507 stars

Top 5.4% on sourcepulse

Project Summary

ChatRWKV provides a ChatGPT-like conversational AI experience powered by the RWKV (Receptance Weighted Key Value) language model. It targets developers and researchers seeking an open-source alternative to Transformer-based models, offering comparable quality and scalability with improved speed and reduced VRAM usage due to its RNN architecture.

How It Works

ChatRWKV leverages the RWKV architecture, an RNN that achieves Transformer-level quality and scalability. Unlike traditional RNNs, RWKV replaces attention with time-mixing (and channel-mixing) blocks, which lets it capture long-range dependencies while remaining parallelizable at training time. At inference, the design is inherently stateful: each new token updates a fixed-size recurrent state, so generation cost and memory do not grow with context length, in contrast to a Transformer, whose attention (KV) cache grows with the sequence.
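Below is a minimal sketch of stateful inference with the rwkv pip package, following its documented forward(tokens, state) interface; the checkpoint path and token IDs are placeholders:

```python
import os

# These switches must be set before importing rwkv.model.
os.environ['RWKV_JIT_ON'] = '1'   # TorchScript JIT
os.environ['RWKV_CUDA_ON'] = '0'  # '1' to use the compiled CUDA kernel

from rwkv.model import RWKV

# Placeholder path to converted RWKV weights (no .pth suffix).
model = RWKV(model='/path/to/RWKV-model', strategy='cuda fp16')

# First call: feed the prompt tokens; returns logits and the recurrent state.
out, state = model.forward([187, 510, 1563, 310, 247], None)

# Later calls reuse the state, so only the new tokens are processed;
# nothing is recomputed over the full history.
out, state = model.forward([905, 310, 247], state)
print(out.shape)  # logits over the vocabulary for the last token
```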

Quick Start & Requirements

  • Install: pip install rwkv
  • Prerequisites: Python and PyTorch. For CUDA acceleration, the CUDA Toolkit and ninja are required; building the custom CUDA kernel involves setting environment variables (RWKV_CUDA_ON=1, plus PATH and LD_LIBRARY_PATH pointing at the CUDA install) and, on Windows, may require reinstalling CUDA with the VC++ extensions enabled (see the sketch after this list).
  • Resources: VRAM requirements scale with model size and strategy; as little as 3 GB of VRAM is cited for the 14B-parameter model when INT8 quantization and layer streaming are used.
  • Demos & Docs: API_DEMO_CHAT.py, RWKV Homepage, RWKV-LM Repo.
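The environment setup described above can be sketched as follows; the RWKV_* variables are the package's documented switches, while the CUDA paths in the comments are illustrative for a typical Linux install:

```python
import os

# Must be set before importing rwkv.model.
os.environ['RWKV_JIT_ON'] = '1'   # TorchScript JIT; cheap speedup
os.environ['RWKV_CUDA_ON'] = '1'  # build/use the custom CUDA kernel (needs nvcc + ninja)

# With RWKV_CUDA_ON=1, PATH and LD_LIBRARY_PATH must point at the CUDA
# Toolkit before Python starts (e.g. /usr/local/cuda/bin and
# /usr/local/cuda/lib64 on Linux) so the kernel compiles on first import.
from rwkv.model import RWKV

model = RWKV(model='/path/to/RWKV-model', strategy='cuda fp16')  # placeholder path
```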

Highlighted Details

  • RWKV-6 architecture (arXiv:2404.05892) offers state-of-the-art performance.
  • Supports various inference strategies (e.g., cuda fp16, INT8 quantization, CPU offload) to trade speed against VRAM usage; see the strategy sketch after this list.
  • Community-driven projects offer Vulkan, cuBLAS, CLBlast, and ggml-based inference for broad hardware support.
  • Fine-tuning methods like LoRA and QLoRA are supported via community projects.
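The strategy string passed to RWKV() controls layer placement and quantization. The examples below follow patterns documented in the repo; the checkpoint path is a placeholder:

```python
import os
os.environ['RWKV_JIT_ON'] = '1'  # set before importing rwkv.model

from rwkv.model import RWKV

PATH = '/path/to/RWKV-model'  # placeholder checkpoint path

# Each entry is an alternative 'strategy' argument for RWKV(); pick one.
STRATEGIES = {
    'gpu_fp16':   'cuda fp16',                       # all layers on GPU; fastest, most VRAM
    'gpu_int8':   'cuda fp16i8',                     # INT8 on GPU; roughly half the VRAM
    'hybrid':     'cuda fp16i8 *10 -> cpu fp32',     # first 10 layers on GPU, rest fp32 on CPU
    'stream_min': 'cuda fp16i8 *0+ -> cpu fp32 *1',  # stream layers through the GPU one at a
                                                     # time; the kind of setup behind the
                                                     # 14B-in-3GB figure
}

model = RWKV(model=PATH, strategy=STRATEGIES['gpu_int8'])
```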

Maintenance & Community

  • Active community with 7k+ members on Discord (https://discord.gg/bDSBUMeFpc).
  • Project lead: BlinkDL.
  • Training sponsored by Stability AI and EleutherAI.

Licensing & Compatibility

  • The primary license is not explicitly stated in the README, but the underlying RWKV model weights are generally available under permissive licenses (e.g., Apache 2.0 for some versions). Compatibility for commercial use should be verified with specific model weights.

Limitations & Caveats

  • Building CUDA kernels requires specific build tools and environment configurations, potentially leading to setup complexity.
  • While the project aims for ChatGPT-like quality, output quality varies significantly with model size, quantization, and inference strategy.
  • The README mentions "Raven"-series models are "almost ChatGPT-like," implying potential differences in capability.
Health Check

  • Last commit: 2 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 43 stars in the last 90 days

Explore Similar Projects

Starred by George Hotz (author of tinygrad; founder of the tiny corp, comma.ai), Daniel Gross (cofounder of Safe Superintelligence), and 13 more.

RWKV-LM by BlinkDL

  • Top 0.2% · 14k stars
  • RNN for LLM, transformer-level performance, parallelizable training
  • Created 4 years ago; updated 1 week ago
  • Starred by Patrick von Platen (core contributor to Hugging Face Transformers and Diffusers), Michael Han (cofounder of Unsloth), and 1 more.

ktransformers by kvcache-ai

  • Top 0.4% · 15k stars
  • Framework for LLM inference optimization experimentation
  • Created 1 year ago; updated 2 days ago