RWKV (Receptance Weighted Key Value)

Architecture RNN
Memory Scaling O(1) in context length (inference time is linear in tokens)
Recommended Interface RWKV Runner
Deployment Guide

# RWKV is best run via the official desktop app
# Download RWKV Runner from GitHub releases

# Or deploy via Python:
git clone https://github.com/BlinkDL/ChatRWKV
cd ChatRWKV
pip install -r requirements.txt
python chat.py
            
⚠️ Requires about 8 GB of VRAM for the 3B-parameter model.
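
As a rough sanity check on that figure, weight memory can be estimated from parameter count and numeric precision. The sketch below assumes fp16 weights (2 bytes per parameter), which this page does not state. `weight_memory_gib` is a hypothetical helper, and activations, recurrent state, and framework overhead are not modeled.

```python
def weight_memory_gib(n_params: float, bytes_per_param: int) -> float:
    """Estimate memory for model weights alone, in GiB.

    Assumption: every parameter is stored at the same precision.
    """
    return n_params * bytes_per_param / 2**30


if __name__ == "__main__":
    # 3B parameters at fp16 (2 bytes each): weights only.
    print(f"fp16 weights: {weight_memory_gib(3e9, 2):.2f} GiB")
    # Same model at 8-bit precision (1 byte each), for comparison.
    print(f"int8 weights: {weight_memory_gib(3e9, 1):.2f} GiB")
```

Under this assumption the weights alone come to a little under 6 GiB, which leaves some headroom for runtime overhead on an 8 GB card.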

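RWKV is an RNN designed to match Transformer-level performance. It has no KV cache: instead of storing keys and values for every past token, it carries a fixed-size recurrent state from one token to the next, so inference VRAM stays constant no matter how long the context grows.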
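
To illustrate why the state stays fixed, here is a minimal toy sketch loosely following the RWKV-4 style WKV recurrence, without the numerical stabilization a real implementation needs. It is not the ChatRWKV code: `wkv_step`, `run`, the decay `w`, the bonus `u`, and the synthetic `k`/`v` inputs are all illustrative assumptions. The counter `kv_cache_entries` only tallies what a Transformer would have had to store for the same tokens.

```python
import math


def wkv_step(state, k, v, w, u):
    """Advance one token. `state` is (num, den), two lists of fixed length."""
    num, den = state
    out, new_num, new_den = [], [], []
    for i in range(len(k)):
        # The current token gets an extra "bonus" weight u when producing output.
        bonus = math.exp(u[i] + k[i])
        out.append((num[i] + bonus * v[i]) / (den[i] + bonus))
        # Past contributions decay by exp(-w); the current token is folded in.
        decay = math.exp(-w[i])
        new_num.append(decay * num[i] + math.exp(k[i]) * v[i])
        new_den.append(decay * den[i] + math.exp(k[i]))
    return out, (new_num, new_den)


def run(tokens, channels=4):
    """Process `tokens` synthetic tokens; return (state size, KV-cache size)."""
    w = [0.5] * channels  # assumed per-channel decay
    u = [0.1] * channels  # assumed per-channel bonus
    state = ([0.0] * channels, [0.0] * channels)
    kv_cache_entries = 0
    for t in range(tokens):
        k = [math.sin(t + i) for i in range(channels)]  # synthetic keys
        v = [math.cos(t + i) for i in range(channels)]  # synthetic values
        _, state = wkv_step(state, k, v, w, u)
        kv_cache_entries += 2 * channels  # a KV cache would grow every token
    return len(state[0]) + len(state[1]), kv_cache_entries


if __name__ == "__main__":
    for n in (10, 1000):
        state_size, cache_size = run(n)
        print(f"{n} tokens: state={state_size} floats, KV cache={cache_size} floats")
```

The recurrent state holds the same number of floats after 10 tokens as after 1000, while the hypothetical KV cache count grows in proportion to the token count.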