RWKV (Receptance Weighted Key Value)

ONLINE

Architecture RNN

Memory Scaling O(1) in context length (linear time)

Recommended Interface RWKV Runner

Deployment Guide


# The easiest way to run RWKV is the RWKV Runner desktop app
# Download it from the RWKV Runner GitHub releases page

# Or deploy via Python:
git clone https://github.com/BlinkDL/ChatRWKV
cd ChatRWKV
pip install -r requirements.txt
python chat.py

⚠️ Requires about 8 GB of VRAM for the 3B-parameter model.
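As a rough sanity check on the VRAM figure, the weights-only footprint can be estimated from the parameter count. This is a back-of-the-envelope sketch, not a measurement: it assumes exactly 3 billion parameters and fp16 or int8 storage, and the helper name `weight_memory_gib` is hypothetical. Activations, the recurrent state, and framework overhead come on top of the number it returns.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Weights-only memory in GiB (ignores activations, state, and overhead)."""
    return num_params * bytes_per_param / 2**30


# Assumption: "3B" means exactly 3e9 parameters; real checkpoints may differ.
NUM_PARAMS = 3e9

fp16_gib = weight_memory_gib(NUM_PARAMS, 2)  # 2 bytes per parameter
int8_gib = weight_memory_gib(NUM_PARAMS, 1)  # 1 byte per parameter

if __name__ == "__main__":
    print(f"fp16 weights: {fp16_gib:.2f} GiB")
    print(f"int8 weights: {int8_gib:.2f} GiB")
```

Under these assumptions the fp16 weights alone take a little under 6 GiB, which is consistent with an 8 GB card leaving some headroom for everything else.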

RWKV is an RNN with Transformer-level performance. Instead of a KV cache that grows with every token, it carries a fixed-size recurrent state, so inference uses a constant amount of VRAM regardless of how long the context gets.
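The fixed-memory claim can be illustrated with a toy example. The sketch below is not the real RWKV kernel: it is a simplified single-channel decayed weighted average, and the names `toy_recurrent_step`, `run_recurrent`, and `run_with_cache`, as well as the `decay` value, are hypothetical. It shows the same output computed two ways: from a growing cache of every past key and value, and from a two-number state that is updated in place.

```python
import math


def toy_recurrent_step(state, k, v, decay):
    """Fold one (key, value) pair into a fixed-size state (toy model only)."""
    num, den = state
    w = math.exp(k)
    num = decay * num + w * v  # decayed weighted sum of values
    den = decay * den + w      # decayed sum of weights
    return (num, den), num / den


def run_recurrent(keys, values, decay=0.9):
    """RNN-style pass: memory is two floats no matter how many tokens."""
    state = (0.0, 0.0)
    out = None
    for k, v in zip(keys, values):
        state, out = toy_recurrent_step(state, k, v, decay)
    return state, out


def run_with_cache(keys, values, decay=0.9):
    """Cache-style pass: stores every past pair, so memory grows per token."""
    cache = []
    out = None
    for k, v in zip(keys, values):
        cache.append((k, v))
        t = len(cache) - 1
        weights = [decay ** (t - i) * math.exp(ki) for i, (ki, _) in enumerate(cache)]
        out = sum(w * vi for w, (_, vi) in zip(weights, cache)) / sum(weights)
    return cache, out


if __name__ == "__main__":
    keys = [0.01 * i for i in range(200)]
    values = [float(i % 7) for i in range(200)]
    state, out_r = run_recurrent(keys, values)
    cache, out_c = run_with_cache(keys, values)
    print(f"state size: {len(state)}, cache size: {len(cache)}")
    print(f"outputs match: {math.isclose(out_r, out_c, rel_tol=1e-9)}")
```

Both functions produce the same final output, but the cache holds one entry per token while the recurrent state stays the same size. In the real model the state is a set of per-layer, per-channel tensors rather than two scalars, yet the scaling argument is the same.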