MuKV

Paper: MuKV: Multi-Grained KV Cache Compression for Long Streaming Video Question-Answering

Code: IMBALDY/MuKV

Background

Long streaming VideoQA has a simple but painful constraint: the video keeps arriving, while the future user questions are unknown. KV-cache methods such as ReKV make this setting more practical. Instead of recomputing historical video tokens when a question arrives, the model can prefill the video stream in advance, store the visual KV cache, retrieve the relevant cache blocks later, and answer with much lower online cost. ...
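To make the prefill, store, and retrieve flow described above concrete, here is a minimal toy sketch in plain Python. It is not the ReKV or MuKV implementation: `CacheBlock`, `StreamingKVStore`, the mean-key block summary, and cosine-similarity ranking are all hypothetical choices made for illustration, and small float lists stand in for real per-layer KV tensors.

```python
import math
from dataclasses import dataclass, field


@dataclass
class CacheBlock:
    """One chunk of the video stream (hypothetical structure)."""
    block_id: int
    keys: list[list[float]]    # toy stand-in for cached key vectors
    values: list[list[float]]  # toy stand-in for cached value vectors

    def summary(self) -> list[float]:
        # Mean key vector, used here as a cheap block-level retrieval index.
        dim = len(self.keys[0])
        return [sum(k[i] for k in self.keys) / len(self.keys) for i in range(dim)]


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class StreamingKVStore:
    blocks: list[CacheBlock] = field(default_factory=list)

    def prefill(self, keys: list[list[float]], values: list[list[float]]) -> int:
        # Called as video arrives, before any question is known.
        block = CacheBlock(len(self.blocks), keys, values)
        self.blocks.append(block)
        return block.block_id

    def retrieve(self, query: list[float], top_k: int = 2) -> list[CacheBlock]:
        # Called when a question arrives: rank blocks by similarity to the
        # query, keep the top_k, and return them in temporal order.
        ranked = sorted(
            self.blocks,
            key=lambda b: cosine(query, b.summary()),
            reverse=True,
        )
        return sorted(ranked[:top_k], key=lambda b: b.block_id)


store = StreamingKVStore()
store.prefill([[1.0, 0.0], [0.9, 0.1]], [[0.1], [0.2]])      # block 0
store.prefill([[0.0, 1.0], [0.1, 0.9]], [[0.3], [0.4]])      # block 1
store.prefill([[-1.0, 0.0], [-0.9, -0.1]], [[0.5], [0.6]])   # block 2

hits = store.retrieve([0.0, 1.0], top_k=1)
print([b.block_id for b in hits])  # [1]
```

The point of the sketch is the cost split: `prefill` runs once per chunk while the stream arrives, and only `retrieve` plus decoding over the selected blocks runs when a question comes in. A real system would store per-layer tensors and would likely use a different retrieval score.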

June 5, 2026 · Updated June 6, 2026 · 15 min