MuKV
Paper: MuKV: Multi-Grained KV Cache Compression for Long Streaming Video Question-Answering
Code: IMBALDY/MuKV

Background

Long streaming VideoQA has a simple but painful constraint: the video keeps arriving, while the future user questions are unknown. KV-cache methods such as ReKV make this setting more practical. Instead of recomputing historical video tokens when a question arrives, the model can prefill the video stream in advance, store the visual KV cache, retrieve the relevant cache blocks later, and answer with much lower online cost. ...
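To make the prefill, store, and retrieve flow described above concrete, here is a toy sketch in plain Python. It is not the ReKV or MuKV implementation: the class `ToyBlockCache`, the fixed `block_size`, the mean-key block summary, and cosine-similarity scoring are all assumptions chosen for illustration, and real systems operate on per-layer, per-head attention tensors rather than short lists of floats.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of equally sized vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyBlockCache:
    """Hypothetical block-wise cache: illustrative only, not a real library API."""

    def __init__(self, block_size):
        self.block_size = block_size
        self.blocks = []    # each entry: (summary_key, keys, values)
        self._pending = []  # tokens waiting to fill the next block

    def prefill(self, key, value):
        """Called as the stream arrives, before any question is known."""
        self._pending.append((key, value))
        if len(self._pending) == self.block_size:
            keys = [k for k, _ in self._pending]
            values = [v for _, v in self._pending]
            # Assumption: summarize a block by the mean of its keys.
            self.blocks.append((mean_vector(keys), keys, values))
            self._pending = []

    def retrieve(self, query, top_k):
        """Return indices of the top_k most similar blocks, in temporal order."""
        ranked = sorted(
            range(len(self.blocks)),
            key=lambda i: cosine(query, self.blocks[i][0]),
            reverse=True,
        )
        return sorted(ranked[:top_k])


# Usage: prefill six toy "tokens", then query once a question arrives.
cache = ToyBlockCache(block_size=2)
stream = [
    ([1.0, 0.0], "a"), ([1.0, 0.1], "b"),
    ([0.0, 1.0], "c"), ([0.1, 1.0], "d"),
    ([-1.0, 0.0], "e"), ([-1.0, -0.1], "f"),
]
for key, value in stream:
    cache.prefill(key, value)

selected = cache.retrieve([0.0, 1.0], top_k=1)
print(selected)  # → [1]
```

Only the selected blocks would then be passed to the model as context for answering, which is where the lower online cost comes from. In this sketch, tokens still sitting in an unfilled block are not retrievable; a real system would need to handle that tail.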