StreamingBench

This post is a small index of the benchmarks that appear repeatedly in recent streaming-video and long-video VLM papers. The main split is simple. Online streaming benchmarks test whether the model can answer while the video is still coming in. Offline long-video benchmarks test long-context video understanding, but usually assume the whole video is already available. Standard video QA benchmarks are useful for comparability, but they are not the real target of streaming-memory papers. The newer VideoRAG papers add another emphasis: ...