vllm/vllm/v1/attention at d9c77308776b4d31f03fad8d4129a3d539154166 - vllm

ElizaWszola d9c7730877 [Performance] Extract kv update ops from MLA attention backends (#34627)

Signed-off-by: ElizaWszola <ewszola@redhat.com>
Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: Di Wu <dw2761@nyu.edu>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>

2026-03-02 10:43:19 -05:00

backends

[Misc] Cleanup useless current_platform import (#35715)

2026-03-02 09:36:54 +00:00

ops

Flashinfer cuDNN backend for Qwen3 VL ViT attention (#34580)

2026-02-27 20:20:23 +08:00

__init__.py

[V1] Implement vLLM V1 [1/N] (#9289)

2024-10-22 01:24:07 -07:00

backend.py

[Performance] Extract kv update ops from MLA attention backends (#34627)

2026-03-02 10:43:19 -05:00

selector.py

[Attn,KV-cache] Use per-head scales in the attention selector (#34281)

2026-02-24 09:02:43 -05:00