vllm/tests/v1/e2e at 0e9358c11daf3f5a2d4e8f80a100b6d5e070e1a1 - vllm

Files

Andrii Skliar cd7643015e [Feature] Support per-draft-model MoE backend via --speculative-config (#37880)

Signed-off-by: Andrii Skliar <askliar@nvidia.com>
Signed-off-by: [Andrii Skliar] <askliar@nvidia.com>
Co-authored-by: Andrii Skliar <askliar@nvidia.com>

2026-03-25 14:31:52 +00:00

general

[V0 Deprecation] Refactor kv cache from list to element (#37487)

2026-03-23 20:10:11 -07:00

spec_decode

[Feature] Support per-draft-model MoE backend via --speculative-config (#37880)

2026-03-25 14:31:52 +00:00

__init__.py

[V1] Implement Cascade Attention (#11635)

2025-01-01 21:56:46 +09:00

test_hybrid_chunked_prefill.py

[Attention] Support distinguishing between short extends and decodes (#37303)

2026-03-20 10:49:36 -07:00