vllm/vllm/model_executor at 3633035a3fdee20cca8a8deb72490dc9cacea0f8 - vllm

Matthew Bonanni 66e674cdd5 [Attention][UX][1/N] Add AttentionConfig and change attention env vars to CLI arguments (#26315)

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>

2025-12-05 09:48:43 -08:00

layers

[Compressed Tensors] Add XPU wNa16 support (#29484)

2025-12-05 22:02:09 +08:00

model_loader

[Bugfix][Quantization] Support BF16 tensors on GGUF (#29948)

2025-12-03 10:33:46 +00:00

models

[Attention][UX][1/N] Add AttentionConfig and change attention env vars to CLI arguments (#26315)

2025-12-05 09:48:43 -08:00

warmup

[Core] Encoder separation for Encode-Prefill-Decode Disaggregation (#25233)