vllm/vllm/model_executor at 32693db8cea5cb9099c4e9d9876def97fdbc5387 - vllm

HZY 32693db8ce [Bugfix] [Qwen3.5]Fix Qwen3.5 FP8 quantization: tuple shard_id weight loading (#35289)

Signed-off-by: daowu.hzy <daowu.hzy@alibaba-inc.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>

2026-02-26 18:26:15 +08:00

kernels

[CPU][Feat] Enable KleidiAI INT8_W4A8 for all input dtypes (#34890)

2026-02-26 05:00:10 +00:00

layers

[Bugfix] [Qwen3.5]Fix Qwen3.5 FP8 quantization: tuple shard_id weight loading (#35289)

2026-02-26 18:26:15 +08:00

model_loader

Revert "[Misc] Enable weights loading tracking for quantized models" (#35309)

2026-02-25 09:20:15 -08:00

models

[Model] Ring 2.5 (#35102)

2026-02-26 02:17:11 -08:00

offloader

[offloader] v2: Hide weight onloading latency via prefetching (#29941)

2026-02-25 17:20:59 -08:00

warmup

[Platform] Add current_platform.num_compute_units interface (#35042)

2026-02-24 22:22:49 -08:00

__init__.py

[Platform] Deprecate seed_everything (#31659)

2026-01-04 18:34:04 -08:00

custom_op.py

[Model Bash][DSR1] Add selective dynamic shape marking for CustomOp (#34900)

2026-02-21 19:28:01 -05:00

parameter.py

[QeRL] Layerwise Reloading (#32133)

2026-01-30 08:50:05 -07:00

utils.py

[BugFix] Fix EPLB fail for MoeFP4 model with Marlin backend (#33262)

2026-01-29 16:52:11 +08:00