vllm/vllm/model_executor at 4ca3fa6bb4633fed1196292f764ce8cf13f647b5

Files

Jim Smith 4120a05ff1 Fix AttributeError in Qwen3.5 GDN layers with quantized models (#37448)

Signed-off-by: Jim Smith <jim@joshua8.ai>
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
Co-authored-by: Xin Yang <105740670+xyang16@users.noreply.github.com>

2026-03-19 19:21:14 -04:00

kernels

[Torch 2.11] Guard torch._C._cpu attribute checks for forward compatibility (#35673)

2026-03-17 18:47:59 +00:00

layers

[MoE Refactor] Rename "naive" all2all backend (#36294)

2026-03-19 15:50:34 -04:00

model_loader

[BUG] Exclude SKIP_TENSORS from get_layer_size() + new weight sync example for dpep (#37334)

2026-03-19 00:45:10 +00:00

models

Fix AttributeError in Qwen3.5 GDN layers with quantized models (#37448)

2026-03-19 19:21:14 -04:00

offloader

Bugfix for offloading+prefetch for GLM-4.7-FP8 (#37178)

2026-03-17 21:22:09 +08:00

warmup

[Bugfix] Fix AttributeError when serving MXFP8 models with DeepGEMM installed (#37358)

2026-03-19 17:58:33 +00:00

__init__.py

[Platform] Deprecate seed_everything (#31659)

2026-01-04 18:34:04 -08:00

custom_op.py

Add ability to replace oot ops when using lora (#37181)

2026-03-16 18:04:15 -07:00

parameter.py

[QeRL] Layerwise Reloading (#32133)

2026-01-30 08:50:05 -07:00

utils.py

[BugFix] Fix EPLB fail for MoeFP4 model with Marlin backend (#33262)

2026-01-29 16:52:11 +08:00