vllm/vllm/model_executor at d1481ba78323bcba5937f5ff74f3a8d27ab54f88

biondizzle/vllm

bnellnm d1481ba783 [MoE Refactor] Introduce MoERunner abstraction and move execution logic from FusedMoE to DefaultMoERunner (#32344)

Signed-off-by: Bill Nell <bnell@redhat.com>

2026-02-10 19:51:07 -05:00

[MoE Refactor] Introduce MoERunner abstraction and move execution logic from FusedMoE to DefaultMoERunner (#32344)

2026-02-10 19:51:07 -05:00

[Misc][Spec Decode] Support different load config for draft model (#34022)

2026-02-10 14:52:43 -08:00

[Bugfix] Fix mamba cache dtype for Qwen3.5 (#34200)

2026-02-10 13:12:31 -08:00

[Kernel] Add KernelConfig flag to enable/disable FlashInfer autotune (#34006)

2026-02-07 05:24:44 -08:00

__init__.py

[Platform] Deprecate seed_everything (#31659)

2026-01-04 18:34:04 -08:00

custom_op.py

[torch.compile] Compile CustomOp.forward_native for SiluAndMul and QuantFP8 to avoid raw torch ops inside opaque custom ops (#32806)

2026-01-22 19:52:26 -08:00

parameter.py

[QeRL] Layerwise Reloading (#32133)

2026-01-30 08:50:05 -07:00

utils.py

[BugFix] Fix EPLB failure for MoeFP4 model with Marlin backend (#33262)

2026-01-29 16:52:11 +08:00