vllm/vllm/model_executor at f0bca83ee4b6aa6d63da66d37dd69929bdcfc1fe - vllm

Files

Dimitrios Bariamis f0bca83ee4 Add support for Mistral Large 3 inference with Flashinfer MoE (#33174)

Signed-off-by: Dimitrios Bariamis <12195802+dbari@users.noreply.github.com>
Co-authored-by: Dimitrios Bariamis <12195802+dbari@users.noreply.github.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>

2026-01-30 22:48:27 -08:00

layers

Add support for Mistral Large 3 inference with Flashinfer MoE (#33174)

2026-01-30 22:48:27 -08:00

model_loader

fix QERL attention import path (#33432)

2026-01-30 09:29:09 -08:00

models

Add support for Mistral Large 3 inference with Flashinfer MoE (#33174)

2026-01-30 22:48:27 -08:00

warmup

[MoE Refactor] Integrate Naive Prepare Finalize into MK (#32567)

2026-01-27 01:28:02 +00:00

__init__.py

[Platform] Deprecate seed_everything (#31659)

2026-01-04 18:34:04 -08:00

custom_op.py

[torch.compile] Compile CustomOp.forward_native for SiluAndMul and QuantFP8 to avoid raw torch ops inside opaque custom ops (#32806)

2026-01-22 19:52:26 -08:00

parameter.py

[QeRL] Layerwise Reloading (#32133)

2026-01-30 08:50:05 -07:00

utils.py

[BugFix] Fix EPLB fail for MoeFP4 model with Marlin backend (#33262)

2026-01-29 16:52:11 +08:00