vllm/vllm/model_executor at e01ff5c070f47a0562711e61f5f885d4ccf241c9

biondizzle/vllm

amirkl94 e01ff5c070 Bugfix: Pass router logits dtype in nemotron shared experts (#32669)

Signed-off-by: Amir Klein <203507526+amirkl94@users.noreply.github.com>

2026-01-29 09:36:34 +00:00

[PluggableLayer][2/N] Apply PluggableLayer to linear layers (#33152)

2026-01-29 16:53:15 +08:00

[5/N][Attention] Finish eliminating vllm/attention folder (#32064)

2026-01-27 10:02:51 -05:00

Bugfix: Pass router logits dtype in nemotron shared experts (#32669)

2026-01-29 09:36:34 +00:00

[MoE Refactor] Integrate Naive Prepare Finalize into MK (#32567)

2026-01-27 01:28:02 +00:00

__init__.py

[Platform] Deprecate seed_everything (#31659)

2026-01-04 18:34:04 -08:00

custom_op.py

[torch.compile] Compile CustomOp.forward_native for SiluAndMul and QuantFP8 to avoid raw torch ops inside opaque custom ops (#32806)

2026-01-22 19:52:26 -08:00

parameter.py

[Quantization][Deprecation] Remove BitBlas (#32683)

2026-01-28 11:06:22 +00:00

utils.py

[BugFix] Fix EPLB fail for MoeFP4 model with Marlin backend (#33262)

2026-01-29 16:52:11 +08:00