vllm/vllm/model_executor at fcb9df99bd7d0e532bcf2891db2a85bd927605fe

biondizzle/vllm

Roberto L. Castro fcb9df99bd [Perf][Kernel] Optimize FP4 quantization kernels (SM100F) (#32520)

Signed-off-by: LopezCastroRoberto <rocastro@redhat.com>

2026-01-24 18:45:27 -07:00

[Perf][Kernel] Optimize FP4 quantization kernels (SM100F) (#32520)

2026-01-24 18:45:27 -07:00

Add llmcompressor fp8 kv-cache quant (per-tensor and per-attn_head) (#30141)

2026-01-22 13:29:57 -07:00

feat: Complete LoRA support for MiniMaxM2 Fixes #32736 (#32763)

2026-01-24 20:48:46 +08:00

[MoE Refactor] Oracle Select FP8+NVFP4 Kernels In Priority (#32414)

2026-01-21 08:22:33 -05:00

__init__.py

[Platform] Deprecate seed_everything (#31659)

2026-01-04 18:34:04 -08:00

custom_op.py

[torch.compile] Compile CustomOp.forward_native for SiluAndMul and QuantFP8 to avoid raw torch ops inside opaque custom ops (#32806)

2026-01-22 19:52:26 -08:00

parameter.py

[Docs] Replace rst style double-backtick with md single-backtick (#27091)

2025-10-17 02:47:34 -07:00

utils.py

[Platform] Deprecate seed_everything (#31659)

2026-01-04 18:34:04 -08:00