vllm/vllm/model_executor at 28028dff2fed19e0face08a303b86273d954979a

biondizzle/vllm

Yan Ma 58cfe0dc44 Fix phi4-mm and remove cuda binding (#35964)

Signed-off-by: Yan Ma <yan.ma@intel.com>

2026-03-05 01:08:05 +08:00

[CPU][Feat] Enable KleidiAI INT8_W4A8 for all input dtypes (#34890)

2026-02-26 05:00:10 +00:00

[Core] Add All-to-All communication backend for DCP (#34883)

2026-03-04 10:01:57 -05:00

[Hardware] Replace torch.cuda.empty_cache with torch.accelerator.empty_cache (#30681)

2026-03-04 09:49:47 +00:00

Fix phi4-mm and remove cuda binding (#35964)

2026-03-05 01:08:05 +08:00

[offloader] v2: Hide weight onloading latency via prefetching (#29941)

2026-02-25 17:20:59 -08:00

[MoE Refactor] Create MK for TRTLLM Kernels (#32564)

2026-03-03 10:39:50 -08:00

__init__.py

[Platform] Deprecate seed_everything (#31659)

2026-01-04 18:34:04 -08:00

custom_op.py

[Model Bash][DSR1] Add selective dynamic shape marking for CustomOp (#34900)

2026-02-21 19:28:01 -05:00

parameter.py

[QeRL] Layerwise Reloading (#32133)

2026-01-30 08:50:05 -07:00

utils.py

[BugFix] Fix EPLB fail for MoeFP4 model with Marlin backend (#33262)

2026-01-29 16:52:11 +08:00