vllm/vllm/model_executor at a8f66cbde878d1ddca2288313041dbe3a556dbc4 - vllm

Files

Kunshang Ji 16d2ad1d38 [Hardware] Replace torch.cuda.empty_cache with torch.accelerator.empty_cache (#30681)

Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
Signed-off-by: Kunshang Ji <jikunshang95@gmail.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

2026-03-04 09:49:47 +00:00

kernels

[CPU][Feat] Enable KleidiAI INT8_W4A8 for all input dtypes (#34890)

2026-02-26 05:00:10 +00:00

layers

[Hardware] Replace torch.cuda.empty_cache with torch.accelerator.empty_cache (#30681)

2026-03-04 09:49:47 +00:00

model_loader

[Hardware] Replace torch.cuda.empty_cache with torch.accelerator.empty_cache (#30681)

2026-03-04 09:49:47 +00:00

models

[Bugfix] Add missing dynamic_arg_dims for Qwen3-ASR torch.compile (#35869)

2026-03-04 08:29:01 +00:00

offloader

[offloader] v2: Hide weight onloading latency via prefetching (#29941)

2026-02-25 17:20:59 -08:00

warmup

[MoE Refactor] Create MK for TRTLLM Kernels (#32564)

2026-03-03 10:39:50 -08:00

__init__.py

[Platform] Deprecate seed_everything (#31659)

2026-01-04 18:34:04 -08:00

custom_op.py

[Model Bash][DSR1] Add selective dynamic shape marking for CustomOp (#34900)

2026-02-21 19:28:01 -05:00

parameter.py

[QeRL] Layerwise Reloading (#32133)

2026-01-30 08:50:05 -07:00

utils.py

[BugFix] Fix EPLB fail for MoeFP4 model with Marlin backend (#33262)

2026-01-29 16:52:11 +08:00