vllm/vllm/model_executor at c0817e4d39c78335952fb4b6bfae3cd3e45ac4c3

biondizzle/vllm

bsliu c0817e4d39 [Model] Add support for Cheers multimodal model (#38788)

Signed-off-by: bsliu <1187291748@qq.com>
Signed-off-by: 吴炳贤 <wubingxian24@mails.ucas.ac.cn>

2026-04-02 21:01:40 +08:00

[OOT] Add OOT support for linear kernel (#37989)

2026-03-31 14:33:21 +08:00

[Feature] KV cache per-token-head INT8/FP8 quantization (#38378)

2026-04-02 08:13:26 -04:00

[Quantization] Consolidate dummy format logic into DummyModelLoader (#38637)

2026-03-31 22:20:45 +00:00

[Model] Add support for Cheers multimodal model (#38788)

2026-04-02 21:01:40 +08:00

Bugfix for offloading+prefetch for GLM-4.7-FP8 (#37178)

2026-03-17 21:22:09 +08:00

[Bugfix] Fix AttributeError when serving MXFP8 models with DeepGEMM installed (#37358)

2026-03-19 17:58:33 +00:00

__init__.py

…

custom_op.py

Add ability to replace OOT ops when using LoRA (#37181)

2026-03-16 18:04:15 -07:00

parameter.py

[Mypy] Fix mypy for vllm/model_executor (except vllm/model_executor/layers) (#37904)

2026-03-24 17:14:01 +00:00

utils.py

[BugFix] Fix EPLB fail for MoeFP4 model with Marlin backend (#33262)

2026-01-29 16:52:11 +08:00