biondizzle/vllm
vllm/vllm/lora at commit ef96fa3f1f15e08769e70ee3e335b5b4d7e6a6ee
Latest commit: 543c23be78 [LoRA][Perf] Improve FusedMoE LoRA performance for small rank (#32019)
Author:        Xin Yang <xyangx@amazon.com>
Date:          2026-01-10 11:04:18 -08:00
Name                Last modified               Last commit
layers/             2026-01-10 11:04:18 -08:00  [LoRA][Perf] Improve FusedMoE LoRA performance for small rank (#32019)
ops/                2026-01-10 11:04:18 -08:00  [LoRA][Perf] Improve FusedMoE LoRA performance for small rank (#32019)
punica_wrapper/     2026-01-07 16:07:16 +08:00  [Refactor][TPU] Remove torch_xla path and use tpu-inference (#30808)
__init__.py         2024-01-23 15:26:37 -08:00  [Experimental] Add multi-LoRA support (#1804)
lora_model.py       2025-12-23 19:09:15 +08:00  [Bugfix] Fix MoE LoRA bin/pt loading (#31161)
lora_weights.py     2025-11-27 05:56:28 -08:00  [LoRA] Continue optimizing MoE LoRA weight loading (#29322)
model_manager.py    2026-01-08 18:46:05 -08:00  [Bugfix] Fix typo in FusedMoE LoRA reshape comment (#31992)
peft_helper.py      2025-10-12 09:51:31 -07:00  Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)
request.py          2025-12-26 05:10:39 +00:00  [Misc] Fix Qwen2-MoE shared_expert_gate (#31339)
resolver.py         2025-10-12 09:51:31 -07:00  Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)
utils.py            2025-12-23 19:09:15 +08:00  [Bugfix] Fix MoE LoRA bin/pt loading (#31161)
worker_manager.py   2025-12-26 04:48:20 -08:00  [Core] Initialize LoRA support for tower and connector in multi-modal models (#26674)
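For orientation, a minimal sketch of how this directory surfaces in vLLM's offline API: the engine is started with LoRA support enabled, and each generate call can carry a LoRARequest (defined in request.py above) naming the adapter to apply. The base model name and adapter path below are placeholders, not taken from this listing.

from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# Placeholder base model; enable_lora switches on the multi-LoRA machinery
# coordinated by model_manager.py and worker_manager.py.
llm = LLM(model="meta-llama/Llama-2-7b-hf", enable_lora=True)

# A LoRARequest gives the adapter a name, an integer id, and a path to its
# weights on disk (loaded through lora_model.py / lora_weights.py).
sql_lora = LoRARequest("sql_adapter", 1, "/path/to/sql-lora-adapter")

outputs = llm.generate(
    ["Write a SQL query that lists all users."],
    SamplingParams(temperature=0.0, max_tokens=64),
    lora_request=sql_lora,
)
print(outputs[0].outputs[0].text)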