biondizzle/vllm
bb59c902480ddb054e7f3f0762b386e0d4e269bd
vllm/vllm/model_executor
bnellnm  5bff999d12  [Bugfix] Add method to swap quant_method on FusedMoE to fix LoRA issues (#34453)
Signed-off-by: Bill Nell <bnell@redhat.com>
2026-02-15 20:10:50 -08:00
layers  [Bugfix] Add method to swap quant_method on FusedMoE to fix LoRA issues (#34453)  2026-02-15 20:10:50 -08:00
model_loader  [Feature] Support CPU Offloading without Pytorch Pinned Memory that leads to doubled allocation (#32993)  2026-02-13 08:11:26 -08:00
models  [CI/Build] Enable tests for recent day-0 new models (#34585)  2026-02-15 18:17:04 -08:00
warmup  [Kernel] Add KernelConfig flag to enable/disable FlashInfer autotune (#34006)  2026-02-07 05:24:44 -08:00
__init__.py  [Platform] Deprecate seed_everything (#31659)  2026-01-04 18:34:04 -08:00
custom_op.py  [torch.compile] Compile CustomOp.forward_native for SiluAndMul and QuantFP8 to avoid raw torch ops inside opaque custom ops (#32806)  2026-01-22 19:52:26 -08:00
parameter.py  [QeRL] Layerwise Reloading (#32133)  2026-01-30 08:50:05 -07:00
utils.py  [BugFix] Fix EPLB fail for MoeFP4 model with Marlin backend (#33262)  2026-01-29 16:52:11 +08:00