biondizzle/vllm
192ad4648b2066ebdf1fa04ad84f24bdf0cd6533
vllm/vllm/model_executor
Isotr0py 192ad4648b [Bugfix] Fix interns1-pro initialization and PP (#33793)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2026-02-04 17:54:45 +00:00
..
layers | [PERF] Change GDN Attention State Layout from [N, HV, K, V] to [N, HV, V, K] (#33291) | 2026-02-04 11:20:52 +00:00
model_loader | fix memory for online fp8 quantization with streaming weight load (#31914) | 2026-02-02 14:17:42 -05:00
models | [Bugfix] Fix interns1-pro initialization and PP (#33793) | 2026-02-04 17:54:45 +00:00
warmup | [MoE Refactor] Integrate Naive Prepare Finalize into MK (#32567) | 2026-01-27 01:28:02 +00:00
__init__.py | [Platform] Deprecate seed_everything (#31659) | 2026-01-04 18:34:04 -08:00
custom_op.py | [torch.compile] Compile CustomOp.forward_native for SiluAndMul and QuantFP8 to avoid raw torch ops inside opaque custom ops (#32806) | 2026-01-22 19:52:26 -08:00
parameter.py | [QeRL] Layerwise Reloading (#32133) | 2026-01-30 08:50:05 -07:00
utils.py | [BugFix] Fix EPLB fail for MoeFP4 model with Marlin backend (#33262) | 2026-01-29 16:52:11 +08:00