biondizzle/vllm
Files
vllm/model_executor at commit 89e08d6d180c76019daa5aa1bbf7759dfaedde2e
Latest commit: 89e08d6d18 — [Model] Add Olmo3 model implementation (#24534) — 2025-09-13 03:26:21 +00:00
Signed-off-by: Shane A <shanea@allenai.org>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
layers/               [Attention][FlashInfer] Enable FP8 FlashInfer (TRTLLM) MLA decode (#24705)  2025-09-12 15:45:53 -06:00
model_loader/         [Docs] Fix warnings in mkdocs build (continued) (#24740)                     2025-09-12 06:43:15 -07:00
models/               [Model] Add Olmo3 model implementation (#24534)                              2025-09-13 03:26:21 +00:00
warmup/               [Startup] Make DeepGEMM warmup scale with max-num-batched-tokens (#24693)    2025-09-11 20:10:19 -04:00
__init__.py           [Misc] Add SPDX-FileCopyrightText (#19100)                                   2025-06-03 11:20:17 -07:00
custom_op.py          [V0 deprecation] Deprecate V0 Neuron backend (#21159)                        2025-09-06 16:15:18 -07:00
parameter.py          [Core] Allow disabling TP sharding for parallel Linear layer (#23024)        2025-09-05 22:53:58 -07:00
sampling_metadata.py  [Doc]: fix typos in Python comments (#24042)                                  2025-09-01 19:07:45 -07:00
utils.py              [Bugfix] Fix _synced_weight_loader (#24565)                                   2025-09-11 16:52:33 +08:00