biondizzle/vllm
66e86f1dbd565292a253e7d2d6851f65dc4f14ba
vllm/vllm/v1/pool
wang.yuqi  a9b4f07ba2  [Frontend] Re-enable running MaxSim on GPU (#38620)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
2026-04-03 00:03:13 +08:00
__init__.py          Support embedding models in V1 (#16188)          2025-06-18 21:36:33 -07:00
late_interaction.py  [Frontend] Re-enable running MaxSim on GPU (#38620)  2026-04-03 00:03:13 +08:00
metadata.py          [Perf] Remove redundant device copies for CPU-only pooling token IDs, 48.9% E2E throughput improvement (#38139)  2026-03-29 18:12:50 +00:00