biondizzle/vllm
eac2dc2b410dc11af4b424802e86ef9d36bac28a
vllm/tests/v1/worker
Wentao Ye 7279374f91 [Perf] Compute maxsim in worker side, reducing redundant copies, 2.7% E2E throughput improvement (#36159)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
2026-03-09 20:55:58 -07:00
__init__.py | [V1] Adding min tokens/repetition/presence/frequence penalties to V1 sampler (#10681) | 2024-12-26 19:02:58 +09:00
test_gpu_input_batch.py | [Chore] Separate out vllm.utils.platform_utils.py (#27374) | 2025-10-23 19:08:06 +00:00
test_gpu_model_runner.py | [V0 Deprecation] Remove unused swap_space parameter (#36216) | 2026-03-07 22:09:55 +08:00
test_gpu_profiler.py | Support custom URI schemes and trace handlers for profiler (#32393) | 2026-01-22 09:45:40 -08:00
test_late_interaction_runner.py | [Perf] Compute maxsim in worker side, reducing redundant copies, 2.7% E2E throughput improvement (#36159) | 2026-03-09 20:55:58 -07:00
test_mamba_utils.py | [perf] Use pinned memory for async H2D transfer in do_mamba_copy_block (#35480) | 2026-02-28 01:50:37 +08:00
test_utils.py | [5/N][Attention] Finish eliminating vllm/attention folder (#32064) | 2026-01-27 10:02:51 -05:00
test_worker_memory_snapshot.py | [1/N] Elastic EP Milestone 2 (#34861) | 2026-02-28 04:46:42 +00:00