biondizzle/vllm
vllm/tests/v1/worker at commit 4e824d1c835d9b57db621297e8d9119bfc32fb2e
Latest commit: c59a132f96 by Wentao Ye, 2026-03-23 20:10:11 -07:00
[V0 Deprecation] Refactor kv cache from list to element (#37487)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
File | Last commit | Date
__init__.py | [V1] Adding min tokens/repetition/presence/frequence penalties to V1 sampler (#10681) | 2024-12-26 19:02:58 +09:00
test_gpu_input_batch.py | [Bugfix] Fix pooling non-determinism from pinned prompt_lens aliasing (#37775) | 2026-03-22 03:22:24 +00:00
test_gpu_model_runner.py | [V0 Deprecation] Refactor kv cache from list to element (#37487) | 2026-03-23 20:10:11 -07:00
test_gpu_profiler.py | Support custom URI schemes and trace handlers for profiler (#32393) | 2026-01-22 09:45:40 -08:00
test_late_interaction_runner.py | [Perf] Optimize compute maxsim using batched version, 3.2% E2E throughput improvement (#36710) | 2026-03-12 08:37:01 +08:00
test_mamba_utils.py | [Hybrid] calling get_mamba_groups() once at MambaCopyBuffers.create() (#37318) | 2026-03-21 09:29:43 +00:00
test_utils.py | [V0 Deprecation] Refactor kv cache from list to element (#37487) | 2026-03-23 20:10:11 -07:00
test_worker_memory_snapshot.py | [Hardware] Replace torch.cuda.device_count/current_device/set_device API (#36145) | 2026-03-12 07:57:47 -07:00