biondizzle/vllm
12701e8af29ba20a4bfc37edf3b30901e8789d18
vllm/vllm/v1/sample/ops
Michael Goin 51b2333be1 [Perf] Optimize top-k search in apply_top_k_top_p_triton sampler (#37225)
Signed-off-by: mgoin <mgoin64@gmail.com>
2026-03-17 11:35:17 -07:00
File                  Updated                    Last commit
__init__.py           2024-12-27 09:32:38 +09:00 [V1] Use FlashInfer Sampling Kernel for Top-P & Top-K Sampling (#11394)
bad_words.py          2026-01-07 15:46:42 -05:00 [BugFix] Fix bad words with speculative decoding (#31908)
logprobs.py           2025-10-05 07:06:22 -07:00 Convert formatting to use ruff instead of yapf + isort (#26247)
penalties.py          2025-11-01 10:51:24 -07:00 [BugFix] Fix mixed penalties batch with async scheduling (#27910)
topk_topp_sampler.py  2026-02-24 22:22:31 -08:00 remove cuda check in top_k_top_p_triton kernel (#35011)
topk_topp_triton.py   2026-03-17 11:35:17 -07:00 [Perf] Optimize top-k search in apply_top_k_top_p_triton sampler (#37225)