vllm/vllm/v1/sample at 12701e8af29ba20a4bfc37edf3b30901e8789d18 - vllm

Sungjae Lee 4731884796 [Feature] limit thinking tokens (hard limit) (#20859)

Signed-off-by: Sungjae Lee <33976427+llsj14@users.noreply.github.com>
Signed-off-by: Sungjae Lee <sung-jae.lee@navercorp.com>
Signed-off-by: Chauncey <chaunceyjiang@gmail.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

2026-03-24 09:53:07 -07:00

logits_processor

[Feature] limit thinking tokens (hard limit) (#20859)

2026-03-24 09:53:07 -07:00

ops

[Perf] Optimize top-k search in apply_top_k_top_p_triton sampler (#37225)

2026-03-17 11:35:17 -07:00

__init__.py

[V1] Implement vLLM V1 [1/N] (#9289)

2024-10-22 01:24:07 -07:00

metadata.py

Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)