biondizzle/vllm
vllm/vllm/v1/worker @ e74ff409e0f8f3cacb8a251a1cae8b478721cead
Latest commit: e74ff409e0 [TPU] support disabling xla compilation cache (#15567)
Signed-off-by: Chengji Yao <chengjiyao@google.com>
2025-03-27 00:09:28 +00:00
File                        Last commit                                                                      Date
__init__.py                 [V1] Implement vLLM V1 [1/N] (#9289)                                             2024-10-22 01:24:07 -07:00
block_table.py              Update deprecated Python 3.8 typing (#13971)                                     2025-03-02 17:34:51 -08:00
gpu_input_batch.py          [V1] Aggregate chunked prompt logprobs in model runner (#14875)                  2025-03-24 12:27:57 -04:00
gpu_model_runner.py         [V1][Spec Decode] Update target_logits in place for rejection sampling (#15427)  2025-03-24 21:04:41 -07:00
gpu_worker.py               [v1] Refactor KVCacheConfig (#14079)                                             2025-03-21 04:56:27 -07:00
lora_model_runner_mixin.py  [Kernels] LoRA - Retire SGMV and BGMV Kernels (#14685)                           2025-03-18 09:47:53 +00:00
tpu_model_runner.py         [V1] TPU - Revert to exponential padding by default (#15565)                     2025-03-26 21:35:05 +00:00
tpu_worker.py               [TPU] support disabling xla compilation cache (#15567)                           2025-03-27 00:09:28 +00:00
worker_base.py              [v1] Refactor KVCacheConfig (#14079)                                             2025-03-21 04:56:27 -07:00