biondizzle/vllm
vllm/v1/attention/backends at commit 0578e5a462dff347ee475913da7c2f91f60c9bc3

Latest commit: 0578e5a462 by Chengji Yao
[Hardware][TPU] Enable ragged paged attention kernel and resolve recompilation issue (#14310)
Signed-off-by: Chengji Yao <chengjiyao@google.com>
2025-03-06 23:31:05 +00:00

Name           Last commit message                                                                             Last commit date
mla/           [V1][Bugfix] Standardize quantized kv cache rejection for attention backends (#14221)           2025-03-06 14:18:29 -08:00
__init__.py    [V1] Implement vLLM V1 [1/N] (#9289)                                                            2024-10-22 01:24:07 -07:00
flash_attn.py  [V1][Bugfix] Standardize quantized kv cache rejection for attention backends (#14221)           2025-03-06 14:18:29 -08:00
pallas.py      [Hardware][TPU] Enable ragged paged attention kernel and resolve recompilation issue (#14310)   2025-03-06 23:31:05 +00:00
rocm_attn.py   [Kernel] [V1] Improved performance for V1 Triton (ROCm) backend (#14152)                        2025-03-06 07:39:16 -08:00
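
The files listed above are vLLM V1's per-hardware attention backends: flash_attn.py (FlashAttention on GPU), pallas.py (Pallas kernels on TPU), rocm_attn.py (Triton on ROCm), and mla/ (multi-head latent attention variants). The commit titles refer to paged attention, where each sequence's KV cache lives in fixed-size physical blocks indexed through a per-sequence block table, letting ragged (variable-length) sequences share one cache pool. The sketch below illustrates only that block-table bookkeeping under assumed names (PagedKVCache, append_token, BLOCK_SIZE); these are illustrative, not vLLM's actual classes or signatures.

# Minimal sketch of paged KV-cache block-table bookkeeping.
# All names here are hypothetical; this is not vLLM's API.

from dataclasses import dataclass, field

BLOCK_SIZE = 16  # tokens per physical cache block (assumed value)

@dataclass
class PagedKVCache:
    """Maps each sequence's logical token positions onto fixed-size physical blocks."""
    num_blocks: int
    free_blocks: list[int] = field(default_factory=list)
    # block_tables[seq_id] lists the physical block ids backing that sequence.
    block_tables: dict[int, list[int]] = field(default_factory=dict)
    seq_lens: dict[int, int] = field(default_factory=dict)

    def __post_init__(self):
        self.free_blocks = list(range(self.num_blocks))

    def append_token(self, seq_id: int) -> tuple[int, int]:
        """Reserve a cache slot for one new token; return (physical_block, offset)."""
        table = self.block_tables.setdefault(seq_id, [])
        pos = self.seq_lens.get(seq_id, 0)
        if pos % BLOCK_SIZE == 0:          # current block is full (or sequence is new)
            table.append(self.free_blocks.pop())
        self.seq_lens[seq_id] = pos + 1
        return table[pos // BLOCK_SIZE], pos % BLOCK_SIZE

cache = PagedKVCache(num_blocks=8)
for _ in range(20):                        # 20 tokens spill into a second block
    block, offset = cache.append_token(seq_id=0)
print(cache.block_tables[0], cache.seq_lens[0])  # e.g. [7, 6] 20

An attention kernel over such a cache gathers a sequence's keys and values block by block via the block table rather than from one contiguous buffer, which is why block size and table shapes matter for kernel design and, on TPU, for recompilation behavior as the #14310 commit title suggests.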