biondizzle/vllm
vllm/vllm/attention at commit ec870fba9a59b8287fa205e4c35def4d3d153080
Latest commit cfbb8c930f by Nicolò Lucchesi, 2025-03-21 08:50:39 -07:00:
[TPU][V1] MHA Pallas backend (#15288)
Signed-off-by: NickLucche <nlucches@redhat.com>
Name          Last commit                                                                                   Date
backends      [Bugfix] detect alibi and revert to FA2 (#15231)                                              2025-03-20 19:20:16 -07:00
ops           [Kernel] [V1] Further optimizations to ROCm (Triton) Backend to better handle GQA. (#14431)   2025-03-13 20:42:27 -07:00
__init__.py   [Attention] Flash Attention 3 - fp8 (#14570)                                                  2025-03-20 01:14:20 -04:00
layer.py      [TPU][V1] MHA Pallas backend (#15288)                                                         2025-03-21 08:50:39 -07:00
selector.py   Correct capitalisation: VLLM -> vLLM (#14562)                                                 2025-03-10 16:36:21 +00:00