biondizzle/vllm
59e17dd4a0f058e0bef38a371316b87fee3e882e
vllm/vllm/attention
Matthew Bonanni  7ba32aa60b  [Attention][FlashInfer] Enable FP8 FlashInfer (TRTLLM) MLA decode (#24705)  2025-09-12 15:45:53 -06:00
Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com>
Name         Last updated                Last commit
..
backends     2025-09-12 15:45:53 -06:00  [Attention][FlashInfer] Enable FP8 FlashInfer (TRTLLM) MLA decode (#24705)
layers       2025-09-11 01:48:25 -07:00  [Bugfix] Fix incorrect import of CacheConfig (#24631)
ops          2025-09-10 13:59:55 -07:00  [torch.compile][ROCm][V1] Enable attention output FP8 fusion for V1 attention backends (#19767)
utils        2025-09-04 02:47:59 -07:00  [Attention] FlashAttn MLA (#14258)
__init__.py  2025-08-20 17:14:59 -07:00  Remove duplicate entry in vllm.attention.__all__ (#23296)
layer.py     2025-09-12 21:27:24 +08:00  [Multi Modal] Add FA3 in VIT (#24347)
selector.py  2025-08-12 03:21:44 -07:00  [gpt-oss] Enable gpt-oss on ampere (#22714)