biondizzle / vllm
vllm/attention/ops/blocksparse_attention

Latest commit: 6ffa3f314c59e42238f1c5f923ff2839e0af9698 "[CI/Build] Avoid CUDA initialization (#8534)" by Cyrus Leung, 2024-09-18 10:38:11 +00:00
File                             Last commit                                                                                  Date
__init__.py                      [Kernel][Backend][Model] Blocksparse flash attention kernel and Phi-3-Small model (#4799)   2024-05-24 22:00:52 -07:00
blocksparse_attention_kernel.py  [Kernel][Backend][Model] Blocksparse flash attention kernel and Phi-3-Small model (#4799)   2024-05-24 22:00:52 -07:00
interface.py                     [CI/Build] Avoid CUDA initialization (#8534)                                                 2024-09-18 10:38:11 +00:00
utils.py                         [Model][Phi3-Small] Remove scipy from blocksparse_attention (#6343)                          2024-07-12 10:47:17 +08:00
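
Note: the #4799 commit message indicates this package holds the block-sparse flash attention used by Phi-3-Small, which combines a local sliding window with vertical-stride key columns at block granularity. As a rough, self-contained sketch only (the function name and parameters below are hypothetical and are not vllm's actual interface), such a block mask can be expressed in plain PyTorch:

    import torch

    def blocksparse_block_mask(n_blocks: int, local_blocks: int, vert_stride: int) -> torch.Tensor:
        # Boolean mask over (query_block, key_block) pairs: a block is attended to
        # when it is causal (key block <= query block) and either falls inside the
        # local window or lies on a vertical-stride column.
        q = torch.arange(n_blocks).view(-1, 1)
        k = torch.arange(n_blocks).view(1, -1)
        causal = k <= q
        local = (q - k) < local_blocks
        vertical = (k + 1) % vert_stride == 0
        return causal & (local | vertical)

    # Example: 8 blocks, local window of 2 blocks, every 4th block column kept.
    print(blocksparse_block_mask(n_blocks=8, local_blocks=2, vert_stride=4).int())

The Triton kernel in blocksparse_attention_kernel.py presumably consumes a block-level pattern of this kind so that masked-out blocks are skipped entirely rather than computed and then zeroed.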