vllm/vllm/attention at d3cf61b89bc53aa7709932ab43e7630b9a71f2b3 - vllm

Files

Lucas Wilkinson d8bccde686 [BugFix] Fix vllm_flash_attn install issues (#17267)

Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Aaron Pham <contact@aarnphm.xyz>

2025-04-27 17:27:56 -07:00

backends

[BugFix] Fix vllm_flash_attn install issues (#17267)

2025-04-27 17:27:56 -07:00

ops

[Kernel][Triton][FP8] Adding fp8 and variable length sequence support to Triton FAv2 kernel (#12591)

2025-04-27 00:35:08 +00:00

utils

[BugFix] Fix vllm_flash_attn install issues (#17267)

2025-04-27 17:27:56 -07:00

__init__.py

[Attention] Flash Attention 3 - fp8 (#14570)

2025-03-20 01:14:20 -04:00

layer.py

[Quantization][FP8] Add support for FP8 models with input_scale for output projection and QK quantization (#15734)

2025-04-25 00:45:02 -07:00

selector.py

Correct capitalisation: VLLM -> vLLM (#14562)

2025-03-10 16:36:21 +00:00