vllm/csrc/attention at 56e19d7ee20635f87e04089bfaa2f54d52db65e9 - vllm

Files

Carl Y 3bc2734dd0 [Kernel] Fuse FP8 output quantization into merge_attn_states (#36518)

Signed-off-by: Carl You <4531192+carlyou@users.noreply.github.com>

2026-04-03 01:47:04 +00:00

2025-10-16 06:36:09 -07:00

attention_dtypes.h         …
attention_generic.cuh      2024-05-22 07:18:41 +00:00
attention_kernels.cuh      2025-10-08 10:20:48 -04:00
attention_utils.cuh        2024-08-21 16:47:36 -07:00
dtype_bfloat16.cuh         2024-08-05 16:00:01 -04:00
dtype_float16.cuh          …
dtype_float32.cuh          …
dtype_fp8.cuh              …
merge_attn_states.cu       2026-04-03 01:47:04 +00:00
paged_attention_v1.cu      2025-07-24 00:37:19 -07:00
paged_attention_v2.cu      2025-07-24 00:37:19 -07:00
vertical_slash_index.cu    2025-05-12 19:52:47 -07:00