vllm/csrc/attention at 4cf256ae7f8b0be8f06f6b85821e55d4f5bdaa13

biondizzle/vllm

bnellnm 5467ac3196 [Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)

2024-06-09 16:23:30 -04:00

attention_dtypes.h

Enable scaled FP8 (e4m3fn) KV cache on ROCm (AMD GPU) (#3290)

2024-04-03 14:15:55 -07:00

attention_generic.cuh

[CI/Build] Enforce style for C++ and CUDA code with clang-format (#4722)

2024-05-22 07:18:41 +00:00

attention_kernels.cu

[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)

2024-06-09 16:23:30 -04:00

attention_utils.cuh

[CI/Build] Enforce style for C++ and CUDA code with clang-format (#4722)

2024-05-22 07:18:41 +00:00

dtype_bfloat16.cuh

[CI/Build] Enforce style for C++ and CUDA code with clang-format (#4722)

2024-05-22 07:18:41 +00:00

dtype_float16.cuh

[CI/Build] Enforce style for C++ and CUDA code with clang-format (#4722)

2024-05-22 07:18:41 +00:00

dtype_float32.cuh

[CI/Build] Enforce style for C++ and CUDA code with clang-format (#4722)

2024-05-22 07:18:41 +00:00

dtype_fp8.cuh

[CI/Build] Enforce style for C++ and CUDA code with clang-format (#4722)

2024-05-22 07:18:41 +00:00