vllm/csrc/cpu at d3d9cb6e4b8185b4e56e1dda92c6fc31cdc05de1

biondizzle/vllm

Lucas Wilkinson a8d604ca2a [Misc] Disambiguate quantized types via a new ScalarType (#6396)

2024-08-02 13:51:58 -07:00

activation.cpp

[Kernel][CPU] Add Quick gelu to CPU (#5717)

2024-06-21 06:39:40 +00:00

attention.cpp

[Kernel][Attention] Separate Attention.kv_scale into k_scale and v_scale (#6081)

2024-07-16 15:31:32 -07:00

cache.cpp

[Kernel][Attention] Separate Attention.kv_scale into k_scale and v_scale (#6081)

2024-07-16 15:31:32 -07:00

cpu_types_vsx.hpp

Support CPU inference with VSX PowerPC ISA (#5652)

2024-06-26 21:53:04 +00:00

cpu_types_x86.hpp

Support CPU inference with VSX PowerPC ISA (#5652)

2024-06-26 21:53:04 +00:00

cpu_types.hpp

Support CPU inference with VSX PowerPC ISA (#5652)

2024-06-26 21:53:04 +00:00

layernorm.cpp

[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)

2024-06-09 16:23:30 -04:00

pos_encoding.cpp

[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)

2024-06-09 16:23:30 -04:00

torch_bindings.cpp

[Misc] Disambiguate quantized types via a new ScalarType (#6396)

2024-08-02 13:51:58 -07:00

utils.cpp

[Hardware] [Intel] Enable Multiprocessing and tensor parallel in CPU backend and update documentation (#6125)

2024-07-26 13:50:10 -07:00