vllm/csrc/cpu at 760e3ecc8fa0cee06eff55fe08f91f621d4e2221 - vllm

Files

Akash kaothalkar e515668edf [Hardware][Power] Enable compressed tensor W8A8 INT8 quantization for POWER (#17153)

Signed-off-by: Akash Kaothalkar <akash.kaothalkar@ibm.com>
Co-authored-by: Akash Kaothalkar <akash.kaothalkar@ibm.com>
Co-authored-by: mgoin <mgoin64@gmail.com>

2025-05-07 22:35:03 -07:00

activation.cpp

[Kernel][CPU] Add Quick gelu to CPU (#5717)

2024-06-21 06:39:40 +00:00

attention.cpp

Adding cpu inference with VXE ISA for s390x architecture (#12613)

2025-03-06 08:40:53 -08:00

cache.cpp

[Kernel][CPU] CPU MLA (#14744)

2025-03-25 09:34:59 +00:00

cpu_types_arm.hpp

[Bugfix] Explicitly include "omp.h" for MacOS to avoid installation failure (#14051)

2025-03-02 17:35:01 -08:00

cpu_types_vsx.hpp

[Hardware][Power] Enable compressed tensor W8A8 INT8 quantization for POWER (#17153)

2025-05-07 22:35:03 -07:00

cpu_types_vxe.hpp

Adding cpu inference with VXE ISA for s390x architecture (#12613)

2025-03-06 08:40:53 -08:00

cpu_types_x86.hpp

[CPU][Bugfix] Using custom allreduce for CPU backend (#15934)

2025-04-02 07:46:47 -07:00

cpu_types.hpp

Adding cpu inference with VXE ISA for s390x architecture (#12613)

2025-03-06 08:40:53 -08:00

dnnl_helper.hpp

[Hardware][CPU] Update torch 2.5 (#9911)

2024-11-07 04:43:08 +00:00

layernorm.cpp

[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)

2024-06-09 16:23:30 -04:00

mla_decode.cpp

[Kernel][CPU] CPU MLA (#14744)

2025-03-25 09:34:59 +00:00

pos_encoding.cpp

Make key optional for rotary embedding (#17566)

2025-05-07 00:11:46 -07:00

quant.cpp

[Hardware][Power] Enable compressed tensor W8A8 INT8 quantization for POWER (#17153)

2025-05-07 22:35:03 -07:00

shm.cpp

[CPU][Bugfix] Using custom allreduce for CPU backend (#15934)

2025-04-02 07:46:47 -07:00

torch_bindings.cpp

[Hardware][Power] Enable compressed tensor W8A8 INT8 quantization for POWER (#17153)

2025-05-07 22:35:03 -07:00

utils.cpp

[Bugfix] fix gettid method is not define (#16084)

2025-04-08 19:12:44 -07:00