biondizzle/vllm
a9b53dd435bb82f311f340ebebc15e62b9624a9d
vllm/csrc/quantization
Roberto L. Castro
fcb9df99bd
[Perf][Kernel] Optimize FP4 quantization kernels (SM100F) (#32520)
Signed-off-by: LopezCastroRoberto <rocastro@redhat.com>
2026-01-24 18:45:27 -07:00
..
awq
…
cutlass_w4a8
[Kernel]Support W4A8 Grouped GEMM on Hopper (#29691)
2025-12-08 19:29:06 -08:00
fp4
[Perf][Kernel] Optimize FP4 quantization kernels (SM100F) (#32520)
2026-01-24 18:45:27 -07:00
fused_kernels
[Performance] Fused blockwise quant RMS norm (#27883)
2025-12-07 16:38:04 +00:00
gguf
…
gptq
[ROCm][GPTQ][Bugfix] Fix GPTQ GEMM kernel output zeroing race condition (#30719)
2025-12-29 01:13:14 -08:00
gptq_allspark
[Refactor] Rename `gptq_marlin` to `marlin` to match MoE (#32952)
2026-01-23 16:48:12 -05:00
hadamard/hadacore
…
machete
Fix typos in comments across multiple files (#30345)
2025-12-09 20:05:28 -08:00
marlin
[Refactor] Rename `gptq_marlin` to `marlin` to match MoE (#32952)
2026-01-23 16:48:12 -05:00
w8a8
[Refactor] Remove unused cutlass moe problem size function (#32047)
2026-01-18 12:46:59 -08:00
activation_kernels.cu
…
utils.cuh
…
vectorization_utils.cuh
…
vectorization.cuh
…