biondizzle/vllm
vllm/csrc/moe @ 11cec296dd47bbb25c3bb4e40bcc11341d6a2fe2
Xin Yang 0ada960a20 [Kernel] Support bias type in grouped_topk kernel (#31781)
Signed-off-by: Xin Yang <xyangx@amazon.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
2026-01-07 12:16:32 -08:00
marlin_moe_wna16 | [Quantization][MoE] remove unused ep logic from moe marlin (#31571) | 2026-01-06 09:07:19 -08:00
permute_unpermute_kernels | …
dynamic_4bit_int_moe_cpu.cpp | [CPU]Parallelize over tokens in int4 moe (#29600) | 2025-12-02 06:21:39 +00:00
grouped_topk_kernels.cu | [Kernel] Support bias type in grouped_topk kernel (#31781) | 2026-01-07 12:16:32 -08:00
moe_align_sum_kernels.cu | Lora MoE Align Improvements (#29257) | 2025-12-09 10:35:16 +08:00
moe_ops.h | Lora MoE Align Improvements (#29257) | 2025-12-09 10:35:16 +08:00
moe_permute_unpermute_op.cu | [Kernel] CUTLASS MoE FP8: Integrate cuda moe permute/unpermute (#23045) | 2025-08-20 10:35:26 -04:00
moe_wna16_utils.h | …
moe_wna16.cu | …
topk_softmax_kernels.cu | [Kernel][Performance] Fuse float cast and renormalize to topk softmax kernel (#26717) | 2025-10-17 07:30:35 +00:00
torch_bindings.cpp | [Quantization][MoE] remove unused ep logic from moe marlin (#31571) | 2026-01-06 09:07:19 -08:00