Xin Yang
|
0ada960a20
|
[Kernel] Support bias type in grouped_topk kernel (#31781)
Signed-off-by: Xin Yang <xyangx@amazon.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
|
2026-01-07 12:16:32 -08:00 |
|
Wentao Ye
|
f21f5ea38c
|
[Refactor] Small refactor for group topk (#30562)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
|
2025-12-16 14:50:59 -05:00 |
|
Wentao Ye
|
61249b177d
|
[Refactor] Remove useless syncwarp (#30510)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-12-11 17:43:41 -05:00 |
|
Wentao Ye
|
0ee6416f67
|
[Perf] Optimize group_topk kernel, 1.9% Throughput improvement, 2.1% TPOT improvement (#30159)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-12-08 19:44:01 -05:00 |
|
Michael Goin
|
0852527647
|
[Perf][DeepSeek] Add sigmoid+bias fusion to fused_grouped_topk from TRTLLM (#28124)
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
|
2025-11-07 18:20:55 -08:00 |
|
Ming Yang
|
527821d191
|
Use macro guard CUDA functions for back compatibility in grouped_topk_kernel.cu (#25346)
Signed-off-by: Ming Yang <minos.future@gmail.com>
Signed-off-by: Rahul Tuli <rtuli@redhat.com>
Co-authored-by: Rahul Tuli <rtuli@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Co-authored-by: Lu Fang <30275821+houseroad@users.noreply.github.com>
Co-authored-by: Ye (Charlotte) Qi <yeq@meta.com>
|
2025-09-23 09:45:39 -07:00 |
|
Lumina
|
81b16a2bc9
|
[Kernel] Better inf handling for grouped topk cu (#24886)
Signed-off-by: lumina37 <starry.qvq@gmail.com>
|
2025-09-18 05:53:55 +00:00 |
|
Qiming Zhang
|
e919d6f549
|
[Kernel][Bugfix] Fix grouped topk cu (#24146)
Signed-off-by: mayuyuace <qiming1.zhang@intel.com>
|
2025-09-04 12:37:37 +08:00 |
|
Xin Yang
|
8a3cd90af5
|
[Kernel] Add fused grouped_topk kernel for MoE (#23274)
Signed-off-by: Xin Yang <xyangx@amazon.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
|
2025-08-25 11:47:52 -07:00 |
|