Vadim Gimpelson | 82e64c7a20 | [PERF] [Qwen3-next] Speed up gated RMSNorm (#26207); Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>; Signed-off-by: Vadim Gimpelson <156319763+vadiklyutiy@users.noreply.github.com>; Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> | 2025-10-12 08:27:50 +00:00
Harry Mellor | b893d661b1 | Fix per file ruff ignores related to simplification (#26259); Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> | 2025-10-05 20:31:53 +00:00
Harry Mellor | d6953beb91 | Convert formatting to use ruff instead of yapf + isort (#26247); Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> | 2025-10-05 07:06:22 -07:00
Corey Lowman | d747c2ef18 | [Perf] Fix jit compiles at runtime of fla gated delta rule (#25432); Co-authored-by: Michael Goin <mgoin64@gmail.com> | 2025-09-24 11:16:13 +08:00
co63oc | e26fef8397 | fix some typos (#24616); Signed-off-by: co63oc <co63oc@users.noreply.github.com> | 2025-09-11 10:48:46 -07:00
Tao He | e93f4cc9e3 | Add the support for the qwen3 next model (a hybrid attention model). (#24526); Signed-off-by: Tao He <linzhu.ht@alibaba-inc.com>; Co-authored-by: Jee Jee Li <pandaleefree@gmail.com> | 2025-09-11 15:32:09 +08:00
youkaichao | 1aa427fdc1 | [Kernels] Add Flash Linear Attention Kernels (#24518); Signed-off-by: youkaichao <youkaichao@gmail.com> | 2025-09-10 00:04:41 +08:00