biondizzle
0 Followers
·
0 Following
Joined on 2025-12-10
Repositories
25
Public Activity
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-17 08:22:53 +00:00
7073daaffa
fix: allocate token_indices on CPU, move to GPU AFTER JIT compilation
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-17 08:22:14 +00:00
0e7b06b55c
debug: clone + sync token indices before JIT
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-17 08:20:42 +00:00
70c0618361
fix: allocate token_indices before CuTeDSL JIT compilation
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-17 08:19:47 +00:00
2bbe04efd8
debug: remove assert, test token corruption
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-17 08:18:38 +00:00
66627926c5
debug: int32 token indices with sync verify
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-17 08:16:11 +00:00
da02a5dc11
debug: assert token indices are correct after allocation
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-17 08:11:03 +00:00
c0d016a472
feat: compute_activation_global_scales warmup method
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-17 08:07:12 +00:00
8c9a51e006
fix: call _ensure_stacked in warmup test
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-17 08:06:34 +00:00
5ba77e355f
test: warmup gs computation with safety margin sweep
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-17 07:59:02 +00:00
ae6b879d38
fix: pass expert_offsets without leading 0 to GEMM (matches pipeline)
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-17 07:57:02 +00:00
a1e6f5f891
fix: searchsorted right=True for correct expert assignment
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-17 07:53:59 +00:00
ddffb7d8df
docs: current bug analysis — scale_a layout vs expert_offsets mismatch
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-17 07:47:16 +00:00
ed90341ea9
fix: scatter+per-expert-swizzle scale assembly (cudagraph-safe)
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-17 07:43:11 +00:00
37fecb588f
fix: separate L1/L2 scale buffers (different K_sf), fix assembly calls
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-17 07:39:51 +00:00
b824b838a9
fix: 128-row-align each expert's scales in padded buffer
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-17 07:37:48 +00:00
8dadd9a723
test: scale assembly debug
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-17 07:37:05 +00:00
8642946274
fix: padded x_sf buffer for fixed-shape scale assembly
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-17 07:35:50 +00:00
418e29f7f5
fix: per-expert scale assembly (match assemble_scales_2d_side)
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-17 07:33:21 +00:00
7b95e76723
test: runner vs pipeline comparison + scale assembly comparison
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-17 07:15:01 +00:00
366a0240a5
vllm tweaks