biondizzle
Joined on 2025-12-10
Public Activity
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-06-01 03:54:31 +00:00
a53936a17c diag: print l1_out shape warning in shared expert
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-06-01 03:50:56 +00:00
db30c4acd6 auto: pre-test push for test_se_gpu.py
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-06-01 03:43:11 +00:00
3dd95ce77b fix: set activation global scales AFTER _ensure_stacked/_ensure_initialized (which override them)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-06-01 03:31:37 +00:00
27c63b01d6 diag: remove broken SE reference comparison, add gsa/gsb print
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-06-01 03:25:31 +00:00
9a27ed21fd diag: compare shared expert output with PyTorch reference
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-06-01 03:16:27 +00:00
ee8318ad58 diag: handle NaN in shared expert output print
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-06-01 03:09:14 +00:00
7000762309 diag: fix SE weight attribute name
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-06-01 03:03:07 +00:00
fba1c06cad diag: check SE weight integrity
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-06-01 02:56:31 +00:00
22d7cc9b7a diag: cuda sync check after shared expert for first 3 layers
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-06-01 02:49:58 +00:00
b85fcf4d6f diag: print SE global scales for first 3 layers
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-06-01 02:41:14 +00:00
48d93a6d2e diag: MoE input/output diagnostics for first 3 layers
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-06-01 02:34:29 +00:00
856a459a98 fix: init l1_gsa_list and l2_gsa_list
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-06-01 02:31:14 +00:00
66b98e5794 fix: MoE and shared expert global scale — gsb=ws2, gsa=input_scale (same bug as Nvfp4Linear)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-06-01 02:19:36 +00:00
f4b444b456 fix: NVFP4 global scale bug — gsb=weight_scale_2 (not input_scale*ws2), gsa=input_scale
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-06-01 02:12:41 +00:00
1eed28dd09 diag: compare production FMHA and NVFP4 linear output with PyTorch reference
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-06-01 02:02:16 +00:00
df394f8b40 fix: missing closing quote on string literal
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-06-01 01:58:46 +00:00
cfd2468c61 fix: decode loop also needs int32 token_ids for hash router
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-06-01 01:49:41 +00:00
905623793b fix: move token_ids to same GPU as router (was cuda:0 but router on cuda:N)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-06-01 01:41:05 +00:00
7804b779ce diag: print wo_a g_flat magnitude to find where zeros come from
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-06-01 01:34:04 +00:00
efe63caea9 diag: print FMHA output magnitude for first 3 layers