biondizzle
0 Followers · 0 Following
Joined on 2025-12-10
Repositories: 25
Public Activity
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-18 11:13:34 +00:00
8758bc93ca
crap shoot
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-18 02:20:17 +00:00
b8df4a8cc5
Fix NaN check: use os.environ gate instead of is_current_stream_capturing
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-17 23:37:13 +00:00
0c02d84514
Add NaN/Inf detection in DeepseekV4Model.forward layer loop
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-17 23:04:45 +00:00
bedcfc4dab
Pipeline test: use max_num_tokens=8192 matching vLLM
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-17 22:58:28 +00:00
c45364b3a8
Add MoE scale ratio output
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-17 22:56:57 +00:00
bf99ad49ec
Print both MoE and residual cosine
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-17 22:55:42 +00:00
8637020487
Fix multi-layer test: add residual connections
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-17 22:53:30 +00:00
11dce13afe
Add multi-layer pipeline test to check error accumulation
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-17 22:28:35 +00:00
87582fc9f7
HOTFIX: remove NaN checks from run() — torch.isnan().any() does CPU-GPU sync, breaks cudagraph
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-17 22:03:49 +00:00
8717e0e411
Fix warmup: use same padded GEMM path as run(), add swiglu_limit clamping
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-17 22:02:26 +00:00
d332f4f900
Add NaN debug checks after L1 and L2 GEMM
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-17 21:36:26 +00:00
e65f2b2ba2
Update CURRENT_BUG.md with Bug 26 fix
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-17 21:29:17 +00:00
72628fb689
Full pipeline test: runner vs BF16 reference
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-17 21:28:05 +00:00
2796bd81e8
Fix: scatter FP4 as uint8 (float4 doesn't support index_put)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-17 21:26:48 +00:00
364f8372bb
Fix FP4 buffer shapes: D//2 for packed dimensions
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-17 21:25:59 +00:00
5e4d674736
Test fix: quantize slot_hidden, scatter FP4, pass slot_x_sf
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-17 21:25:06 +00:00
803e7160d8
Fix: allocate FP4 buffers as uint8 then view-cast
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-17 21:24:45 +00:00
7256070dd3
FIX Bug 26: quantize slot tokens, not padded buffer
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-17 21:22:52 +00:00
4d0b6d889d
Set runner weights before _ensure_stacked
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-17 21:22:32 +00:00
b7acac5e4e
Call _ensure_stacked() before using runner buffers