biondizzle
Joined on 2025-12-10
Repositories
25
Public Activity
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 11:20:54 +00:00)
88719f39b4 Add single-layer trace (Phase 2.6) for detailed debugging

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 11:10:39 +00:00)
8256e23aed Fix mHCContext attribute access (not tuple unpacking) and enable attention diag

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 11:07:24 +00:00)
72c139a59f Enable MHC_DIAG for diagnostic run

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 11:07:19 +00:00)
cd661c2e40 Add attention and Q/KV diagnostics (MHC_DIAG flag)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 10:54:42 +00:00)
9584fcbc23 Fix top5_ids variable name in decode logging

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 10:49:31 +00:00)
a6d56d10ca Add top-20 logging and thinking token detection in decode loop

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 10:33:44 +00:00)
d891ae7e96 Fix prompt format: use DeepSeek V4 chat tokens

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 10:28:26 +00:00)
f86742ef8e Cache layer weights on GPU — eliminates per-token CPU→GPU transfer

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 10:07:16 +00:00)
ce3d6069cc CRITICAL FIX: mHC base/scale ordering matches fn ordering [pre, res, post]

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 10:02:59 +00:00)
9a43e9aa77 CRITICAL FIX: mHC fn weight row ordering was wrong

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 09:56:20 +00:00)
0346e479d4 Add system prompt, CLI args, inverse RoPE flag, minimal e2e test

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 09:23:12 +00:00)
429fc3db40 Fix expert weight indexing for 1D tensor

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 09:22:28 +00:00)
33004dcbf4 Fix expert weight broadcasting (wt.item() for scalar multiply)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 09:21:52 +00:00)
1434b35971 Add residual diagnostic test — per-layer magnitude tracking

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 09:17:38 +00:00)
1c18c16c68 Fix production rope.py: FP32 arithmetic for forward_rope_partial + inverse_rope_bf16

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 09:17:10 +00:00)
970869d017 Fix mHCBlock import + relax RoPE round-trip threshold (BF16 noise expected)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 09:16:00 +00:00)
a2ee78b564 Fix RoPE shape bug (interleave needs separate even/odd assembly)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 09:15:01 +00:00)
9d96c2fbbf CRITICAL FIX: FP32 RoPE cache + FP32 arithmetic for inverse RoPE round-trip

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 09:14:05 +00:00)
db74a887ab Add minimal e2e test + fix MoE expert loop bug (indentation)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-05-31 07:02:40 +00:00)
e195d9d3a7 add SKIP_ROUTED_MOE debug flag, re-enable sinks