biondizzle
0 Followers · 0 Following
Joined on 2025-12-10
Public Activity
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-18 19:03:58 +00:00
3c1a76bdcc Fix Dockerfile: use external patch script instead of inline Python

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-18 18:35:36 +00:00
75844a8361 Post-quant fix via Dockerfile patch to process_weights_after_loading

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-18 18:15:38 +00:00
a4ad5898c1 Fix post-quant hook: register on inner model, fix module refs

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-18 17:56:21 +00:00
a51edd238e Add post-quant-init forward hook to fix attention NVFP4

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-18 16:43:46 +00:00
2835cb040b Fix input_scale BEFORE process_weights_after_loading runs

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-18 16:23:42 +00:00
2fc81ccac4 Revert to BF16 dequant for attention NVFP4 (input_scale fix was too early)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-18 16:00:01 +00:00
4a57399592 Add debug prints for input_global_scale_inv check

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-18 15:43:48 +00:00
f86892e26b Replace BF16 dequant with input_scale warmup fix for attention NVFP4

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-18 15:22:54 +00:00
301015b037 Remove all inline diagnostics — incompatible with torch.compile

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-18 15:05:54 +00:00
a83d364d45 Switch to cudagraph_mode=NONE (not enforce-eager) for real inference testing

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-18 14:45:44 +00:00
2a2a42c6d6 Add attention-internal diagnostics: MLA output, FP8 quant output

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-18 14:24:15 +00:00
5c1dda10f6 Add granular attention diagnostics: pre/post attn, embed, dequant stats

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-18 14:04:13 +00:00
e0e0528778 Add debug logging for BF16 dequant to find missing attrs

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-18 13:47:09 +00:00
2e8c3c961f Fix: dequantize fused_wqa_wkv instead of separate wq_a/wkv

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-18 13:22:18 +00:00
a7216b27df Fix: keep wo_a as FP8 (fp8_einsum path), dequant others to BF16

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-18 13:09:38 +00:00
334e95047e Fix: dequantize ALL attention NVFP4 projections to BF16

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-18 12:54:17 +00:00
a83c332059 Fix docker-compose: remove orphaned compilation-config arg, enforce-eager mode

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-18 12:51:51 +00:00
9e7639fba4 Add layer-by-layer diagnostic prints (CLAWMINE_DEBUG=1, enforce-eager)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-18 12:17:28 +00:00
2d1e9f42b1 Remove NaN check — incompatible with Dynamo fullgraph compilation

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel 2026-05-18 11:33:31 +00:00
65763a200c Fix NaN check: wrap in @torch.compiler.disable to prevent Dynamo graph break