biondizzle
0 Followers · 0 Following
Joined on 2025-12-10
Repositories: 25
Public Activity
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-31 00:18:57 +00:00
b1dd59293a Add prefill: process prompt tokens to fill KV cache before decoding

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-31 00:15:00 +00:00
178fb5483a Fix KV cache: use index 0 (one-layer cache per layer instance)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-31 00:11:17 +00:00
afcc690ddc Add full MoE routing + KV cache to single_shot

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-31 00:02:31 +00:00
3ecfbcba57 Fix T scope in post_block

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-30 23:59:20 +00:00
a493f72681 Add per-residual RMSNorm in mHC post_block (routed MoE missing)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-30 23:55:36 +00:00
49282fe206 Fix mHC: match vLLM torch reference exactly

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-30 23:48:34 +00:00
66a66f8244 Add per-layer NaN tracking for mHC debug

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-30 23:45:20 +00:00
d003c4b7cc Add mHC (Manifold-Constrained Hyper-Connections) to single_shot

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-30 23:39:47 +00:00
f567c20539 Fix: set active CUDA device per layer for BMM/FMHA

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-30 23:36:16 +00:00
7a95983e0f Rewrite single_shot: 8-GPU pipeline parallel

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-30 22:59:30 +00:00
aac0fa1f08 Update STATUS.md + MEMORY.md: single-shot inference verified

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-30 22:58:51 +00:00
11c010e567 Update output section: kernel verified, architecture gaps noted

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-30 22:56:22 +00:00
53178d2536 Add emergency RMSNorm after residuals (missing mHC fallback)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-30 22:54:58 +00:00
172ba75e0c Add per-layer NaN check to track where values diverge

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-30 22:53:11 +00:00
ec7846e28c Add NaN tracking to single_shot_inference

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-30 22:51:12 +00:00
5fa6c88b17 Fix: replace FP4 Inf with 24 (avoid NaN in dequant)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-30 22:49:23 +00:00
904753f62a Fix: BMM batch dim alignment for wo_a

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-30 22:48:32 +00:00
52df3bc26c Fix: wo_a as batched matmul (grouped linear for output projection)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-30 22:46:14 +00:00
19240608d7 Fix: handle o_a_proj grouped linear shape mismatch

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel, 2026-05-30 22:45:15 +00:00
1d02758416 Fix: kv_proj outputs hd=512 (1 KV head MQA), Z from compressor.gate_proj