biondizzle
0 Followers · 0 Following
Joined on 2025-12-10
Repositories: 25
Public Activity
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 15:04:49 +00:00)
9d57b0453b auto: pre-test commit

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 15:04:04 +00:00)
1a6d9ee29b Reset to greedy decoding (temperature=0)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 15:03:58 +00:00)
038fe81c68 Fix MoE non-fused L2 runtime gsa + update test harness for extra args

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 14:55:43 +00:00)
a48d6e14ae Default temperature=0.7 with rep penalty

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 14:54:50 +00:00)
1d64b863ca Add temperature sampling + repetition penalty to fix degenerate repetition

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 14:43:51 +00:00)
6cca16f97a Set max-tokens=128 default, clean up for final verification

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 14:33:58 +00:00)
a0e758ec3b Set default max-tokens=30 for faster iteration

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 14:21:18 +00:00)
2b1fca6dae CRITICAL FIX: runtime activation global scale to prevent E4M3 overflow

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 14:15:29 +00:00)
3b2714410f Add NVFP4 linear accuracy test: prod vs ref with all-ones input

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 14:11:49 +00:00)
3e47d5f20a Add prod vs ref GEMM comparison test + gate logits diagnostic

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 13:55:56 +00:00)
ad143afe37 Add L58-60 diagnostic: mHC A/B/C, MoE routed/shared, topk

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 11:25:53 +00:00)
7a05d3d3af NVFP4 router gate: use Nvfp4Linear for both checkpoint and quantized paths

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 11:17:55 +00:00)
e5dbe1ed22 Switch router to Nvfp4Linear production GEMM (custom CuTeDSL kernel crashes MLIR)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 11:14:06 +00:00)
a4324781c3 Fix: properly remove sqrt(softplus) from CuTeDSL kernel

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 11:12:43 +00:00)
6efe90cd85 Move sqrt(softplus) out of CuTeDSL kernel into Python

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 11:08:08 +00:00)
fbc1e883f2 Add try/except around fused NVFP4 gate loading with error reporting

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 11:05:10 +00:00)
5f38430423 Fix: use 1-dim tensors for gate_ws2 and gate_input_scale

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 11:03:11 +00:00)
ec8f292112 Fix: use self.mma_tiler_mnk (full K=64) for SMEM layout computation

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 10:55:46 +00:00)
44fb9b6c00 Fix: pass self.mma_tiler_mnk (full K) to _compute_stages, not self.mma_tiler (K=1 placeholder)

biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 10:49:07 +00:00)
be2bb2fe84 Fix: self.mma_tiler_mnk not mma_tiler_mnk