biondizzle
0 Followers · 0 Following
Joined on 2025-12-10
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 10:42:36 +00:00)
c082843ecc Fix: mma_tiler K=1 placeholder in __init__, refined in _setup_attributes
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 10:37:18 +00:00)
e0f60b9f05 Fix fused router: plain ints for mma_tiler + @cute.jit pattern
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 10:28:02 +00:00)
057ae2101e CRITICAL FIX: Move tiled_mma creation and _setup_attributes OUTSIDE @cute.jit
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 10:14:31 +00:00)
71deeb91a9 Quantize BF16 gate weight to NVFP4 for fused router + add global scales to GEMM
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 10:02:51 +00:00)
24fed15ed6 Fix: convert PyTorch tensors to CuTe tensors for fused router kernel
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 09:59:36 +00:00)
bab748763e Rewrite NVFP4 fused router kernel: MoE-style epilogue replaces broken SMEM merge
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 09:47:51 +00:00)
31ebe4f2db Wire NVFP4 fused router kernel into e2e single-shot pipeline
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 09:42:19 +00:00)
d9d3ca42b0 Fix: mma_tiler and cluster_layout must use MLIR values for cute.slice_
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 09:38:10 +00:00)
ec79f30709 Fix: PersistentTileSchedulerParams cluster_shape must be Python ints not MLIR values
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 09:35:07 +00:00)
28d0cb4f41 Revert cutlass.Int32 wrapping — now inside @cute.jit, cute.round_up works
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 09:32:08 +00:00)
b536f99192 CRITICAL FIX: move ALL CuTe DSL setup inside @cute.jit context
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 09:30:16 +00:00)
65669596d4 Fix: all CuTe shape values must be cutlass.Int32 for MLIR compatibility
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 09:22:25 +00:00)
df48dacc2b Fix: set mma_inst_shape_mn in __init__ before _create_tiled_mma call
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 09:21:06 +00:00)
28f78420c2 Fix: quantize_activation_nvfp4 API - correct signature and return values
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 09:19:49 +00:00)
7b3f6cb13c Fix fused router: use run_nvfp4_fused_router wrapper, correct CuTe tensor API
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 09:17:05 +00:00)
483e759d53 Fix: use tensor.mark_layout_dynamic() method (not cute.mark_layout_dynamic)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 09:15:25 +00:00)
2412745b21 Test fix: slice NVFP4 logits to actual expert count (GEMM padding)
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 09:13:55 +00:00)
f33ca41c2a Fused router: replace nested if/else top-k with flat find-min-replace approach
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 09:11:32 +00:00)
4f4ae8febd Test: enumerate CuTeDSL math API to check available operations
biondizzle pushed to master at biondizzle/nvfp4-megamoe-kernel (2026-06-01 08:56:22 +00:00)
9b86b2b414 Test: fix fused router test - proper NVFP4 quantization and CuTe tensor setup