biondizzle
Joined on 2025-12-10
Public Activity
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-17 07:10:20 +00:00
34c43958d0
vllm tweaks
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-17 06:24:17 +00:00
48e4cb625d
fix: default activation global_scale so runner works without finalize_weights
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-17 06:22:28 +00:00
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-17 06:06:15 +00:00
b497b35a10
fix: dynamic activation quantization (quantize_to_nvfp4) + per-expert scale assembly
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-17 05:55:47 +00:00
78bebff736
test: standalone CuTeDSL GEMM diagnostic
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-17 03:35:22 +00:00
d2965b432d
fix: set _l1_activation_global_scale (with underscore) — attribute name mismatch
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-16 22:49:32 +00:00
b382a7a528
fix: handle input_scale as 1D or 2D (EP splits change the shape)
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-16 22:23:33 +00:00
139c9c37cd
fix: read input_scale from nn.Parameter before it's freed
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-16 21:46:01 +00:00
152648789d
fix: use checkpoint input_scale for activation global scale (not hardcoded 1/2688)
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-16 21:41:02 +00:00
af087e655e
docs: update README — vLLM cudagraph inference running, output quality in progress
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-16 20:45:47 +00:00
0a5cfe0433
add kernel compile caching — compile once, invoke on subsequent calls
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-16 20:42:58 +00:00
3465b9d471
remove torch.cuda.synchronize() from run_nvfp4_grouped_gemm (cudagraph-safe)
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-16 20:40:19 +00:00
5e245bc0c6
fix: missing newline
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-16 20:39:40 +00:00
288e179f88
add quantize_activation_nvfp4 (cudagraph-safe, fixed global scale)
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-16 20:37:44 +00:00
521e11e468
test: old bridge + LUT quantization only (step 1 of cudagraph migration)
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-16 20:34:47 +00:00
f51be76e8f
temp: restore EXACT old bridge.py from b685112
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-16 20:28:18 +00:00
58dc36e21c
fix: compile fresh each call — cached compile produces wrong TMA descriptors
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-16 20:25:19 +00:00
98cc6ac1f3
fix: invert cache check logic (compile when NOT in cache)
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-16 20:24:05 +00:00
e337ec86a3
debug: test with cache enabled
biondizzle
pushed to
master
at
biondizzle/nvfp4-megamoe-kernel
2026-05-16 20:22:57 +00:00
bc56452be8
debug: disable kernel cache to test fresh compilation