biondizzle
  • Joined on 2025-12-10
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-19 01:01:43 +00:00
b81200f427 Fix CuTeDSL NVFP4 linear: correct scale assembly in custom op
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-19 00:54:32 +00:00
e0eb436914 Fix custom_op registration: use as decorator with proper type hints
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-19 00:50:45 +00:00
c609e9ba3c Use torch.library.custom_op for CuTeDSL NVFP4 linear GEMM
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-19 00:44:47 +00:00
c043a11bcc Register CuTeDSL as proper NvFp4LinearKernel for NVFP4 linear layers
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-19 00:33:44 +00:00
358830925a Fix unpack error: handle both tuple and tensor returns from NVFP4 forward()
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-19 00:29:45 +00:00
d9dc042ff7 Fix compressor kv_score: use forward() for NVFP4 quantized weights
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-19 00:24:28 +00:00
10c14ddb49 Fix NVFP4 mapper: layer norms, hc params, indexer path, q_a_norm
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-19 00:14:10 +00:00
540e7ee8fc Fix: layer.self_attn → layer.attn (model uses attn, not self_attn)
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-19 00:10:15 +00:00
201a40e6c4 Fix zero-dim tensor concatenation in compressor scale buffer
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-18 23:54:03 +00:00
d41a48aa1f Fix KeyError for missing stacked params (indexer.compressor)
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-18 23:49:44 +00:00
4b0d8263f6 Fix NameError: use print instead of logger (not imported)
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-18 23:41:41 +00:00
e3c24769e2 Handle wo_a as bfloat16 (unquantized in NVFP4 checkpoint)
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-18 23:31:03 +00:00
9d016aa1c0 Use print instead of logger for weight load debug
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-18 23:28:16 +00:00
a6f61bda5d Add debug logging for weight loading failures
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-18 23:24:15 +00:00
eef0ef76af Fix NVFP4 compressor scale loading: buffer and concatenate scale shards
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-18 23:20:17 +00:00
f74447bfd0 Proper NVFP4 integration: quantized compressor/indexer + mapper fixes
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-18 23:03:38 +00:00
17496b2615 Fix NVFP4 weights mapper: add prefix mappings, fix substr order
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-18 22:53:10 +00:00
b039123207 Fix NVFP4 mapper: add attention projection renames, remove norm_gate renames
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-18 22:49:41 +00:00
ea648a9bc2 Fix NVFP4 mapper: keep model. prefix (model params use it)
biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel 2026-05-18 22:46:06 +00:00
1528d4e182 Fix NVFP4 mapper: strip model. prefix from checkpoint keys