biondizzle

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-19 01:01:43 +00:00

b81200f427 Fix CuTeDSL NVFP4 linear: correct scale assembly in custom op

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-19 00:54:32 +00:00

e0eb436914 Fix custom_op registration: use as decorator with proper type hints

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-19 00:50:45 +00:00

c609e9ba3c Use torch.library.custom_op for CuTeDSL NVFP4 linear GEMM

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-19 00:44:47 +00:00

c043a11bcc Register CuTeDSL as proper NvFp4LinearKernel for NVFP4 linear layers

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-19 00:33:44 +00:00

358830925a Fix unpack error: handle both tuple and tensor returns from NVFP4 forward()
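
The commit summary above does not show the patch itself. As a rough sketch of the general pattern it names, a caller can normalize the return value before using it. The helper name `first_output` is hypothetical and not taken from the repository, and plain Python values stand in for tensors here. In real PyTorch code, writing `out, aux = result` against a bare tensor would iterate over its first dimension, which is one plausible source of an unpack error.

```python
def first_output(result):
    """Return the primary output whether `result` is a bare value or a tuple/list.

    Hypothetical helper: some forward() implementations return `output`,
    others return `(output, extra)`. Checking the type avoids unpacking
    a value that was never a tuple.
    """
    if isinstance(result, (tuple, list)):
        return result[0]
    return result


# Stand-in values; in practice `result` would come from a layer's forward().
print(first_output(("out", None)))  # tuple-style return
print(first_output("out"))          # bare return
```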

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-19 00:29:45 +00:00

d9dc042ff7 Fix compressor kv_score: use forward() for NVFP4 quantized weights

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-19 00:24:28 +00:00

10c14ddb49 Fix NVFP4 mapper: layer norms, hc params, indexer path, q_a_norm

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-19 00:14:10 +00:00

540e7ee8fc Fix: layer.self_attn → layer.attn (model uses attn, not self_attn)

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-19 00:10:15 +00:00

201a40e6c4 Fix zero-dim tensor concatenation in compressor scale buffer

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-18 23:54:03 +00:00

d41a48aa1f Fix KeyError for missing stacked params (indexer.compressor)
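
The feed does not say how the missing parameters were handled. One common way to avoid a `KeyError` during weight loading, shown here only as an assumption, is to check membership before indexing and record the names that were skipped. `load_weights`, `params`, and `checkpoint` are hypothetical names, and plain dictionaries stand in for the model's parameter table.

```python
def load_weights(params, checkpoint):
    """Copy checkpoint entries into `params`, skipping names the model lacks.

    Hypothetical sketch: returns (loaded, skipped) so that missing names can
    be reported instead of raising KeyError on `params[name]`.
    """
    loaded, skipped = [], []
    for name, value in checkpoint.items():
        if name not in params:
            skipped.append(name)  # e.g. a stacked param the model never registered
            continue
        params[name] = value
        loaded.append(name)
    return loaded, skipped


params = {"attn.wq.weight": None}
checkpoint = {"attn.wq.weight": 1.0, "indexer.compressor.weight": 2.0}
loaded, skipped = load_weights(params, checkpoint)
print(loaded, skipped)
```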

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-18 23:49:44 +00:00

4b0d8263f6 Fix NameError: use print instead of logger (not imported)

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-18 23:41:41 +00:00

e3c24769e2 Handle wo_a as bfloat16 (unquantized in NVFP4 checkpoint)

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-18 23:31:03 +00:00

9d016aa1c0 Use print instead of logger for weight load debug

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-18 23:28:16 +00:00

a6f61bda5d Add debug logging for weight loading failures

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-18 23:24:15 +00:00

eef0ef76af Fix NVFP4 compressor scale loading: buffer and concatenate scale shards

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-18 23:20:17 +00:00

f74447bfd0 Proper NVFP4 integration: quantized compressor/indexer + mapper fixes

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-18 23:03:38 +00:00

17496b2615 Fix NVFP4 weights mapper: add prefix mappings, fix substr order
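
The summary above mentions substring order without giving details. A frequent reason order matters in a key mapper is that a short pattern can match inside a longer one and claim the key first. The sketch below illustrates that general issue under assumed names: the rename table and `rename_key` are hypothetical and do not reflect the repository's actual mappings.

```python
# Hypothetical rename table; the shorter pattern is a substring of the longer one.
SUBSTR_RENAMES = [
    ("compressor", "comp"),
    ("indexer.compressor", "idx_comp"),
]


def rename_key(key, renames=SUBSTR_RENAMES):
    """Apply the first matching rename, trying longer patterns before shorter ones."""
    ordered = sorted(renames, key=lambda pair: len(pair[0]), reverse=True)
    for old, new in ordered:
        if old in key:
            return key.replace(old, new)
    return key  # no pattern matched; keep the checkpoint key unchanged


print(rename_key("layers.0.indexer.compressor.weight"))
print(rename_key("layers.0.compressor.weight"))
```

Sorting by pattern length is only one way to make the order explicit; listing the table by hand in the intended order works equally well.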

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-18 22:53:10 +00:00

b039123207 Fix NVFP4 mapper: add attention projection renames, remove norm_gate renames

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-18 22:49:41 +00:00

ea648a9bc2 Fix NVFP4 mapper: keep model. prefix (model params use it)

biondizzle pushed to proper-nvfp4-integration at biondizzle/nvfp4-megamoe-kernel

2026-05-18 22:46:06 +00:00

1528d4e182 Fix NVFP4 mapper: strip model. prefix from checkpoint keys