Files
nvfp4-megamoe-kernel/tests/unit
biondizzle 487d960a6a D5c multi-tile: VERIFIED cos 0.999996 with Python KV merge + sink bias
Both segments (compressed+SWA with n_comp=96, and SWA-only with n_comp=0)
pass individually at cos 0.999996. The Python KV merge produces the
correct combined attention at cos 0.999996.

Key: n_comp is compile-time, so separate kernel instances are needed
for segments with different n_comp values. Production code would use
a kernel cache keyed on (n_comp, apply_sink_bias, ...).
2026-05-26 15:40:45 +00:00
..
2026-05-23 03:25:29 +00:00
2026-05-23 03:20:46 +00:00
2026-05-24 22:23:08 +00:00
2026-05-24 22:04:51 +00:00
2026-05-24 03:48:37 +00:00
2026-05-23 23:58:57 +00:00