nvfp4-megamoe-kernel/tests
biondizzle 73e03cfa6d Stage B: PV(128,64) test + v2 pipeline fixes
- test_pv64.py: (128,64) PV with separate V SMEM and a single ab pipeline
  Result: cosine 0.669848. The data path works, but the P layout is mismatched:
  softmax writes P in the QK C-fragment layout, while PV reads it in the PV A-fragment layout.
  The two layouts differ for non-(128,128) PV (Bug 1 in the README).

- test_fmha_v2_fixed.py: KV-tile interleaved pipeline with three fixes
  Fix 1: per-pipeline tx_count (separate byte counts for Q and KV)
  Fix 2: NamedBarrier for the softmax-done signal (replaces the double-acquire deadlock)
  Fix 3: separate SMEM for V (no recast_ptr overlap with K)
  Output is still all zeros; it needs the P layout fix (same root cause as test_pv64)
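
The layout mismatch described above can be illustrated with a small NumPy sketch. This is a toy analogy, not the kernel's actual fragment layouts: the tile sizes, the random inputs, and the row-major versus column-major choice are all assumptions made for illustration, and `cosine_similarity` is a hypothetical helper, not a function from the test files. It shows how a consumer that reads every value of P but at the wrong coordinates still yields a finite, partially correlated output rather than garbage or zeros, which fits a "data path works but layout is wrong" diagnosis.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two arrays, flattened to vectors."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def softmax(s):
    """Numerically stable row-wise softmax."""
    e = np.exp(s - s.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
M, N, D = 128, 128, 64            # hypothetical tile sizes for this toy
S = rng.standard_normal((M, N))   # stand-in for the QK^T scores
V = rng.standard_normal((N, D))

P = softmax(S)
O_ref = P @ V                     # reference output

# Producer writes P into a flat buffer using one layout (row-major here).
buf = P.ravel(order="C")

# A consumer that assumes the same layout recovers P exactly.
P_match = buf.reshape((M, N), order="C")

# A consumer that assumes a different layout (column-major here) reads
# every value, but at the wrong (row, column) coordinates.
P_mismatch = buf.reshape((M, N), order="F")

cos_match = cosine_similarity(P_match @ V, O_ref)
cos_mismatch = cosine_similarity(P_mismatch @ V, O_ref)
print(f"matching layout:   {cos_match:.6f}")
print(f"mismatched layout: {cos_mismatch:.6f}")
```

In this toy the mismatched read amounts to a transpose of P, so the second cosine falls well below 1 while the first stays at 1 up to rounding. The real QK C-fragment and PV A-fragment layouts are per-thread register mappings, so the permutation in the kernel will differ from a plain transpose.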
2026-05-21 11:49:06 +00:00