Files
nvfp4-megamoe-kernel/dsv4/kernels/attention
biondizzle c55030a340 P5: clean kernel with runtime branch (single-tile unchanged, multi-tile separate path)
Single-tile path is IDENTICAL to the working pre-P5 kernel.
Multi-tile path uses FA2 online softmax with sOacc accumulator.
Runtime branch on is_multi_tile = (n_kv_tiles > 1).
2026-05-30 08:57:00 +00:00
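The runtime branch described in the commit message can be illustrated with a small pure-Python sketch for one query row. It is not the kernel code. It assumes "FA2 online softmax" means the FlashAttention-2 style running max / running sum rescaling, and that `sOacc` is an output accumulator that is rescaled per KV tile. The function names, the `tile` parameter, `o_acc` (standing in for `sOacc`), and the 1/sqrt(d) scale are all hypothetical.

```python
import math


def _scores(q, keys, scale):
    # Scaled dot products of one query row against a list of key rows.
    return [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]


def attend_single_tile(q, keys, values, scale):
    # Plain softmax over all keys at once: no running state, no rescaling.
    s = _scores(q, keys, scale)
    m = max(s)
    p = [math.exp(x - m) for x in s]
    l = sum(p)
    d = len(values[0])
    return [sum(p[j] * values[j][c] for j in range(len(p))) / l for c in range(d)]


def attend_multi_tile(q, keys, values, scale, tile):
    # Online softmax: visit KV tiles in order, keeping a running max (m),
    # a running denominator (l), and an unnormalized output accumulator
    # (o_acc, a stand-in for what the commit calls sOacc; an assumption).
    d = len(values[0])
    m = float("-inf")
    l = 0.0
    o_acc = [0.0] * d
    for start in range(0, len(keys), tile):
        s = _scores(q, keys[start:start + tile], scale)
        v_tile = values[start:start + tile]
        m_new = max(m, max(s))
        alpha = math.exp(m - m_new)  # 0.0 on the first tile, since m is -inf
        p = [math.exp(x - m_new) for x in s]
        l = alpha * l + sum(p)
        o_acc = [
            alpha * o_acc[c] + sum(p[j] * v_tile[j][c] for j in range(len(p)))
            for c in range(d)
        ]
        m = m_new
    return [x / l for x in o_acc]


def attend(q, keys, values, tile):
    # Runtime branch mirroring is_multi_tile = (n_kv_tiles > 1).
    scale = 1.0 / math.sqrt(len(q))  # assumed scaling, not taken from the commit
    n_kv_tiles = math.ceil(len(keys) / tile)
    is_multi_tile = n_kv_tiles > 1
    if is_multi_tile:
        return attend_multi_tile(q, keys, values, scale, tile)
    return attend_single_tile(q, keys, values, scale)
```

Both paths should agree up to floating-point rounding. Keeping the single-tile path free of the rescaling state is what would let it stay unchanged relative to an earlier, known-good version.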