biondizzle
a4ef6c3454
Add B1 mixed FP8 prefill FMHA kernel (T>1 support)
New files:
- fmha_mixed_fp8_prefill.cuh: kernel supporting T=1..128
  - Sub-batch processing (T_BATCH=32) to fit in 232KB SMEM
  - Multi-row QK TMEM read using tcgen05.ld.32x32b.x8
  - Per-row online softmax
  - Per-row PV MMA (correctness first; batched PV is TODO)
  - Attention sink support
- fmha_mixed_fp8_prefill_capi.cu: C API bridge
- fmha_mixed_fp8_prefill_op.py: Python ctypes loader
- test_b1_mixed_fp8_prefill.py: unit test (T=1..32, N=128..4096)
Also: fix production FMHA layer test (BF16 fallback for o_a_proj,
router gate BF16 quantize path, missing DEVICE constant)
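
Reference sketch of the per-row online softmax with sub-batching and an attention sink, for readers checking the kernel against a host-side model. This is not the kernel's code. It is a minimal pure-Python illustration under stated assumptions: the function name, the kv_tile parameter, and the sink semantics (one extra logit that joins the softmax denominator but contributes no value vector) are hypothetical, and FP8 quantization and the usual 1/sqrt(d) scaling are left out.

```python
import math

def online_softmax_attention(q_rows, keys, values, sink_logit=None,
                             kv_tile=4, t_batch=32):
    """Per-row online-softmax attention (hypothetical reference helper).

    q_rows: list of query vectors (T rows), keys/values: lists of N vectors.
    Rows are processed in sub-batches of t_batch; each row keeps its own
    running max and running sum while streaming over KV tiles.
    """
    out = []
    for t0 in range(0, len(q_rows), t_batch):        # sub-batch of query rows
        for q in q_rows[t0:t0 + t_batch]:
            m = -math.inf                             # running row max
            l = 0.0                                   # running sum of exp
            acc = [0.0] * len(values[0])              # unnormalized output row
            for k0 in range(0, len(keys), kv_tile):   # stream over KV tiles
                k_tile = keys[k0:k0 + kv_tile]
                v_tile = values[k0:k0 + kv_tile]
                scores = [sum(a * b for a, b in zip(q, k)) for k in k_tile]
                m_new = max(m, max(scores))
                scale = math.exp(m - m_new)           # rescale earlier partials
                l *= scale
                acc = [a * scale for a in acc]
                for s, v in zip(scores, v_tile):
                    p = math.exp(s - m_new)
                    l += p
                    acc = [a + p * x for a, x in zip(acc, v)]
                m = m_new
            if sink_logit is not None:
                # Assumed sink semantics: extra mass in the denominator only.
                m_new = max(m, sink_logit)
                scale = math.exp(m - m_new)
                l = l * scale + math.exp(sink_logit - m_new)
                acc = [a * scale for a in acc]
                m = m_new
            out.append([a / l for a in acc])
    return out
```

Changing kv_tile or t_batch should leave the result unchanged up to floating-point rounding, which gives a quick sanity check when comparing against a naive softmax.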
2026-06-03 02:50:27 +00:00