| a4211559cf | auto: pre-test commit | 2026-05-28 16:40:51 +00:00 |
| 3b8fdcc823 | auto: pre-test commit | 2026-05-28 16:39:45 +00:00 |
| 072fbf0b5d | auto: pre-test commit | 2026-05-28 16:36:53 +00:00 |
| 2a6d72912a | auto: pre-test commit | 2026-05-28 16:28:58 +00:00 |
| 01319d7247 | auto: pre-test commit | 2026-05-28 15:59:22 +00:00 |
| 43516ed4ec | auto: pre-test commit | 2026-05-28 15:55:59 +00:00 |
| 1ec3e1ed2c | auto: pre-test commit | 2026-05-28 15:55:18 +00:00 |
| babff1f402 | auto: pre-test commit | 2026-05-28 15:54:05 +00:00 |
| 2b007d2008 | auto: pre-test commit | 2026-05-28 15:53:39 +00:00 |
| 84b997881f | auto: pre-test commit | 2026-05-28 15:53:04 +00:00 |
| 6e5401df3b | auto: pre-test commit | 2026-05-28 15:51:55 +00:00 |
| 102174fade | auto: pre-test commit | 2026-05-28 15:50:52 +00:00 |
| 2dcfc0089f | auto: pre-test commit | 2026-05-28 15:49:47 +00:00 |
| 1cdb90462f | auto: pre-test commit | 2026-05-28 15:48:15 +00:00 |
| 80fd612132 | auto: pre-test commit | 2026-05-28 15:47:58 +00:00 |
| 9583cbc67a | auto: pre-test commit | 2026-05-28 15:46:53 +00:00 |
| 1b86860c19 | auto: pre-test commit | 2026-05-28 15:46:16 +00:00 |
| 6249989cf6 | Clean up HD=64 test, V layout verified correct | 2026-05-28 15:21:33 +00:00 |
| e1daad6955 | Verify V SMEM values vs GMEM for HD=64 | 2026-05-28 15:19:31 +00:00 |
| bafd26707b | FMHA HD=64 with BLOCK_MN_B=16, 4 N-tiles per K-tile | 2026-05-28 15:17:40 +00:00 |
| 6b9b06647a | Clean up HD=64 debug prints, keep register-math PV check | 2026-05-28 15:15:22 +00:00 |
| 5c9d471162 | Add register-math PV reference for HD=64 debug | 2026-05-28 15:13:47 +00:00 |
| 43e9efbc2b | Fix string literal | 2026-05-28 15:12:20 +00:00 |
| 906be7ce50 | Add filtered cosine (exclude near-zero) | 2026-05-28 15:11:14 +00:00 |
| 40c83c769a | Fix: remove ×2 QK scale correction (MMA scale is 1.0, not 0.5) | 2026-05-28 15:09:57 +00:00 |
| 6ea7356fdd | Debug: print P values for HD=64 | 2026-05-28 15:07:55 +00:00 |
| 4b052f22a5 | Fix: opt into >48KB shared memory for HD=64 | 2026-05-28 15:06:37 +00:00 |
| 7becbfc07e | Fix: printf after var declarations | 2026-05-28 15:03:25 +00:00 |
| 2d44f8e356 | Debug: check if HD=64 kernel starts | 2026-05-28 15:02:00 +00:00 |
| 46e4d07c71 | Test PV SS MMA with B=(64,16) BLOCK_MN=64 | 2026-05-28 14:58:10 +00:00 |
| 465e089a2b | Add launch error check for HD=64 | 2026-05-28 14:56:07 +00:00 |
| 2fd64c464d | FMHA HD=64 with BLOCK_MN_B=64 for V, proper output dimensions | 2026-05-28 14:54:10 +00:00 |
| 15ecc1f616 | Full FMHA HD=64 with PV SS MMA (SMEM-P) | 2026-05-28 14:52:29 +00:00 |
| 5b2e690936 | Milestone: Full FMHA HD=16 with PV SS MMA (SMEM-P) — cosine 0.9997 | 2026-05-28 14:50:43 +00:00 |
| 78026839b7 | Fix V canonical layout: swap g_mn/g_k indices (d=MN, lr=K) | 2026-05-28 14:49:17 +00:00 |
| 9a3b43c42b | Fix reference to also use uniform P | 2026-05-28 14:47:10 +00:00 |
| 75bdcbf728 | Debug: override P with uniform 1/128 | 2026-05-28 14:46:21 +00:00 |
| af93c283c7 | Enable all 8 PV K-tiles | 2026-05-28 14:45:13 +00:00 |
| 6f5be8a4e4 | Debug: print P values | 2026-05-28 14:44:09 +00:00 |
| 3d15f5bb21 | Debug: 1 PV K-tile | 2026-05-28 14:43:01 +00:00 |
| 284a06ddf1 | FMHA v5: clean rewrite with QK + softmax + PV SS per K-tile | 2026-05-28 14:42:13 +00:00 |
| 342193e0b4 | Fix tb scope | 2026-05-28 14:40:55 +00:00 |
| a6f7ef7c45 | Add softmax read from TMEM | 2026-05-28 14:40:35 +00:00 |
| 38b0ff0bf8 | Add QK GEMM to minimal PV test | 2026-05-28 14:39:51 +00:00 |
| e9f8f9e6e3 | Minimal PV with s_p_vals in SMEM | 2026-05-28 14:38:58 +00:00 |
| 97ebb964a2 | Move s_p_vals to dynamic SMEM | 2026-05-28 14:38:03 +00:00 |
| d2387dd858 | Full FMHA v4: per-K-tile P fill into reusable (128,16) buffer | 2026-05-28 14:37:11 +00:00 |
| 78b470317f | PV accumulation debug with detailed TMEM read | 2026-05-28 14:35:29 +00:00 |
| dacbf53081 | Test K-tiles 0-1 accumulated | 2026-05-28 14:33:31 +00:00 |
| bad31d9476 | Test K-tile 1 | 2026-05-28 14:32:51 +00:00 |