|
|
9264023e3b
|
Update STAGE_D.md with D5b results: merge cos 0.961, LSE err=0.0
|
2026-05-23 21:45:22 +00:00 |
|
|
|
2883e042ca
|
D5b MILESTONE: SWA+sink merge works! cos 0.969
- Run FMHA twice (compressed KV + SWA KV) with normalized O + LSE
- Merge with sink weights in Python
- LSE err=0.0, merge cos=0.969 PASS
- Update STAGE_D.md: D5b done, D5c/D5d are optimizations
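
Note on the merge step above: a minimal sketch of how two normalized attention outputs and their LSE values can be combined with a sink term. It assumes the "sink weight" is a single extra logit per query that adds to the softmax denominator and carries no value vector; that reading is an assumption, not something stated in the commit. The function names (`attend`, `merge_with_sink`, `reference`) are hypothetical, the 1/sqrt(d) scale is omitted, and this is plain Python rather than the actual FMHA kernel path.

```python
import math

def attend(q, keys, values):
    """Softmax attention for one query over one KV set.

    Returns the normalized output vector and the log-sum-exp of the scores,
    which is what each of the two attention passes is assumed to emit.
    """
    scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    denom = sum(exps)
    dim = len(values[0])
    out = [sum(e * v[d] for e, v in zip(exps, values)) / denom for d in range(dim)]
    return out, m + math.log(denom)

def merge_with_sink(o_a, lse_a, o_b, lse_b, sink_logit):
    """Merge two normalized partial outputs using their LSEs plus a sink logit.

    Assumption: the sink only enlarges the denominator (it has no value row).
    """
    m = max(lse_a, lse_b, sink_logit)       # subtract the max for stability
    w_a = math.exp(lse_a - m)               # softmax mass of KV set A
    w_b = math.exp(lse_b - m)               # softmax mass of KV set B
    w_sink = math.exp(sink_logit - m)       # mass absorbed by the sink
    total = w_a + w_b + w_sink
    return [(w_a * a + w_b * b) / total for a, b in zip(o_a, o_b)]

def reference(q, keys, values, sink_logit):
    """Single-pass attention over all keys with the sink in the denominator."""
    scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
    m = max(scores + [sink_logit])
    exps = [math.exp(s - m) for s in scores]
    denom = sum(exps) + math.exp(sink_logit - m)
    dim = len(values[0])
    return [sum(e * v[d] for e, v in zip(exps, values)) / denom for d in range(dim)]
```

In exact arithmetic the merged result equals the single-pass reference, so a cosine below 1.0 in a real run would come from reduced-precision kernels rather than from the merge formula itself.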
|
2026-05-23 21:36:26 +00:00 |
|
|
|
60e03fe84a
|
Update STAGE_D.md: D5a done, CG-2/CG-3 status updated, tOrP0 offset rule added
|
2026-05-23 21:16:52 +00:00 |
|
|
|
5b6a4fbef9
|
Update STAGE_D.md: manual SMEM addressing blocked on layout mapping
|
2026-05-23 19:22:28 +00:00 |
|
|
|
56bed1066d
|
auto: pre-test commit
|
2026-05-23 19:20:42 +00:00 |
|
|
|
944fa9b155
|
auto: pre-test commit
|
2026-05-23 19:14:02 +00:00 |
|
|
|
01fd6d03db
|
Update STAGE_D.md with current action plan - starting NVFP4-0 verification and D1.3 validation on B200
|
2026-05-23 19:09:56 +00:00 |
|
|
|
5756b6e4ec
|
📋 Update STAGE_D.md: D1.3 ✅ SOLVED, D1.4 ✅ IMPLEMENTED, D1.5 🟡 complex refactor, checklist updated
|
2026-05-23 18:37:53 +00:00 |
|
|
|
593584fc8d
|
🎉 Mark D1.3 as SOLVED! SMEM-P rank mismatch fixed, enables hd>64 support
|
2026-05-23 18:26:15 +00:00 |
|
|
|
fe3b1abf22
|
Update STAGE_D.md checklist with current progress and lessons learned
|
2026-05-23 09:27:48 +00:00 |
|
|
|
d9780c0a0c
|
docs: add NVFP4 precision roadmap to STAGE_D.md (3 honest buckets + speculative bucket)
|
2026-05-23 07:39:09 +00:00 |
|
|
|
74d0822214
|
shit carmine left dangling
|
2026-05-23 06:55:22 +00:00 |
|
|
|
3b167a4362
|
D1.2: TMEM budget verified on B200. Split-PV mandatory at hd=512 (MMA max N=256)
|
2026-05-23 06:43:01 +00:00 |
|
|
|
11c7e2c663
|
STAGE_D.md: restructure with correctness gaps, TMEM budget, execution order
|
2026-05-23 06:31:37 +00:00 |
|
|
|
e98f5e4f9e
|
Add STAGE_D.md: step-by-step runbook and todo list for D1-D5
|
2026-05-23 05:52:03 +00:00 |
|