From e162a2d1122e0c485161c98ec642956e9fa30d10 Mon Sep 17 00:00:00 2001
From: biondizzle
Date: Sat, 30 May 2026 21:20:10 +0000
Subject: [PATCH] Update STATUS.md: E1-E4 done

---
 STATUS.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/STATUS.md b/STATUS.md
index bd5ee313..1ee45a40 100644
--- a/STATUS.md
+++ b/STATUS.md
@@ -23,10 +23,10 @@
 
 ## Stage E Checklist (from ROADMAP/NEXT_PRIORITIES_PART_2)
 
-- [ ] **E1:** Wire `LayerCacheHandle` → `gather_compressed_kv`, `gather_all_compressed_kv`, `gather_swa_kv`, `num_query_heads`, `head_dim`
-- [ ] **E2:** End-to-end smoke test through one full layer
-- [ ] **E3:** Top-level `model/dsv4.py` (currently 2-line TODO)
-- [ ] **E4:** Delete `torch.cuda.synchronize()` from fast path
+- [x] **E1:** Wire `LayerCacheHandle` → `gather_compressed_kv`, `gather_all_compressed_kv`, `gather_swa_kv`, `num_query_heads`, `head_dim` ✅
+- [x] **E2:** End-to-end smoke test through one full layer ✅ (SWA + CSA + HCA)
+- [x] **E3:** Top-level `model/dsv4.py` ✅
+- [x] **E4:** Delete `torch.cuda.synchronize()` from fast path ✅
 - [ ] **E5:** Fold batch loop into kernel grid
 - [ ] **E6:** FP4 output fusion for FMHA → wo_a
 - [ ] **E7:** Lightning indexer FP4 tensor-core scoring