Update STATUS.md: E1-E4 done

2026-05-30 21:20:10 +00:00
parent c4b40dd06c
commit e162a2d112

@@ -23,10 +23,10 @@
## Stage E Checklist (from ROADMAP/NEXT_PRIORITIES_PART_2)
- [ ] **E1:** Wire `LayerCacheHandle`: `gather_compressed_kv`, `gather_all_compressed_kv`, `gather_swa_kv`, `num_query_heads`, `head_dim`
- [ ] **E2:** End-to-end smoke test through one full layer
- [ ] **E3:** Top-level `model/dsv4.py` (currently 2-line TODO)
- [ ] **E4:** Delete `torch.cuda.synchronize()` from fast path
- [x] **E1:** Wire `LayerCacheHandle`: `gather_compressed_kv`, `gather_all_compressed_kv`, `gather_swa_kv`, `num_query_heads`, `head_dim`
- [x] **E2:** End-to-end smoke test through one full layer ✅ (SWA + CSA + HCA)
- [x] **E3:** Top-level `model/dsv4.py`
- [x] **E4:** Delete `torch.cuda.synchronize()` from fast path
- [ ] **E5:** Fold batch loop into kernel grid
- [ ] **E6:** FP4 output fusion for FMHA → wo_a
- [ ] **E7:** Lightning indexer FP4 tensor-core scoring