biondizzle
d3b772196d
E3: Implement DSV4Model — full model class
- Token embedding → N×TransformerLayer → RMSNorm → lm_head
- decode_step: single token decode with mHC state management
- forward: prefill path (T tokens)
- Cache handle acquisition per layer
- mHC state initialization from embedding
- Weight loading TODO (deferred to loader/)
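
The data flow listed above can be sketched in a few lines of plain Python. This is a toy illustration, not the code from this commit: `ToyDSV4Model`, `ToyLayer`, `n_streams`, and the per-layer list used as a "cache handle" are all hypothetical names. It also assumes the mHC state is a small stack of residual streams seeded by copying the token embedding, which the commit message does not spell out. Weights are random because weight loading is deferred to loader/.

```python
import math
import random


def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: divide x by its root-mean-square, then apply a per-channel gain."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [w * v / rms for v, w in zip(x, weight)]


def matvec(mat, vec):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in mat]


def mean_streams(streams):
    """Collapse the stack of residual streams into a single vector."""
    return [sum(col) / len(streams) for col in zip(*streams)]


class ToyLayer:
    """Hypothetical stand-in for a TransformerLayer (one linear map, no attention)."""

    def __init__(self, dim, rng):
        self.w = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(dim)]

    def step(self, streams, cache):
        x = mean_streams(streams)   # read the layer input from the mHC state
        cache.append(x)             # stand-in for writing this token's KV entry
        y = matvec(self.w, x)
        # write the layer output back into every residual stream
        return [[s + d for s, d in zip(stream, y)] for stream in streams]


class ToyDSV4Model:
    """Token embedding -> N layers -> RMSNorm -> lm_head, with random weights."""

    def __init__(self, vocab, dim, n_layers, n_streams=2, seed=0):
        rng = random.Random(seed)
        self.embed = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(vocab)]
        self.layers = [ToyLayer(dim, rng) for _ in range(n_layers)]
        self.norm_weight = [1.0] * dim
        self.lm_head = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(vocab)]
        self.n_streams = n_streams
        self.caches = [[] for _ in range(n_layers)]  # one cache handle per layer

    def decode_step(self, token_id):
        """Decode a single token and return logits over the vocabulary."""
        h = self.embed[token_id]
        # assumed mHC state initialization: replicate the embedding per stream
        streams = [list(h) for _ in range(self.n_streams)]
        for layer, cache in zip(self.layers, self.caches):
            streams = layer.step(streams, cache)
        return matvec(self.lm_head, rms_norm(mean_streams(streams), self.norm_weight))

    def forward(self, token_ids):
        """Prefill over T tokens; a real prefill would batch them instead of looping."""
        return [self.decode_step(t) for t in token_ids]
```

Calling `ToyDSV4Model(vocab=8, dim=4, n_layers=3).forward([0, 2, 5])` returns three logit vectors of length 8 and leaves three entries in each per-layer cache. The per-token loop in `forward` keeps the sketch short; it is not meant to suggest how the actual prefill path is implemented.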
2026-05-30 21:15:57 +00:00