From c1f72cee8b00bce64f078d4b6f392717e31a2fa5 Mon Sep 17 00:00:00 2001
From: biondizzle
Date: Sun, 15 Feb 2026 15:30:24 -0500
Subject: [PATCH] refresh todo

---
 TODO.md | 281 +++++++++++++++++++--------------------------------------
 1 file changed, 94 insertions(+), 187 deletions(-)

diff --git a/TODO.md b/TODO.md
index bdd9cf1..25d42a4 100644
--- a/TODO.md
+++ b/TODO.md
@@ -1,209 +1,116 @@
-# JormunDB Implementation TODO
+# JormunDB (Odin rewrite) — TODO

-This tracks the rewrite from Zig to Odin and remaining features.
+This tracks the rewrite from Zig (ZynamoDB) → Odin (JormunDB), and what’s left to stabilize + extend.

-## ✅ Completed
+## Status Snapshot

-- [x] Project structure
-- [x] Makefile with build/run/test targets
-- [x] README with usage instructions
-- [x] ARCHITECTURE documentation
-- [x] RocksDB FFI bindings (rocksdb/rocksdb.odin)
-- [x] Core types (dynamodb/types.odin)
-- [x] Key codec with varint encoding (key_codec/key_codec.odin)
-- [x] Main entry point with arena pattern demo
-- [x] .gitignore
-- [x] HTTP Server Scaffolding
-- [x] JSON Parser
-- [x] Item_codec
-- [x] Storage
+### ✅ Ported / Working (core)
+- [x] Project layout + Makefile targets (build/run/test/fmt)
+- [x] RocksDB bindings / integration
+- [x] Core DynamoDB types (AttributeValue / Item / Key / TableDescription, etc.)
+- [x] Binary key codec (varint length-prefixed segments)
+- [x] Binary item codec (TLV encoding / decoding)
+- [x] Storage engine: tables + CRUD + scan/query plumbing
+- [x] Table-level RW locks (read ops shared / write ops exclusive)
+- [x] HTTP server + request routing via `X-Amz-Target`
+- [x] DynamoDB JSON (parse + serialize)
+- [x] Expression parsing for Query key conditions (basic support)

-## 🚧 In Progress (Need to Complete)
+---

-### Core Modules
+## Now (MVP correctness + polish)
+Goal: “aws cli works reliably for CreateTable/ListTables/PutItem/GetItem/DeleteItem/Scan/Query” with correct DynamoDB-ish responses.

-- [x] **dynamodb/json.odin** - DynamoDB JSON parsing and serialization
-  - Parse `{"S": "value"}` format
-  - Serialize AttributeValue to DynamoDB JSON
-  - Parse request bodies (PutItem, GetItem, etc.)
+### 1) HTTP + routing hardening
+- [ ] Audit request parsing boundaries:
+  - Max body size enforcement
+  - Missing/invalid headers → correct DynamoDB error types
+  - Content-Type handling (be permissive but consistent)
+- [ ] Ensure **all request-scoped allocations** come from the request arena (no accidental long-lived allocs)
+- [ ] Standardize error responses:
+  - `__type` formatting
+  - `message` field consistency
+  - status code mapping per error type

-- [x] **item_codec/item_codec.odin** - Binary TLV encoding for items
-  - Encode Item to binary TLV format
-  - Decode binary TLV back to Item
-  - Type tag handling for all DynamoDB types
+### 2) Storage correctness edge cases
+- [ ] Table metadata durability + validation:
+  - reject duplicate tables
+  - reject invalid key schema (no HASH, multiple HASH, etc.)
+- [ ] Item validation against key schema:
+  - missing PK/SK errors
+  - type mismatch errors (S/N/B)
+- [ ] Deterministic encoding tests:
+  - key codec round-trip
+  - TLV item encode/decode round-trip (nested maps/lists/sets)

-- [x] **dynamodb/storage.odin** - Storage engine with RocksDB
-  - Table metadata management
-  - create_table, delete_table, describe_table, list_tables
-  - put_item, get_item, delete_item
-  - scan, query with pagination
-  - Table-level RW locks
+### 3) Query/Scan pagination parity
+- [ ] Make pagination behavior match Zig version + AWS CLI expectations:
+  - `Limit`
+  - `ExclusiveStartKey`
+  - `LastEvaluatedKey` generation (and correct key-type reconstruction)
+- [ ] Add “golden” pagination tests:
+  - query w/ sort key ranges
+  - scan limit + resume loop

-### HTTP Server
+### 4) Expression parsing reliability
+- [ ] Remove brittle string-scanning for `KeyConditionExpression` extraction:
+  - Parse expression fields via JSON object lookup (handles whitespace/ordering safely)
+- [ ] Add validation + better errors for malformed expressions
+- [ ] Expand operator coverage as needed (BETWEEN/begins_with already planned)

-- [x] **HTTP server implementation (MOSTLY DONE CONSOLIDATED HANDLER INTO MAIN AND HTTO FILES. NO NEED FOR A STAND ALONE HANDLER LIKE WE DID IN ZIG! JUST PLEASE GO OVER WHAT IS THERE!!!)**
-  - Accept TCP connections
-  - Parse HTTP POST requests
-  - Read JSON bodies
-  - Send HTTP responses with headers
-  - Keep-alive support
-  - Route X-Amz-Target functions (this was the handler in zig but no need for that crap in odin land)
-  - handle_create_table, handle_put_item, etc. (this was the handler in zig but no need for that crap in odin land)
-  - Build responses with proper error handling (this was the handler in zig but no need for that crap in odin land)
-  - Arena allocator integration
-  - Options (Why we haven't checked this off yet, we need to make sure we chose the right option as the project grows, might make more sense to impliment different option):
-    - Use `core:net` directly
-    - Use C FFI with libmicrohttpd
-    - Use Odin's vendor:microui (if suitable)
+---

-### Expression Parsers (Priority 3)
+## Next (feature parity with Zig + API completeness)
+### 5) UpdateItem / conditional logic groundwork
+- [ ] Implement `UpdateItem` (initially minimal: SET for scalar attrs)
+- [ ] Add `ConditionExpression` support for Put/Delete/Update (start with simple comparisons)
+- [ ] Define internal “update plan” representation (parsed ops → applied mutations)

-- [ ] **KeyConditionExpression parser**
-  - Tokenizer for expressions
-  - Parse `pk = :pk AND sk > :sk`
-  - Support begins_with, BETWEEN
-  - ExpressionAttributeNames/Values
+### 6) Response completeness / options
+- [ ] `ReturnValues` handling where relevant (NONE/ALL_OLD/UPDATED_NEW etc. — even partial support is useful)
+- [ ] `ProjectionExpression` (return subset of attributes)
+- [ ] `FilterExpression` (post-query filter for Scan/Query)

-- [ ] **UpdateExpression parser** (later)
-  - SET operations
-  - REMOVE operations
-  - ADD operations
-  - DELETE operations
+### 7) Test coverage / tooling
+- [ ] Add integration tests mirroring AWS CLI script flows:
+  - create table → put → get → scan → query → delete
+- [ ] Add fuzz-ish tests for:
+  - JSON parsing robustness
+  - expression parsing robustness
+  - TLV decode failure cases (corrupt bytes)

-### Credential Support (Priority 4)
+---

-  - [ ] **Support a way to configure AWS compatible credentials.**
-    - This is very important because remember when mongo didn't come with a root password by default and everyone who had the port open to the world got their DB ransomed? Yeah, we don't want that to happen
+## Later (big features)
+These align with the “Future Enhancements” list in ARCHITECTURE.md.

-### Replication Support (Priority 5)
+### 8) Secondary indexes
+- [ ] Global Secondary Indexes (GSI)
+- [ ] Local Secondary Indexes (LSI)
+- [ ] Index backfill + write-path maintenance

-  - [ ] **Build C++ Shim in order to use RocksDB's WAL replication helpers**
-  - [ ] **Add configurator to set instance as a master or slave node and point to proper Target and Destination IPs**
-  - [ ] **Leverage C++ helpers from shim**
+### 9) Batch + transactions
+- [ ] BatchWriteItem
+- [ ] BatchGetItem
+- [ ] Transactions (TransactWriteItems / TransactGetItems)

-### Subscribe To Changes Feature (Priority LAST [But keep in mind because semantics we decide now will make this easier later])
+### 10) Performance / ops
+- [ ] Connection reuse / keep-alive tuning
+- [ ] Bloom filters / RocksDB options tuning for common patterns
+- [ ] Optional compression policy (LZ4/Zstd knobs)
+- [ ] Parallel scan (segment scanning)

-  - [ ] **Best-effort notifications (Postgres-ish LISTEN/NOTIFY [in-memory pub/sub fanout. If you’re not connected, you miss it.])**
-    - Add an in-process “event bus” channels: table-wide, partition-key, item-key, “all”.
-    - When putItem/deleteItem/updateItem/createTable/... commits successfully publish {op, table, key, timestamp, item?}
+---

-  - [ ] **Durable change streams (Mongo-ish [append every mutation to a persistent log and let consumers read it with resume tokens.])**
-    - Create a “changelog” keyspace
-    - Generate a monotonically increasing sequence by using a stable Per-partition sequence cursor
-    - Expose via an API (I prefer publishing to MQTT or SSE)
+## Replication / WAL
+(There is a C++ shim stubbed out for WAL iteration and applying write batches.)
+- [ ] Implement WAL iterator: `latest_sequence`, `wal_iter_next` returning writebatch blob
+- [ ] Implement apply-writebatch on follower
+- [ ] Add a minimal replication test harness (leader generates N ops → follower applies → compare)

-## 📋 Testing
+---

-- [ ] Unit tests for key_codec
-- [ ] Unit tests for item_codec
-- [ ] Unit tests for JSON parsing
-- [ ] Integration tests with AWS CLI
-- [ ] Benchmark suite
-
-## 🔧 Build & Tooling
-
-- [ ] Verify Makefile works on macOS
-- [ ] Verify Makefile works on Linux
-- [ ] Add Docker support (optional)
-- [ ] Add install script
-
-## 📚 Documentation
-
-- [ ] Code comments for public APIs
-- [ ] Usage examples in README
-- [ ] API compatibility matrix
-- [ ] Performance tuning guide
-
-## 🎯 Priority Order
-
-1. **HTTP Server** - Need this to accept requests
-2. **JSON Parsing** - Need this to understand DynamoDB format
-3. **Storage Engine** - Core CRUD operations
-4. **Handlers** - Wire everything together
-5. **Item Codec** - Efficient binary storage
-6. **Expression Parsers** - Query functionality
-
-## 📝 Notes
-
-### Zig → Odin Translation Patterns
-
-**Memory Management:**
-```zig
-// Zig
-const item = try allocator.create(Item);
-defer allocator.destroy(item);
-```
-```odin
-// Odin
-item := new(Item)
-// No defer needed if using arena
-```
-
-**Error Handling:**
-```zig
-// Zig
-fn foo() !Result {
-    return error.Failed;
-}
-const x = try foo();
-```
-```odin
-// Odin
-foo :: proc() -> (Result, bool) {
-    return {}, false
-}
-x := foo() or_return
-```
-
-**Slices:**
-```zig
-// Zig
-const slice: []const u8 = data;
-```
-```odin
-// Odin
-slice: []byte = data
-```
-
-**Maps:**
-```zig
-// Zig
-var map = std.StringHashMap(Value).init(allocator);
-defer map.deinit();
-```
-```odin
-// Odin
-map := make(map[string]Value)
-defer delete(map)
-```
-
-### Key Decisions
-
-1. **Use `Maybe(T)` instead of `?T`** - Odin's optional type
-2. **Use `or_return` instead of `try`** - Odin's error propagation
-3. **Use `context.allocator`** - Implicit allocator from context
-4. **Use `#partial switch`** - For union type checking
-5. **Use `transmute`** - For zero-cost type conversions
-
-### Reference Zig Files
-
-When implementing, reference these Zig files:
-- `src/dynamodb/json.zig` - 400 lines, DynamoDB JSON format
-- `src/dynamodb/storage.zig` - 460 lines, storage engine
-- `src/dynamodb/handler.zig` - 500+ lines, request handlers
-- `src/item_codec.zig` - 350 lines, TLV encoding
-- `src/http.zig` - 250 lines, HTTP server
-
-### Quick Test Commands
-
-```bash
-# Build and test
-make build
-make test
-
-# Run server
-make run
-
-# Test with AWS CLI
-aws dynamodb list-tables --endpoint-url http://localhost:8002
-```
+## Housekeeping
+- [ ] Fix TODO hygiene: keep this file short and “actionable”
+- [ ] Add a CONTRIBUTING quick checklist (allocator rules, formatting, tests)
+- [ ] Add “known limitations” section in README (unsupported DynamoDB features)