2026-02-15 15:30:24 -05:00
# JormunDB (Odin rewrite) — TODO
This tracks what's left to stabilize and extend the project.
## Now (MVP correctness + polish)
Goal: "aws cli works reliably for CreateTable/ListTables/PutItem/GetItem/DeleteItem/Scan/Query" with correct DynamoDB-ish responses.
### 1) HTTP + routing hardening
- [ ] Audit request parsing boundaries:
  - Max body size enforcement (config exists, need to verify enforcement path)
  - Missing/invalid headers → correct DynamoDB error types
  - Content-Type handling (be permissive but consistent)
- [x] Ensure **all request-scoped allocations** come from the request arena (no accidental long-lived allocs)
  - Verified: `handle_connection` in http.odin sets `context.allocator = request_alloc`
  - Long-lived data (table metadata, locks) explicitly uses `engine.allocator`
- [x] Standardize error responses:
  - `__type` formatting: done, uses `com.amazonaws.dynamodb.v20120810#ErrorType`
  - `message` field consistency: done
  - Status code mapping per error type: **DONE**, centralized `handle_storage_error` + `make_error_response` now maps InternalServerError→500, everything else→400
  - Missing X-Amz-Target now returns `SerializationException` (matches real DynamoDB)
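
The error-response rules listed above can be sketched in a few lines. This is an illustrative Python sketch, not the project's Odin code: the function signature and return shape are assumptions, and only the `__type` prefix and the 500/400 mapping come from the checklist.

```python
import json

# Prefix quoted in the checklist above.
ERROR_TYPE_PREFIX = "com.amazonaws.dynamodb.v20120810#"


def make_error_response(error_type, message):
    """Return (status_code, body_bytes) for a DynamoDB-style error.

    Hypothetical signature; the real `make_error_response` proc may differ.
    """
    # InternalServerError maps to 500, every other error type to 400.
    status = 500 if error_type == "InternalServerError" else 400
    body = json.dumps({
        "__type": ERROR_TYPE_PREFIX + error_type,
        "message": message,
    }).encode("utf-8")
    return status, body


status, body = make_error_response("SerializationException", "Missing X-Amz-Target")
print(status, body.decode("utf-8"))
```

Keeping the mapping in one place means a new error type only needs a name; the status code falls out of the single rule.
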
### 2) Storage correctness edge cases
- [x] Table metadata durability + validation:
  - [x] Reject duplicate tables: done in `create_table` (checks existing meta key)
  - [x] Reject invalid key schema: done in `parse_key_schema` (no HASH, multiple HASH, etc.)
- [x] Item validation against key schema:
  - [x] Missing PK/SK errors: done in `key_from_item`
  - [x] Type mismatch errors (S/N/B): **DONE**, new `validate_item_key_types` proc checks item key attr types against AttributeDefinitions
- [ ] Deterministic encoding tests:
  - [ ] Key codec round-trip
  - [ ] TLV item encode/decode round-trip (nested maps/lists/sets)
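
To make the open "deterministic encoding" item concrete, here is a rough Python sketch of what a TLV round-trip test could check. JormunDB's actual TLV layout is not described in this file, so the tag numbers, the 5-byte header, and every function name below are assumptions made purely for illustration.

```python
import struct

# Hypothetical tag numbers for a toy TLV layout; the real tags may differ.
TAGS = {"S": 1, "N": 2, "B": 3, "L": 4, "M": 5, "SS": 6}
NAMES = {tag: name for name, tag in TAGS.items()}
HEADER = struct.Struct(">BI")  # 1-byte tag + 4-byte big-endian payload length


def frame(kind, payload):
    return HEADER.pack(TAGS[kind], len(payload)) + payload


def encode(attr):
    """Encode one attribute value such as {"S": "abc"} or {"M": {...}}."""
    (kind, value), = attr.items()
    if kind in ("S", "N"):
        return frame(kind, value.encode("utf-8"))
    if kind == "B":
        return frame(kind, bytes(value))
    if kind == "L":
        return frame(kind, b"".join(encode(v) for v in value))
    if kind == "SS":
        # Sort members so equal sets always produce identical bytes.
        return frame(kind, b"".join(frame("S", s.encode("utf-8")) for s in sorted(value)))
    if kind == "M":
        # Sort keys for the same reason; each entry is a key frame, then a value frame.
        parts = [frame("S", k.encode("utf-8")) + encode(v) for k, v in sorted(value.items())]
        return frame(kind, b"".join(parts))
    raise ValueError(f"unsupported attribute type: {kind}")


def read_frame(buf, pos):
    """Return (kind, payload, next_pos), or raise ValueError on corrupt input."""
    if pos + HEADER.size > len(buf):
        raise ValueError("truncated header")
    tag, length = HEADER.unpack_from(buf, pos)
    end = pos + HEADER.size + length
    if tag not in NAMES or end > len(buf):
        raise ValueError("corrupt frame")
    return NAMES[tag], buf[pos + HEADER.size:end], end


def decode_payload(kind, payload):
    if kind in ("S", "N"):
        return {kind: payload.decode("utf-8")}
    if kind == "B":
        return {"B": payload}
    frames, pos = [], 0
    while pos < len(payload):
        child_kind, child, pos = read_frame(payload, pos)
        frames.append((child_kind, child))
    if kind == "L":
        return {"L": [decode_payload(k, p) for k, p in frames]}
    if kind == "SS":
        return {"SS": [p.decode("utf-8") for _, p in frames]}
    if len(frames) % 2:
        raise ValueError("map entry without a value")
    return {"M": {frames[i][1].decode("utf-8"): decode_payload(*frames[i + 1])
                  for i in range(0, len(frames), 2)}}


def decode(buf):
    kind, payload, end = read_frame(buf, 0)
    if end != len(buf):
        raise ValueError("trailing bytes")
    return decode_payload(kind, payload)
```

A test built on this shape would assert three things: `decode(encode(item)) == item` for nested maps/lists/sets, byte-identical output for maps that differ only in key order, and a clean error (not a crash) when the blob is truncated or otherwise corrupt.
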
### 3) Query/Scan pagination parity
- [x] Make pagination behavior match AWS CLI expectations:
  - [x] `Limit`: done
  - [x] `ExclusiveStartKey`: done (parsed via JSON object lookup with key schema type reconstruction)
  - [x] `LastEvaluatedKey` generation: **FIXED**, now saves key of *last returned item* (not next unread item); only emits when more results exist
- [ ] Add "golden" pagination tests:
  - [ ] Query w/ sort key ranges
  - [ ] Scan limit + resume loop
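
The `LastEvaluatedKey` rule and the "scan limit + resume loop" test can be illustrated with a small in-memory model. This is a Python sketch under stated assumptions: `scan_page` and `scan_all` are hypothetical names, and rows are modeled as a pre-sorted list of `(key, item)` tuples rather than a RocksDB iterator.

```python
import bisect


def scan_page(rows, limit, exclusive_start_key=None):
    """Return one page from rows, a list of (key, item) tuples sorted by key."""
    keys = [key for key, _ in rows]
    start = 0
    if exclusive_start_key is not None:
        # Resume strictly after the key the previous page ended on.
        start = bisect.bisect_right(keys, exclusive_start_key)
    page = rows[start:start + limit]
    response = {"Items": [item for _, item in page], "Count": len(page)}
    # Emit the key of the last *returned* item, and only if more rows remain.
    if page and start + len(page) < len(rows):
        response["LastEvaluatedKey"] = page[-1][0]
    return response


def scan_all(rows, limit):
    """Resume loop: keep paging until no LastEvaluatedKey is returned."""
    items, start_key = [], None
    while True:
        page = scan_page(rows, limit, start_key)
        items.extend(page["Items"])
        start_key = page.get("LastEvaluatedKey")
        if start_key is None:
            return items
```

A golden test would compare the concatenation of all pages against a single unpaginated scan, for several `Limit` values including ones that divide the row count exactly.
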
### 4) Expression parsing reliability
- [x] Remove brittle string-scanning for `KeyConditionExpression` extraction:
  - **DONE**: `parse_key_condition_expression_string` uses JSON object lookup (handles whitespace/ordering safely)
- [ ] Add validation + better errors for malformed expressions
- [x] Expand operator coverage: BETWEEN and begins_with are implemented in parser
- [x] **Sort key condition filtering in query**: **DONE**, `query()` now accepts optional `Sort_Key_Condition` and applies it (=, <, <=, >, >=, BETWEEN, begins_with)
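
The operator set listed for sort key conditions can be sketched as a single predicate. This Python sketch is an assumption about shape only: `matches_sort_key` and its parameters are hypothetical, and it compares values as given, so numeric (N) keys would need to be converted to numbers before calling it.

```python
def matches_sort_key(sort_key, op, operand, upper=None):
    """Evaluate one sort key condition; BETWEEN is inclusive on both ends."""
    if op == "=":
        return sort_key == operand
    if op == "<":
        return sort_key < operand
    if op == "<=":
        return sort_key <= operand
    if op == ">":
        return sort_key > operand
    if op == ">=":
        return sort_key >= operand
    if op == "BETWEEN":
        if upper is None:
            raise ValueError("BETWEEN needs two operands")
        return operand <= sort_key <= upper
    if op == "begins_with":
        return sort_key.startswith(operand)
    raise ValueError(f"unsupported operator: {op}")


# Filter a partition's rows by sort key prefix.
rows = ["order#2024-01", "order#2024-02", "profile"]
print([sk for sk in rows if matches_sort_key(sk, "begins_with", "order#")])
```

Rejecting unknown operators with an explicit error, rather than silently matching nothing, also lines up with the open item on better errors for malformed expressions.
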
### 5) Service Features
- [ ] Configuration settings (e.g. environment variables) for defining users and credentials
- [ ] Configuration settings for setting up master and replica nodes
### 6) Test coverage / tooling
- [ ] Add integration tests mirroring AWS CLI script flows:
  - create table → put → get → scan → query → delete
- [ ] Add fuzz-ish tests for:
  - JSON parsing robustness
  - expression parsing robustness
  - TLV decode failure cases (corrupt bytes)
### 7) Secondary indexes
- [x] Global Secondary Indexes (GSI)
- [ ] Local Secondary Indexes (LSI)
- [ ] Index backfill (existing data when GSI added to populated table)
- [x] Write-path maintenance (GSI)
### 8) Performance / ops
- [ ] Connection reuse / keep-alive tuning
- [ ] Bloom filters / RocksDB options tuning for common patterns
- [ ] Optional compression policy (LZ4/Zstd knobs)
- [ ] Parallel scan (segment scanning)
### 9) Replication / WAL
(There is a C++ shim stubbed out for WAL iteration and applying write batches.)
- [ ] Implement WAL iterator: `latest_sequence`, `wal_iter_next` returning writebatch blob
- [ ] Implement apply-writebatch on follower
- [ ] Add a minimal replication test harness (leader generates N ops → follower applies → compare)
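
The "leader generates N ops → follower applies → compare" harness could take roughly the following shape. This is an in-memory Python model, not the C++ shim: the `Node` class, the batch format, and the one-sequence-number-per-batch simplification are all assumptions; only the names `latest_sequence` and `wal_iter_next` come from the list above.

```python
class Node:
    """Toy store with a write-ahead log of (sequence, batch) entries."""

    def __init__(self):
        self.data = {}
        self.wal = []

    def latest_sequence(self):
        return self.wal[-1][0] if self.wal else 0

    def write(self, batch):
        """Leader path: log the batch under the next sequence, then apply it."""
        self.apply_writebatch(self.latest_sequence() + 1, batch)

    def wal_iter_next(self, after_sequence):
        """Return the first logged entry newer than after_sequence, or None."""
        for sequence, batch in self.wal:
            if sequence > after_sequence:
                return sequence, batch
        return None

    def apply_writebatch(self, sequence, batch):
        """Follower path: record the entry under the leader's sequence and apply it."""
        self.wal.append((sequence, batch))
        for op, key, value in batch:
            if op == "put":
                self.data[key] = value
            else:  # "delete"
                self.data.pop(key, None)


def replicate(leader, follower):
    """Pull entries from the leader until the follower has caught up."""
    while (entry := leader.wal_iter_next(follower.latest_sequence())) is not None:
        follower.apply_writebatch(*entry)


leader, follower = Node(), Node()
for i in range(100):
    leader.write([("put", f"k{i}", i)])
leader.write([("delete", "k0", None)])
replicate(leader, follower)
print(follower.data == leader.data, follower.latest_sequence())
```

The comparison at the end is the whole test: identical data and identical latest sequence on both nodes after replay, including after a second `replicate` call that should be a no-op.
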