# JormunDB (Odin rewrite) - TODO

This tracks what's left to stabilize + extend the project.

## Now (MVP correctness + polish)

Goal: "aws cli works reliably for CreateTable/ListTables/PutItem/GetItem/DeleteItem/Scan/Query" with correct DynamoDB-ish responses.

### 1) HTTP + routing hardening

- [ ] Audit request parsing boundaries:
  - Max body size enforcement (config exists, need to verify enforcement path)
  - Missing/invalid headers → correct DynamoDB error types
  - Content-Type handling (be permissive but consistent)
- [x] Ensure **all request-scoped allocations** come from the request arena (no accidental long-lived allocs)
  - Verified: `handle_connection` in http.odin sets `context.allocator = request_alloc`
  - Long-lived data (table metadata, locks) explicitly uses `engine.allocator`
- [x] Standardize error responses:
  - `__type` formatting: done, uses `com.amazonaws.dynamodb.v20120810#ErrorType`
  - `message` field consistency: done
  - Status code mapping per error type: **DONE**: centralized `handle_storage_error` + `make_error_response` now maps InternalServerError→500, everything else→400
  - Missing X-Amz-Target now returns `SerializationException` (matches real DynamoDB)

### 2) Storage correctness edge cases

- [x] Table metadata durability + validation:
  - [x] Reject duplicate tables: done in `create_table` (checks existing meta key)
  - [x] Reject invalid key schema: done in `parse_key_schema` (no HASH, multiple HASH, etc.)
- [x] Item validation against key schema:
  - [x] Missing PK/SK errors: done in `key_from_item`
  - [x] Type mismatch errors (S/N/B): **DONE**: new `validate_item_key_types` proc checks item key attr types against AttributeDefinitions
- [ ] Deterministic encoding tests:
  - [ ] Key codec round-trip
  - [ ] TLV item encode/decode round-trip (nested maps/lists/sets)

### 3) Query/Scan pagination parity

- [x] Make pagination behavior match AWS CLI expectations:
  - [x] `Limit`: done
  - [x] `ExclusiveStartKey`: done (parsed via JSON object lookup with key schema type reconstruction)
  - [x] `LastEvaluatedKey` generation: **FIXED**: now saves key of *last returned item* (not next unread item); only emits when more results exist
- [ ] Add "golden" pagination tests:
  - [ ] Query w/ sort key ranges
  - [ ] Scan limit + resume loop

### 4) Expression parsing reliability

- [x] Remove brittle string-scanning for `KeyConditionExpression` extraction:
  - **DONE**: `parse_key_condition_expression_string` uses JSON object lookup (handles whitespace/ordering safely)
- [ ] Add validation + better errors for malformed expressions
- [x] Expand operator coverage: BETWEEN and begins_with are implemented in parser
- [x] **Sort key condition filtering in query**: **DONE**: `query()` now accepts optional `Sort_Key_Condition` and applies it (=, <, <=, >, >=, BETWEEN, begins_with)

### 5) Service Features

- [ ] Configuration settings like environment variables for defining users and credentials
- [ ] Configuration settings for setting up master and replica nodes

### 6) Test coverage / tooling

- [ ] Add integration tests mirroring AWS CLI script flows:
  - create table → put → get → scan → query → delete
- [ ] Add fuzz-ish tests for:
  - JSON parsing robustness
  - expression parsing robustness
  - TLV decode failure cases (corrupt bytes)

### 7) Secondary indexes

- [x] Global Secondary Indexes (GSI)
- [ ] Local Secondary Indexes (LSI)
- [ ] Index backfill (existing data when GSI added to populated table)
- [x] Write-path maintenance (GSI)

### 8) Performance / ops

- [ ] Connection reuse / keep-alive tuning
- [ ] Bloom filters / RocksDB options tuning for common patterns
- [ ] Optional compression policy (LZ4/Zstd knobs)
- [ ] Parallel scan (segment scanning)

### 9) Replication / WAL

(There is a C++ shim stubbed out for WAL iteration and applying write batches.)

- [ ] Implement WAL iterator: `latest_sequence`, `wal_iter_next` returning writebatch blob
- [ ] Implement apply-writebatch on follower
- [ ] Add a minimal replication test harness (leader generates N ops → follower applies → compare)
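
Note on the pagination items in section 3: the "scan limit + resume loop" golden test could be modeled roughly as below. This is a minimal Python sketch of the contract described there (`LastEvaluatedKey` is the key of the last returned item and is emitted only when more results exist), not JormunDB code. `scan_page`, `scan_all`, and the single-attribute `pk` key are hypothetical names chosen for illustration.

```python
from typing import Optional

# Hypothetical in-memory model of the pagination contract; not JormunDB code.


def scan_page(items: list, limit: int,
              exclusive_start_key: Optional[dict] = None) -> dict:
    """Return one page from `items`, assumed to be in storage (key) order."""
    start = 0
    if exclusive_start_key is not None:
        # Resume strictly after the item whose key was handed back last time.
        # An unknown key falls back to the start (a simplification).
        for i, item in enumerate(items):
            if item["pk"] == exclusive_start_key["pk"]:
                start = i + 1
                break
    page = items[start:start + limit]
    response = {"Items": page, "Count": len(page)}
    # Emit LastEvaluatedKey only when more results exist, using the key of
    # the last item actually returned (not the next unread one).
    if page and start + len(page) < len(items):
        response["LastEvaluatedKey"] = {"pk": page[-1]["pk"]}
    return response


def scan_all(items: list, limit: int) -> tuple:
    """Resume loop: keep scanning until no LastEvaluatedKey comes back."""
    collected, pages, key = [], 0, None
    while True:
        resp = scan_page(items, limit, key)
        collected.extend(resp["Items"])
        pages += 1
        key = resp.get("LastEvaluatedKey")
        if key is None:
            return collected, pages
```

A golden test built on this shape would assert that the resume loop returns every item exactly once and that the final page carries no `LastEvaluatedKey`.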
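
Note on the error-response items in section 1: the status mapping and `__type` format listed there can be pinned down with a small check like the one below. This is a Python sketch under stated assumptions. `sketch_error_response` is a hypothetical stand-in, and the real `make_error_response` signature in the Odin code may differ.

```python
import json

# Prefix taken from the `__type` format noted in section 1.
ERROR_PREFIX = "com.amazonaws.dynamodb.v20120810#"


def sketch_error_response(error_type: str, message: str) -> tuple:
    """Hypothetical model: InternalServerError maps to 500, everything else to 400."""
    status = 500 if error_type == "InternalServerError" else 400
    body = json.dumps({
        "__type": ERROR_PREFIX + error_type,
        "message": message,
    })
    return status, body
```

For example, a request without `X-Amz-Target` would be expected to produce status 400 with a `__type` ending in `#SerializationException`.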