JormunDB (Odin rewrite) — TODO
This tracks the rewrite from Zig (ZynamoDB) → Odin (JormunDB), and what's left to stabilize + extend.
Status Snapshot
✅ Ported / Working (core)
- Project layout + Makefile targets (build/run/test/fmt)
- RocksDB bindings / integration
- Core DynamoDB types (AttributeValue / Item / Key / TableDescription, etc.)
- Binary key codec (varint length-prefixed segments)
- Binary item codec (TLV encoding / decoding)
- Storage engine: tables + CRUD + scan/query plumbing
- Table-level RW locks (read ops shared / write ops exclusive)
- HTTP server + request routing via X-Amz-Target
- DynamoDB JSON (parse + serialize)
- Expression parsing for Query key conditions (basic support)
Now (MVP correctness + polish)
Goal: "aws cli works reliably for CreateTable/ListTables/PutItem/GetItem/DeleteItem/Scan/Query" with correct DynamoDB-ish responses.
1) HTTP + routing hardening
- Audit request parsing boundaries:
- Max body size enforcement (config exists, need to verify enforcement path)
- Missing/invalid headers → correct DynamoDB error types
- Content-Type handling (be permissive but consistent)
- Ensure all request-scoped allocations come from the request arena (no accidental long-lived allocs)
- Verified: handle_connection in http.odin sets context.allocator = request_alloc
- Long-lived data (table metadata, locks) explicitly uses engine.allocator
- Standardize error responses:
- __type formatting: done, uses com.amazonaws.dynamodb.v20120810#ErrorType
- message field consistency: done
- Status code mapping per error type: done. Centralized handle_storage_error + make_error_response now maps InternalServerError→500, everything else→400
- Missing X-Amz-Target now returns SerializationException (matches real DynamoDB)
2) Storage correctness edge cases
- Table metadata durability + validation:
- Reject duplicate tables: done in create_table (checks existing meta key)
- Reject invalid key schema: done in parse_key_schema (no HASH, multiple HASH, etc.)
- Item validation against key schema:
- Missing PK/SK errors: done in key_from_item
- Type mismatch errors (S/N/B): done. New validate_item_key_types proc checks item key attr types against AttributeDefinitions
- Deterministic encoding tests:
- Key codec round-trip
- TLV item encode/decode round-trip (nested maps/lists/sets)
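A round-trip test for the key codec could look roughly like the sketch below. It is written in Python for brevity, not Odin, and it assumes unsigned LEB128 varints for the length prefix; the actual varint flavor and all function names here (encode_varint, decode_varint, encode_key, decode_key) are hypothetical stand-ins for whatever the codec really uses.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as an unsigned LEB128 varint (assumed format)."""
    if n < 0:
        raise ValueError("varint must be non-negative")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode a varint starting at pos; return (value, next_pos)."""
    result, shift = 0, 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_key(segments: list[bytes]) -> bytes:
    """Concatenate segments, each prefixed with its varint-encoded length."""
    out = bytearray()
    for seg in segments:
        out += encode_varint(len(seg))
        out += seg
    return bytes(out)


def decode_key(buf: bytes) -> list[bytes]:
    """Split a buffer produced by encode_key back into its segments."""
    segments, pos = [], 0
    while pos < len(buf):
        length, pos = decode_varint(buf, pos)
        end = pos + length
        if end > len(buf):
            raise ValueError("truncated segment")
        segments.append(buf[pos:end])
        pos = end
    return segments
```

A determinism check then reduces to asserting decode_key(encode_key(x)) == x and that encoding the same segments twice yields identical bytes, including for empty segments and segments longer than 127 bytes (where the length prefix spans two bytes).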
3) Query/Scan pagination parity
- Make pagination behavior match AWS CLI expectations:
- Limit: done
- ExclusiveStartKey: done (parsed via JSON object lookup with key schema type reconstruction)
- LastEvaluatedKey generation: fixed. Now saves the key of the last returned item (not the next unread item); only emitted when more results exist
- Add "golden" pagination tests:
- Query w/ sort key ranges
- Scan limit + resume loop
4) Expression parsing reliability
- Remove brittle string-scanning for KeyConditionExpression extraction:
- Done: parse_key_condition_expression_string uses JSON object lookup (handles whitespace/ordering safely)
- Add validation + better errors for malformed expressions
- Expand operator coverage: BETWEEN and begins_with are implemented in parser
- Sort key condition filtering in query: done. query() now accepts an optional Sort_Key_Condition and applies it (=, <, <=, >, >=, BETWEEN, begins_with)
Next (feature parity with Zig + API completeness)
5) UpdateItem / conditional logic groundwork
- UpdateItem handler registered in router (currently returns clear "not yet supported" error)
- Implement UpdateItem (initially minimal: SET for scalar attrs)
- Add ConditionExpression support for Put/Delete/Update (start with simple comparisons)
- Define internal "update plan" representation (parsed ops → applied mutations)
6) Response completeness / options
- ReturnValues handling where relevant (NONE/ALL_OLD/UPDATED_NEW etc.; even partial support is useful)
- ProjectionExpression (return subset of attributes)
- FilterExpression (post-query filter for Scan/Query)
7) Test coverage / tooling
- Add integration tests mirroring AWS CLI script flows:
- create table → put → get → scan → query → delete
- Add fuzz-ish tests for:
- JSON parsing robustness
- expression parsing robustness
- TLV decode failure cases (corrupt bytes)
Bug Fixes Applied This Session
Pagination (scan + query)
Bug: last_evaluated_key was set to the key of the next unread item (the item at count == limit). When the client resumed with that key as ExclusiveStartKey, it would seek-then-skip, dropping one item from the result set.
Fix: Now tracks the key of the last successfully returned item. Only emits LastEvaluatedKey when we confirm there are more items beyond the returned set (via has_more flag).
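The corrected behavior can be modeled in a few lines. This is a Python sketch over a sorted list standing in for the RocksDB key space, not the Odin code; scan_page and scan_all are hypothetical names.

```python
from bisect import bisect_right


def scan_page(sorted_keys, limit, exclusive_start_key=None):
    """Return (page, last_evaluated_key) for one page of results."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    # Resume strictly after the key the client handed back.
    if exclusive_start_key is None:
        start = 0
    else:
        start = bisect_right(sorted_keys, exclusive_start_key)
    page = sorted_keys[start:start + limit]
    # has_more: is there anything beyond the items we are returning?
    has_more = start + len(page) < len(sorted_keys)
    # The cursor is the last item actually returned, never the next unread one.
    last_evaluated_key = page[-1] if page and has_more else None
    return page, last_evaluated_key


def scan_all(sorted_keys, limit):
    """Client-side resume loop: keep paging until no cursor is returned."""
    seen, cursor = [], None
    while True:
        page, cursor = scan_page(sorted_keys, limit, cursor)
        seen.extend(page)
        if cursor is None:
            return seen
```

With the old behavior (cursor set to the next unread key), the resume loop would skip that key on every page boundary; here scan_all returns every key exactly once.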
Sort key condition filtering
Bug: query() performed a partition-prefix scan but never applied the sort key condition (=, <, BETWEEN, begins_with, etc.) from KeyConditionExpression. All items in the partition were returned regardless of sort key predicates.
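Fix: query() now accepts an optional Sort_Key_Condition parameter. The handler extracts it from the parsed Key_Condition and passes it through. evaluate_sort_key_condition() compares the item's SK attribute against the condition using string comparison. This matches DynamoDB's byte-wise ordering for S keys (and for B keys when the raw bytes are compared); DynamoDB orders N keys numerically, so string comparison can misorder numeric sort keys (for example, "10" sorts before "9").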
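As a rough illustration of the operator dispatch, here is a Python sketch for string sort keys. Only the name evaluate_sort_key_condition comes from the fix above; the signature and argument layout are assumptions made for this example.

```python
def evaluate_sort_key_condition(sk, op, value, value2=None):
    """Return True if the sort key satisfies the condition.

    sk, value and value2 are plain strings compared lexicographically.
    value2 is only used by BETWEEN (inclusive on both ends).
    """
    if op == "=":
        return sk == value
    if op == "<":
        return sk < value
    if op == "<=":
        return sk <= value
    if op == ">":
        return sk > value
    if op == ">=":
        return sk >= value
    if op == "BETWEEN":
        if value2 is None:
            raise ValueError("BETWEEN needs two operands")
        return value <= sk <= value2
    if op == "begins_with":
        return sk.startswith(value)
    raise ValueError(f"unsupported operator: {op}")


def filter_partition(sort_keys, op, value, value2=None):
    """Apply the condition to every sort key found by the partition-prefix scan."""
    return [sk for sk in sort_keys
            if evaluate_sort_key_condition(sk, op, value, value2)]
```

For example, filter_partition(["2024-01", "2024-02", "2025-01"], "begins_with", "2024") keeps only the two 2024 keys, where the pre-fix behavior would have returned all three.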
Write locking
Bug: put_item and delete_item acquired shared (read) locks. Multiple concurrent writes to the same table could interleave without mutual exclusion.
Fix: Both now acquire exclusive (write) locks via sync.rw_mutex_lock. Read operations (get_item, scan, query) continue to use shared locks.
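The shared/exclusive distinction can be sketched with a small reader-writer lock. This is a Python model for illustration only (Python's standard library has no RW lock, so one is built here on threading.Condition); it is not how Odin's sync package is implemented, and RWLock and run_writers are hypothetical names.

```python
import threading


class RWLock:
    """Minimal reader-writer lock: many readers or exactly one writer."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def acquire_shared(self):
        with self._cond:
            while self._writer:
                self._cond.wait()
            self._readers += 1

    def release_shared(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_exclusive(self):
        with self._cond:
            # A writer waits for all readers and any other writer to leave.
            while self._writer or self._readers > 0:
                self._cond.wait()
            self._writer = True

    def release_exclusive(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()


def run_writers(n_threads=4, n_increments=1000):
    """Simulate concurrent put_item calls doing a read-modify-write."""
    lock = RWLock()
    state = {"count": 0}

    def writer():
        for _ in range(n_increments):
            lock.acquire_exclusive()
            try:
                state["count"] += 1
            finally:
                lock.release_exclusive()

    threads = [threading.Thread(target=writer) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["count"]
```

If the writers took the shared lock instead, several of them could be inside the critical section at once, which is the interleaving the bug allowed. This simple version does not guard against writer starvation under heavy read load.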
delete_table item cleanup
Bug: delete_table only deleted the metadata key, leaving all data items orphaned in RocksDB.
Fix: Before deleting metadata, delete_table now iterates over all keys with the table's data prefix and deletes them individually.
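A minimal model of the cleanup order, using a Python dict in place of RocksDB. The key layout shown (a "meta:" key plus a "data:<table>:" prefix) is invented for the example and is not JormunDB's actual encoding.

```python
def delete_table(store, meta_key, data_prefix):
    """Delete all data keys under data_prefix, then the metadata key.

    Returns the number of data items removed.
    """
    # Collect first so the dict is not mutated while being iterated.
    doomed = [key for key in store if key.startswith(data_prefix)]
    for key in doomed:
        del store[key]
    # Metadata goes last: if cleanup is interrupted, the table is still
    # visible and the delete can be retried.
    store.pop(meta_key, None)
    return len(doomed)
```

The prefix must end at a segment boundary (here the trailing ":") so that deleting table "users" cannot match keys belonging to "users2".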
Item key type validation
New: put_item now validates that the item's key attribute types match the table's AttributeDefinitions. E.g., if PK is declared as S, putting an item with a numeric PK is rejected with Invalid_Key.
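The check can be sketched as follows, in Python, over DynamoDB JSON shapes (KeySchema, AttributeDefinitions, and typed attribute values such as {"S": "abc"}). The function name mirrors the proc mentioned earlier in this file, but the signature and the error strings are assumptions; the real proc reports Invalid_Key.

```python
def validate_item_key_types(item, key_schema, attribute_definitions):
    """Return None if the item's key attributes match the declared types,
    otherwise a short description of the first problem found."""
    declared = {d["AttributeName"]: d["AttributeType"]
                for d in attribute_definitions}
    for element in key_schema:
        name = element["AttributeName"]
        value = item.get(name)
        if value is None:
            return f"missing key attribute: {name}"
        expected = declared[name]
        # A DynamoDB JSON value is a single-entry object like {"S": "abc"}.
        if len(value) != 1 or expected not in value:
            actual = ",".join(value) or "none"
            return f"key attribute {name}: expected {expected}, got {actual}"
    return None
```

Given a table whose PK is declared as S, an item carrying {"pk": {"N": "42"}} is rejected, while {"pk": {"S": "42"}} passes.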
Error response standardization
Fix: Centralized all storage-error-to-HTTP-error mapping in handle_storage_error. InternalServerError maps to HTTP 500; all client errors (validation, not-found, etc.) map to HTTP 400. Missing X-Amz-Target now returns SerializationException to match real DynamoDB behavior.
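The mapping rule is small enough to show in full. This Python sketch reuses the name make_error_response from this file, but the signature and return shape are assumptions for illustration.

```python
import json

# Prefix used in the __type field of DynamoDB error bodies.
ERROR_PREFIX = "com.amazonaws.dynamodb.v20120810#"


def make_error_response(error_type, message):
    """Return (http_status, json_body) for a DynamoDB-style error."""
    # Only InternalServerError is a server fault; everything else is
    # treated as a client error.
    status = 500 if error_type == "InternalServerError" else 400
    body = json.dumps({"__type": ERROR_PREFIX + error_type,
                       "message": message})
    return status, body
```

For a request with no X-Amz-Target header, make_error_response("SerializationException", "missing X-Amz-Target") yields status 400 and a body whose __type ends in #SerializationException.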
Later (big features)
These align with the "Future Enhancements" list in ARCHITECTURE.md.
8) Secondary indexes
- Global Secondary Indexes (GSI)
- Local Secondary Indexes (LSI)
- Index backfill + write-path maintenance
9) Batch + transactions
- BatchWriteItem
- BatchGetItem
- Transactions (TransactWriteItems / TransactGetItems)
10) Performance / ops
- Connection reuse / keep-alive tuning
- Bloom filters / RocksDB options tuning for common patterns
- Optional compression policy (LZ4/Zstd knobs)
- Parallel scan (segment scanning)
Replication / WAL
(There is a C++ shim stubbed out for WAL iteration and applying write batches.)
- Implement WAL iterator: latest_sequence, wal_iter_next returning writebatch blob
- Implement apply-writebatch on follower
- Add a minimal replication test harness (leader generates N ops → follower applies → compare)
Housekeeping
- Fix TODO hygiene: keep this file short and "actionable"
- Added "Bug Fixes Applied" section documenting what changed and why
- Add a CONTRIBUTING quick checklist (allocator rules, formatting, tests)
- Add "known limitations" section in README (unsupported DynamoDB features)