jormun-db/TODO.md
2026-02-15 20:57:16 -05:00

JormunDB (Odin rewrite) — TODO

This tracks the rewrite from Zig (ZynamoDB) → Odin (JormunDB), and what's left to stabilize + extend.

Status Snapshot

Ported / Working (core)

  • Project layout + Makefile targets (build/run/test/fmt)
  • RocksDB bindings / integration
  • Core DynamoDB types (AttributeValue / Item / Key / TableDescription, etc.)
  • Binary key codec (varint length-prefixed segments)
  • Binary item codec (TLV encoding / decoding)
  • Storage engine: tables + CRUD + scan/query plumbing
  • Table-level RW locks (read ops shared / write ops exclusive)
  • HTTP server + request routing via X-Amz-Target
  • DynamoDB JSON (parse + serialize)
  • Expression parsing for Query key conditions (basic support)

Now (MVP correctness + polish)

Goal: "AWS CLI works reliably for CreateTable/ListTables/PutItem/GetItem/DeleteItem/Scan/Query" with correct DynamoDB-ish responses.

1) HTTP + routing hardening

  • Audit request parsing boundaries:
    • Max body size enforcement (config exists, need to verify enforcement path)
    • Missing/invalid headers → correct DynamoDB error types
    • Content-Type handling (be permissive but consistent)
  • Ensure all request-scoped allocations come from the request arena (no accidental long-lived allocs)
    • Verified: handle_connection in http.odin sets context.allocator = request_alloc
    • Long-lived data (table metadata, locks) explicitly uses engine.allocator
  • Standardize error responses:
    • __type formatting — done, uses com.amazonaws.dynamodb.v20120810#ErrorType
    • message field consistency — done
    • Status code mapping per error type — DONE: centralized handle_storage_error + make_error_response now maps InternalServerError→500, everything else→400
    • Missing X-Amz-Target now returns SerializationException (matches real DynamoDB)
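
The header checks above can be sketched roughly as follows. This is a Python illustration, not the Odin router: `resolve_operation` and `RoutingError` are hypothetical names, and the error type used for an unrecognized target is an assumption rather than something this TODO specifies.

```python
TARGET_PREFIX = "DynamoDB_20120810."


class RoutingError(Exception):
    """Carries the DynamoDB error type to put in the __type field."""

    def __init__(self, error_type, message):
        super().__init__(message)
        self.error_type = error_type


def resolve_operation(headers, supported):
    """Return the operation name from X-Amz-Target, or raise RoutingError."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    target = next(
        (v for k, v in headers.items() if k.lower() == "x-amz-target"), None
    )
    if target is None:
        raise RoutingError("SerializationException", "missing X-Amz-Target header")
    if not target.startswith(TARGET_PREFIX):
        # Assumption: unknown targets are reported as UnknownOperationException.
        raise RoutingError("UnknownOperationException", "unknown target: " + target)
    operation = target[len(TARGET_PREFIX):]
    if operation not in supported:
        raise RoutingError("UnknownOperationException", "unsupported: " + operation)
    return operation
```

Raising a typed error at the routing boundary keeps the choice of HTTP status in one place, next to the rest of the error mapping.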

2) Storage correctness edge cases

  • Table metadata durability + validation:
    • Reject duplicate tables — done in create_table (checks existing meta key)
    • Reject invalid key schema — done in parse_key_schema (no HASH, multiple HASH, etc.)
  • Item validation against key schema:
    • Missing PK/SK errors — done in key_from_item
    • Type mismatch errors (S/N/B) — DONE: new validate_item_key_types proc checks item key attr types against AttributeDefinitions
  • Deterministic encoding tests:
    • Key codec round-trip
    • TLV item encode/decode round-trip (nested maps/lists/sets)
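
A key codec round-trip test might look like the sketch below. It is written in Python for brevity and assumes an unsigned LEB128 varint as the length prefix; the varint flavor and segment layout of the actual Odin codec may differ, and `encode_key` / `decode_key` are hypothetical names.

```python
def encode_varint(n: int) -> bytes:
    """Unsigned LEB128: 7 bits per byte, high bit set on all but the last."""
    if n < 0:
        raise ValueError("varint must be non-negative")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf: bytes, pos: int):
    """Return (value, next_pos); raise ValueError on truncated input."""
    value, shift = 0, 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def encode_key(segments) -> bytes:
    """Concatenate segments, each preceded by its varint-encoded length."""
    out = bytearray()
    for seg in segments:
        out += encode_varint(len(seg))
        out += seg
    return bytes(out)


def decode_key(buf: bytes):
    """Inverse of encode_key; raise ValueError if a segment is cut short."""
    segments, pos = [], 0
    while pos < len(buf):
        length, pos = decode_varint(buf, pos)
        if pos + length > len(buf):
            raise ValueError("truncated segment")
        segments.append(buf[pos:pos + length])
        pos += length
    return segments
```

A round-trip test would then assert `decode_key(encode_key(segs)) == segs` for empty segments, segments longer than 127 bytes (multi-byte length prefix), and truncated buffers.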

3) Query/Scan pagination parity

  • Make pagination behavior match AWS CLI expectations:
    • Limit — done
    • ExclusiveStartKey — done (parsed via JSON object lookup with key schema type reconstruction)
    • LastEvaluatedKey generation — FIXED: now saves key of last returned item (not next unread item); only emits when more results exist
  • Add "golden" pagination tests:
    • Query w/ sort key ranges
    • Scan limit + resume loop

4) Expression parsing reliability

  • Remove brittle string-scanning for KeyConditionExpression extraction:
    • DONE: parse_key_condition_expression_string uses JSON object lookup (handles whitespace/ordering safely)
  • Add validation + better errors for malformed expressions
  • Expand operator coverage: BETWEEN and begins_with are implemented in parser
  • Sort key condition filtering in query: DONE. query() now accepts optional Sort_Key_Condition and applies it (=, <, <=, >, >=, BETWEEN, begins_with)

Next (feature parity with Zig + API completeness)

5) UpdateItem / conditional logic groundwork

  • UpdateItem handler registered in router (currently returns clear "not yet supported" error)
  • Implement UpdateItem (initially minimal: SET for scalar attrs)
  • Add ConditionExpression support for Put/Delete/Update (start with simple comparisons)
  • Define internal "update plan" representation (parsed ops → applied mutations)

6) Response completeness / options

  • ReturnValues handling where relevant (NONE/ALL_OLD/UPDATED_NEW etc. — even partial support is useful)
  • ProjectionExpression (return subset of attributes)
  • FilterExpression (post-query filter for Scan/Query)

7) Test coverage / tooling

  • Add integration tests mirroring AWS CLI script flows:
    • create table → put → get → scan → query → delete
  • Add fuzz-ish tests for:
    • JSON parsing robustness
    • expression parsing robustness
    • TLV decode failure cases (corrupt bytes)

Bug Fixes Applied This Session

Pagination (scan + query)

Bug: last_evaluated_key was set to the key of the next unread item (the item at count == limit). When the client resumed with that key as ExclusiveStartKey, it would seek-then-skip, dropping one item from the result set.

Fix: Now tracks the key of the last successfully returned item. Only emits LastEvaluatedKey when we confirm there are more items beyond the returned set (via has_more flag).
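
The corrected behavior can be sketched as follows, using a sorted Python list as a stand-in for the RocksDB iterator. `scan_page` and `scan_all` are hypothetical names; only the last-returned-key and has_more logic is taken from the fix described above.

```python
from bisect import bisect_right


def scan_page(sorted_keys, limit, exclusive_start_key=None):
    """Return (page, last_evaluated_key) over a sorted list of keys."""
    start = 0
    if exclusive_start_key is not None:
        # Resume strictly after the start key, so it is never returned twice.
        start = bisect_right(sorted_keys, exclusive_start_key)
    page = sorted_keys[start:start + limit]
    # Emit a resume key only if items remain beyond this page.
    has_more = start + len(page) < len(sorted_keys)
    last_evaluated_key = page[-1] if page and has_more else None
    return page, last_evaluated_key


def scan_all(sorted_keys, limit):
    """Client-side resume loop: feed each LastEvaluatedKey back in."""
    items, start_key = [], None
    while True:
        page, start_key = scan_page(sorted_keys, limit, start_key)
        items.extend(page)
        if start_key is None:
            return items
```

Because the resume key is the last item actually returned, resuming after it yields every item exactly once. The same loop is a natural shape for the "scan limit + resume" golden test.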

Sort key condition filtering

Bug: query() performed a partition-prefix scan but never applied the sort key condition (=, <, BETWEEN, begins_with, etc.) from KeyConditionExpression. All items in the partition were returned regardless of sort key predicates.

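Fix: query() now accepts an optional Sort_Key_Condition parameter. The handler extracts it from the parsed Key_Condition and passes it through. evaluate_sort_key_condition() compares the item's SK attribute against the condition using string comparison. This matches DynamoDB's ordering for S keys. DynamoDB orders N keys numerically, so string comparison can misorder them (for example, "10" sorts before "9"), and B keys compare correctly only if the raw bytes, not their base64 text, are compared. N and B sort keys still need a follow-up.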
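
A minimal sketch of the evaluation step, in Python rather than Odin. The `SortKeyCondition` shape here is an assumption for illustration and does not mirror the actual Sort_Key_Condition struct; the comparison is plain string comparison, as described above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SortKeyCondition:
    op: str                      # "=", "<", "<=", ">", ">=", "BETWEEN", "begins_with"
    value: str
    upper: Optional[str] = None  # upper bound, used only by BETWEEN


def evaluate_sort_key_condition(sk: str, cond: Optional[SortKeyCondition]) -> bool:
    """Return True if the sort key satisfies the condition (or there is none)."""
    if cond is None:
        return True  # no sort key predicate: every item in the partition matches
    if cond.op == "=":
        return sk == cond.value
    if cond.op == "<":
        return sk < cond.value
    if cond.op == "<=":
        return sk <= cond.value
    if cond.op == ">":
        return sk > cond.value
    if cond.op == ">=":
        return sk >= cond.value
    if cond.op == "BETWEEN":
        if cond.upper is None:
            raise ValueError("BETWEEN requires an upper bound")
        return cond.value <= sk <= cond.upper  # inclusive on both ends
    if cond.op == "begins_with":
        return sk.startswith(cond.value)
    raise ValueError("unsupported operator: " + cond.op)
```

In a query loop this would be applied to each item found by the partition-prefix scan, keeping only the items for which it returns True.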

Write locking

Bug: put_item and delete_item acquired shared (read) locks. Multiple concurrent writes to the same table could interleave without mutual exclusion.

Fix: Both now acquire exclusive (write) locks via sync.rw_mutex_lock. Read operations (get_item, scan, query) continue to use shared locks.
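
To illustrate the shared/exclusive distinction, here is a small reader-writer lock sketch in Python. It is not how Odin's sync package is implemented; `RWLock`, `shared`, and `exclusive` are hypothetical names, and the sketch makes no fairness guarantees (writers can starve under constant read load).

```python
import threading
from contextlib import contextmanager


class RWLock:
    """Many concurrent readers, or exactly one writer."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    @contextmanager
    def shared(self):
        with self._cond:
            while self._writer:          # readers only wait for a writer
                self._cond.wait()
            self._readers += 1
        try:
            yield
        finally:
            with self._cond:
                self._readers -= 1
                if self._readers == 0:
                    self._cond.notify_all()

    @contextmanager
    def exclusive(self):
        with self._cond:
            while self._writer or self._readers:  # wait for everyone
                self._cond.wait()
            self._writer = True
        try:
            yield
        finally:
            with self._cond:
                self._writer = False
                self._cond.notify_all()
```

With this shape, put_item and delete_item would run inside `exclusive()`, while get_item, scan, and query would run inside `shared()`. Taking `shared()` on a write path reproduces the bug: two writers can hold the lock at the same time.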

delete_table item cleanup

Bug: delete_table only deleted the metadata key, leaving all data items orphaned in RocksDB.

Fix: Before deleting metadata, delete_table now iterates over all keys with the table's data prefix and deletes them individually.
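
The cleanup order can be sketched with an in-memory ordered store. `MemoryStore` is a stand-in, not the RocksDB API, and the `data/` and `meta/` key layout is hypothetical; the real prefixes come from the binary key codec.

```python
from bisect import bisect_left


class MemoryStore:
    """Toy ordered key-value store used only for this illustration."""

    def __init__(self):
        self._data = {}

    def put(self, key: bytes, value: bytes):
        self._data[key] = value

    def delete(self, key: bytes):
        self._data.pop(key, None)

    def has(self, key: bytes) -> bool:
        return key in self._data

    def keys_with_prefix(self, prefix: bytes):
        """Seek to the prefix, then walk forward until keys stop matching."""
        keys = sorted(self._data)
        out = []
        for key in keys[bisect_left(keys, prefix):]:
            if not key.startswith(prefix):
                break
            out.append(key)
        return out


def delete_table(store: MemoryStore, table: str):
    # Hypothetical layout: the trailing separator keeps "users" from
    # matching keys that belong to a table named "users2".
    data_prefix = b"data/" + table.encode() + b"/"
    for key in store.keys_with_prefix(data_prefix):
        store.delete(key)
    # Metadata goes last, so an interrupted delete leaves a visible table
    # rather than orphaned items.
    store.delete(b"meta/" + table.encode())
```

Deleting data before metadata means a crash partway through leaves a table that still exists and can be deleted again, instead of invisible orphaned items.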

Item key type validation

New: put_item now validates that the item's key attribute types match the table's AttributeDefinitions. E.g., if PK is declared as S, putting an item with a numeric PK is rejected with Invalid_Key.
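
A sketch of the check in Python, operating on DynamoDB JSON as plain dicts. The function name follows validate_item_key_types from the list above, but the signature and the `InvalidKey` exception are assumptions for illustration.

```python
class InvalidKey(Exception):
    """Stand-in for the storage layer's Invalid_Key error."""


def validate_item_key_types(item, key_schema, attribute_definitions):
    """Raise InvalidKey if a key attribute is missing or has the wrong type."""
    declared = {
        d["AttributeName"]: d["AttributeType"] for d in attribute_definitions
    }
    for element in key_schema:
        name = element["AttributeName"]
        expected = declared.get(name)
        if expected is None:
            raise InvalidKey("no AttributeDefinition for key attribute " + name)
        value = item.get(name)
        if value is None:
            raise InvalidKey("missing key attribute " + name)
        # A DynamoDB JSON value is a single-entry object such as {"S": "abc"}.
        if len(value) != 1:
            raise InvalidKey("malformed value for key attribute " + name)
        (actual,) = value
        if actual != expected:
            raise InvalidKey(
                "key attribute %s has type %s, expected %s" % (name, actual, expected)
            )
```

For a table declaring `pk` as `S`, the item `{"pk": {"N": "1"}}` is rejected while `{"pk": {"S": "1"}}` passes.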

Error response standardization

Fix: Centralized all storage-error-to-HTTP-error mapping in handle_storage_error. InternalServerError maps to HTTP 500; all client errors (validation, not-found, etc.) map to HTTP 400. Missing X-Amz-Target now returns SerializationException to match real DynamoDB behavior.
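
The mapping can be sketched as below, in Python for brevity. The function names follow handle_storage_error and make_error_response, but the signatures are assumptions, and every storage error name other than Invalid_Key is hypothetical. The `__type` prefix and the 500/400 split are the ones stated in this file.

```python
import json

ERROR_NAMESPACE = "com.amazonaws.dynamodb.v20120810#"

# Storage-layer error names mapped to DynamoDB error types. Only Invalid_Key
# appears in this TODO; the other names are hypothetical.
STORAGE_ERROR_TYPES = {
    "Invalid_Key": "ValidationException",
    "Table_Not_Found": "ResourceNotFoundException",
    "Table_Already_Exists": "ResourceInUseException",
}


def make_error_response(error_type, message):
    """Return (http_status, json_body) for a DynamoDB-style error."""
    status = 500 if error_type == "InternalServerError" else 400
    body = json.dumps({"__type": ERROR_NAMESPACE + error_type, "message": message})
    return status, body


def handle_storage_error(storage_error, message):
    """Single place where storage errors become HTTP error responses."""
    # Anything unrecognized is treated as a server-side fault.
    error_type = STORAGE_ERROR_TYPES.get(storage_error, "InternalServerError")
    return make_error_response(error_type, message)
```

Keeping the table in one place means a new storage error only needs one new entry, and handlers never pick status codes themselves.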


Later (big features)

These align with the "Future Enhancements" list in ARCHITECTURE.md.

8) Secondary indexes

  • Global Secondary Indexes (GSI)
  • Local Secondary Indexes (LSI)
  • Index backfill + write-path maintenance

9) Batch + transactions

  • BatchWriteItem
  • BatchGetItem
  • Transactions (TransactWriteItems / TransactGetItems)

10) Performance / ops

  • Connection reuse / keep-alive tuning
  • Bloom filters / RocksDB options tuning for common patterns
  • Optional compression policy (LZ4/Zstd knobs)
  • Parallel scan (segment scanning)

Replication / WAL

(There is a C++ shim stubbed out for WAL iteration and applying write batches.)

  • Implement WAL iterator: latest_sequence, wal_iter_next returning writebatch blob
  • Implement apply-writebatch on follower
  • Add a minimal replication test harness (leader generates N ops → follower applies → compare)

Housekeeping

  • Fix TODO hygiene: keep this file short and "actionable"
    • Added "Bug Fixes Applied" section documenting what changed and why
  • Add a CONTRIBUTING quick checklist (allocator rules, formatting, tests)
  • Add "known limitations" section in README (unsupported DynamoDB features)