jormun-db/TODO.md
2026-02-16 00:18:20 -05:00

JormunDB (Odin rewrite) — TODO

This tracks what's left to stabilize and extend the project.

Now (MVP correctness + polish)

Goal: "aws cli works reliably for CreateTable/ListTables/PutItem/GetItem/DeleteItem/Scan/Query" with correct DynamoDB-ish responses.

1) HTTP + routing hardening

  • Audit request parsing boundaries:
    • Max body size enforcement (config exists, need to verify enforcement path)
    • Missing/invalid headers → correct DynamoDB error types
    • Content-Type handling (be permissive but consistent)
  • Ensure all request-scoped allocations come from the request arena (no accidental long-lived allocs)
    • Verified: handle_connection in http.odin sets context.allocator = request_alloc
    • Long-lived data (table metadata, locks) explicitly uses engine.allocator
  • Standardize error responses:
    • __type formatting — done, uses com.amazonaws.dynamodb.v20120810#ErrorType
    • message field consistency — done
    • Status code mapping per error type — DONE: centralized handle_storage_error + make_error_response now maps InternalServerError→500, everything else→400
    • Missing X-Amz-Target now returns SerializationException (matches real DynamoDB)
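
The status-code mapping and `__type` formatting described above can be sketched roughly as follows. This is an illustrative Python sketch, not the Odin code: the function name mirrors `make_error_response` from the list, but its signature, the returned tuple shape, and the `Content-Type` value are assumptions.

```python
import json

# Prefix as quoted in the TODO entry above.
ERROR_TYPE_PREFIX = "com.amazonaws.dynamodb.v20120810#"


def make_error_response(error_type, message):
    """Return (status, headers, body) for a DynamoDB-style error.

    Hypothetical signature; only the mapping rule comes from the TODO:
    InternalServerError -> 500, everything else -> 400.
    """
    status = 500 if error_type == "InternalServerError" else 400
    body = json.dumps(
        {"__type": ERROR_TYPE_PREFIX + error_type, "message": message}
    ).encode("utf-8")
    headers = {
        # Assumed content type for DynamoDB JSON protocol responses.
        "Content-Type": "application/x-amz-json-1.0",
        "Content-Length": str(len(body)),
    }
    return status, headers, body


# Example: a request without X-Amz-Target.
status, headers, body = make_error_response(
    "SerializationException", "Missing X-Amz-Target header"
)
```

Keeping the mapping in one place means a new error type defaults to 400 unless it is explicitly listed as a server-side failure.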

2) Storage correctness edge cases

  • Table metadata durability + validation:
    • Reject duplicate tables — done in create_table (checks existing meta key)
    • Reject invalid key schema — done in parse_key_schema (no HASH, multiple HASH, etc.)
  • Item validation against key schema:
    • Missing PK/SK errors — done in key_from_item
    • Type mismatch errors (S/N/B) — DONE: new validate_item_key_types proc checks item key attr types against AttributeDefinitions
  • Deterministic encoding tests:
    • Key codec round-trip
    • TLV item encode/decode round-trip (nested maps/lists/sets)
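
A minimal sketch of the key type check described above, in Python for illustration. The name `validate_item_key_types` comes from the list; the argument shapes (DynamoDB JSON dictionaries) and the error strings are assumptions, and the real proc may report errors differently.

```python
def validate_item_key_types(item, key_schema, attribute_definitions):
    """Return None when the item's key attributes match the declared types,
    otherwise a human-readable error string (hypothetical error format)."""
    declared = {
        d["AttributeName"]: d["AttributeType"] for d in attribute_definitions
    }
    for element in key_schema:
        name = element["AttributeName"]
        value = item.get(name)
        if value is None:
            return f"Missing key attribute: {name}"
        # A DynamoDB attribute value is a single-entry map such as {"S": "x"}.
        if not isinstance(value, dict) or len(value) != 1:
            return f"Malformed attribute value for key: {name}"
        (actual_type,) = value.keys()
        expected = declared.get(name)
        if actual_type != expected:
            return f"Key attribute {name} has type {actual_type}, expected {expected}"
    return None


schema = [
    {"AttributeName": "pk", "KeyType": "HASH"},
    {"AttributeName": "sk", "KeyType": "RANGE"},
]
definitions = [
    {"AttributeName": "pk", "AttributeType": "S"},
    {"AttributeName": "sk", "AttributeType": "N"},
]
error = validate_item_key_types(
    {"pk": {"S": "user#1"}, "sk": {"S": "not-a-number"}}, schema, definitions
)
```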

3) Query/Scan pagination parity

  • Make pagination behavior match AWS CLI expectations:
    • Limit — done
    • ExclusiveStartKey — done (parsed via JSON object lookup with key schema type reconstruction)
    • LastEvaluatedKey generation — FIXED: now saves key of last returned item (not next unread item); only emits when more results exist
  • Add "golden" pagination tests:
    • Query w/ sort key ranges
    • Scan limit + resume loop
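
The LastEvaluatedKey rule above (key of the last returned item, emitted only when more results exist) and the scan-resume loop can be sketched as below. This is a Python illustration over a sorted in-memory key list; `scan_page` and `scan_all` are hypothetical names, and the real implementation iterates RocksDB rather than a list.

```python
import bisect


def scan_page(sorted_keys, limit, exclusive_start_key=None):
    """Return (page, last_evaluated_key) for one page of a scan."""
    start = 0
    if exclusive_start_key is not None:
        # Resume strictly after the key the previous page ended on.
        start = bisect.bisect_right(sorted_keys, exclusive_start_key)
    page = sorted_keys[start:start + limit]
    has_more = start + len(page) < len(sorted_keys)
    # Key of the last *returned* item, and only when more results exist.
    last_evaluated_key = page[-1] if page and has_more else None
    return page, last_evaluated_key


def scan_all(sorted_keys, limit):
    """Resume loop of the kind a golden pagination test would exercise."""
    results, cursor = [], None
    while True:
        page, cursor = scan_page(sorted_keys, limit, cursor)
        results.extend(page)
        if cursor is None:
            return results


keys = ["a", "b", "c", "d", "e"]
first_page, cursor = scan_page(keys, limit=2)
```

A golden test can assert that concatenating all pages reproduces the full key list with no duplicates and no gaps, for several values of Limit.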

4) Expression parsing reliability

  • Remove brittle string-scanning for KeyConditionExpression extraction:
    • DONE: parse_key_condition_expression_string uses JSON object lookup (handles whitespace/ordering safely)
  • Add validation + better errors for malformed expressions
  • Expand operator coverage: BETWEEN and begins_with are implemented in parser
  • Sort key condition filtering in query: DONE. query() now accepts optional Sort_Key_Condition and applies it (=, <, <=, >, >=, BETWEEN, begins_with)
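
One way to picture the sort key condition check listed above is the Python sketch below. The `SortKeyCondition` class and `matches` function are hypothetical stand-ins for the Odin `Sort_Key_Condition` type; keys are compared as raw bytes, which is an assumption that suits S and B keys only (N keys would need numeric comparison).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SortKeyCondition:
    op: str                        # "=", "<", "<=", ">", ">=", "BETWEEN", "begins_with"
    value: bytes
    upper: Optional[bytes] = None  # only used by BETWEEN


def matches(sort_key, cond):
    """Return True when sort_key satisfies cond; a missing cond matches all."""
    if cond is None:
        return True
    if cond.op == "=":
        return sort_key == cond.value
    if cond.op == "<":
        return sort_key < cond.value
    if cond.op == "<=":
        return sort_key <= cond.value
    if cond.op == ">":
        return sort_key > cond.value
    if cond.op == ">=":
        return sort_key >= cond.value
    if cond.op == "BETWEEN":
        if cond.upper is None:
            raise ValueError("BETWEEN requires an upper bound")
        return cond.value <= sort_key <= cond.upper  # inclusive on both ends
    if cond.op == "begins_with":
        return sort_key.startswith(cond.value)
    raise ValueError(f"unsupported operator: {cond.op}")


in_range = matches(b"2024-02", SortKeyCondition("BETWEEN", b"2024-01", b"2024-03"))
```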

Next (feature parity with Zig + API completeness)

5) UpdateItem / conditional logic groundwork

  • UpdateItem handler registered in router (currently returns clear "not yet supported" error)
  • Implement UpdateItem (initially minimal: SET for scalar attrs)
    • UpdateItem needs UPDATED_NEW/UPDATED_OLD response filtering for perfect parity with Dynamo
  • Add ConditionExpression support for Put/Delete/Update (start with simple comparisons)
  • Define internal "update plan" representation (parsed ops → applied mutations)
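
As a starting point for the "update plan" idea above, here is a hedged Python sketch covering only SET on top-level scalar attributes. `SetOp` and `apply_plan` are hypothetical names, and the UPDATED_OLD / UPDATED_NEW views follow the usual DynamoDB meaning (only the attributes the update touched, before and after).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SetOp:
    attr: str    # top-level attribute name
    value: dict  # DynamoDB-typed value, e.g. {"S": "x"}


def apply_plan(item, plan):
    """Apply parsed ops to a copy of item.

    Returns (new_item, updated_old, updated_new). If the plan names the
    same attribute twice, the last op wins; real DynamoDB rejects
    overlapping paths, which this sketch does not check.
    """
    new_item = dict(item)
    for op in plan:
        new_item[op.attr] = op.value
    touched = {op.attr for op in plan}
    updated_old = {a: item[a] for a in touched if a in item}
    updated_new = {a: new_item[a] for a in touched}
    return new_item, updated_old, updated_new


item = {"pk": {"S": "user#1"}, "name": {"S": "Ann"}}
plan = [SetOp("name", {"S": "Anna"}), SetOp("age", {"N": "30"})]
new_item, updated_old, updated_new = apply_plan(item, plan)
```

Separating parsing (expression to plan) from application (plan to mutations) keeps ReturnValues filtering independent of the expression grammar.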

6) Response completeness / options

  • ReturnValues handling where relevant (NONE/ALL_OLD/UPDATED_NEW etc. — even partial support is useful)
  • ProjectionExpression (return subset of attributes)
  • FilterExpression (post-query filter for Scan/Query)
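
A partial ProjectionExpression can be quite small. The Python sketch below handles only comma-separated top-level attribute names plus `#name` placeholders from ExpressionAttributeNames; nested paths and list indexes are deliberately out of scope, and the function name `project` is hypothetical.

```python
def project(item, projection_expression, expression_attribute_names=None):
    """Return the subset of item named by a simple ProjectionExpression."""
    names = expression_attribute_names or {}
    wanted = []
    for token in projection_expression.split(","):
        token = token.strip()
        if not token:
            continue
        if token.startswith("#"):
            if token not in names:
                # Real DynamoDB reports this as a ValidationException.
                raise ValueError(f"undefined attribute name placeholder: {token}")
            token = names[token]
        wanted.append(token)
    # Attributes that are absent from the item are silently skipped.
    return {name: item[name] for name in wanted if name in item}


item = {"pk": {"S": "user#1"}, "name": {"S": "Ann"}, "age": {"N": "30"}}
subset = project(item, "pk, #n", {"#n": "name"})
```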

7) Test coverage / tooling

  • Add integration tests mirroring AWS CLI script flows:
    • create table → put → get → scan → query → delete
  • Add fuzz-ish tests for:
    • JSON parsing robustness
    • expression parsing robustness
    • TLV decode failure cases (corrupt bytes)
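
The corrupt-bytes case above could be exercised with a harness like the following Python sketch. The TLV layout here (1-byte tag, 4-byte big-endian length, payload) is an assumption made up for the example and is not JormunDB's actual item encoding; the point is the property being tested: the decoder either succeeds or fails with its own error type, and never raises anything else.

```python
import random
import struct


class TLVError(Exception):
    """The only exception the decoder is allowed to raise on bad input."""


TAG_STR, TAG_LIST = 0x01, 0x02  # hypothetical tags


def encode(value):
    if isinstance(value, str):
        tag, payload = TAG_STR, value.encode("utf-8")
    elif isinstance(value, list):
        tag, payload = TAG_LIST, b"".join(encode(v) for v in value)
    else:
        raise TypeError(f"unsupported type: {type(value).__name__}")
    return struct.pack(">BI", tag, len(payload)) + payload


def _decode_at(buf, pos, end):
    if end - pos < 5:
        raise TLVError("truncated header")
    tag, length = struct.unpack_from(">BI", buf, pos)
    pos += 5
    if length > end - pos:
        raise TLVError("length exceeds buffer")
    stop = pos + length
    if tag == TAG_STR:
        try:
            return buf[pos:stop].decode("utf-8"), stop
        except UnicodeDecodeError as exc:
            raise TLVError("invalid utf-8") from exc
    if tag == TAG_LIST:
        items = []
        while pos < stop:
            child, pos = _decode_at(buf, pos, stop)
            items.append(child)
        return items, stop
    raise TLVError(f"unknown tag: {tag}")


def decode(buf):
    value, pos = _decode_at(buf, 0, len(buf))
    if pos != len(buf):
        raise TLVError("trailing bytes")
    return value


def fuzz(document, rounds=2000, seed=0):
    """Corrupt and truncate a valid encoding; count clean rejections.

    Any exception other than TLVError propagates and fails the test.
    """
    rng = random.Random(seed)
    good = encode(document)
    rejected = 0
    for _ in range(rounds):
        data = bytearray(good)
        for _ in range(rng.randint(1, 4)):
            data[rng.randrange(len(data))] = rng.randrange(256)
        cut = rng.randint(0, len(data))
        try:
            decode(bytes(data[:cut]))
        except TLVError:
            rejected += 1
    return rejected


document = ["a", ["b", "é"], []]
round_tripped = decode(encode(document))
rejected = fuzz(document)
```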

Later (big features)

These align with the "Future Enhancements" list in ARCHITECTURE.md.

8) Secondary indexes

  • Global Secondary Indexes (GSI)
  • Local Secondary Indexes (LSI)
  • Index backfill + write-path maintenance

9) Batch + transactions

  • BatchWriteItem
  • BatchGetItem
  • Transactions (TransactWriteItems / TransactGetItems)

10) Performance / ops

  • Connection reuse / keep-alive tuning
  • Bloom filters / RocksDB options tuning for common patterns
  • Optional compression policy (LZ4/Zstd knobs)
  • Parallel scan (segment scanning)

Replication / WAL

(There is a C++ shim stubbed out for WAL iteration and applying write batches.)

  • Implement WAL iterator: latest_sequence, wal_iter_next returning writebatch blob
  • Implement apply-writebatch on follower
  • Add a minimal replication test harness (leader generates N ops → follower applies → compare)
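
The shape of the harness described above (leader generates N ops, follower applies, compare) can be sketched in memory before the C++ shim exists. The Python below is a model only: `Node`, `wal_iter`, and `apply_write_batch` are hypothetical, and assigning one sequence number per batch is a simplification compared with RocksDB.

```python
import random


class Node:
    """In-memory stand-in for a store with a write-ahead log."""

    def __init__(self):
        self.data = {}
        self.wal = []  # list of (sequence, batch)

    def latest_sequence(self):
        return self.wal[-1][0] if self.wal else 0

    def _apply(self, batch):
        for op, key, value in batch:
            if op == "put":
                self.data[key] = value
            elif op == "delete":
                self.data.pop(key, None)
            else:
                raise ValueError(f"unknown op: {op}")

    def write(self, batch):
        """Leader path: log the batch, then apply it."""
        sequence = self.latest_sequence() + 1
        self.wal.append((sequence, batch))
        self._apply(batch)
        return sequence

    def wal_iter(self, after_sequence):
        """Yield (sequence, batch) for every entry newer than after_sequence."""
        for sequence, batch in self.wal:
            if sequence > after_sequence:
                yield sequence, batch

    def apply_write_batch(self, sequence, batch):
        """Follower path: refuse gaps so divergence is caught early."""
        if sequence != self.latest_sequence() + 1:
            raise ValueError(f"sequence gap: got {sequence}")
        self.wal.append((sequence, batch))
        self._apply(batch)


def run_harness(n_ops=500, seed=0):
    rng = random.Random(seed)
    leader, follower = Node(), Node()
    for _ in range(n_ops):
        key = f"k{rng.randrange(50)}"
        if rng.random() < 0.7:
            leader.write([("put", key, rng.randrange(1000))])
        else:
            leader.write([("delete", key, None)])
        # Catch up at random points to exercise partial replay.
        if rng.random() < 0.1:
            for seq, batch in leader.wal_iter(follower.latest_sequence()):
                follower.apply_write_batch(seq, batch)
    for seq, batch in leader.wal_iter(follower.latest_sequence()):
        follower.apply_write_batch(seq, batch)
    return leader, follower


leader, follower = run_harness()
```

The comparison step is then a plain equality check on the two nodes' data and latest sequence numbers.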

Housekeeping

  • Fix TODO hygiene: keep this file short and "actionable"
    • Added "Bug Fixes Applied" section documenting what changed and why
  • Add a CONTRIBUTING quick checklist (allocator rules, formatting, tests)
  • Add "known limitations" section in README (unsupported DynamoDB features)