jormun-db/TODO.md
2026-02-15 15:30:24 -05:00

JormunDB (Odin rewrite) — TODO

This tracks the rewrite from Zig (ZynamoDB) → Odin (JormunDB), and what's left to stabilize + extend.

Status Snapshot

Ported / Working (core)

  • Project layout + Makefile targets (build/run/test/fmt)
  • RocksDB bindings / integration
  • Core DynamoDB types (AttributeValue / Item / Key / TableDescription, etc.)
  • Binary key codec (varint length-prefixed segments)
  • Binary item codec (TLV encoding / decoding)
  • Storage engine: tables + CRUD + scan/query plumbing
  • Table-level RW locks (read ops shared / write ops exclusive)
  • HTTP server + request routing via X-Amz-Target
  • DynamoDB JSON (parse + serialize)
  • Expression parsing for Query key conditions (basic support)

Now (MVP correctness + polish)

Goal: “the AWS CLI works reliably for CreateTable/ListTables/PutItem/GetItem/DeleteItem/Scan/Query” with correct DynamoDB-ish responses.

1) HTTP + routing hardening

  • Audit request parsing boundaries:
    • Max body size enforcement
    • Missing/invalid headers → correct DynamoDB error types
    • Content-Type handling (be permissive but consistent)
  • Ensure all request-scoped allocations come from the request arena (no accidental long-lived allocs)
  • Standardize error responses:
    • __type formatting
    • message field consistency
    • status code mapping per error type
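
One way to keep `__type`, `message`, and status codes consistent is to derive all three from a single table. The sketch below is in Python purely for illustration (the project itself is Odin). The helper name `error_response` and the table contents are assumptions, and the type prefixes should be checked against real AWS responses before being relied on.

```python
import json

# Assumed mapping: error name -> (type prefix, HTTP status). Verify against real AWS responses.
_DDB = "com.amazonaws.dynamodb.v20120810#"
_CORAL = "com.amazon.coral.validate#"
ERROR_TABLE = {
    "ResourceNotFoundException": (_DDB, 400),
    "ResourceInUseException": (_DDB, 400),
    "ConditionalCheckFailedException": (_DDB, 400),
    "ValidationException": (_CORAL, 400),
    "InternalServerError": (_DDB, 500),
}

def error_response(name, message):
    """Return (status, body_json); unknown error names collapse to InternalServerError."""
    if name not in ERROR_TABLE:
        name = "InternalServerError"
    prefix, status = ERROR_TABLE[name]
    body = json.dumps({"__type": prefix + name, "message": message})
    return status, body
```

Routing every handler's failure path through one function like this makes the three sub-items above testable in a single place.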

2) Storage correctness edge cases

  • Table metadata durability + validation:
    • reject duplicate tables
    • reject invalid key schema (no HASH, multiple HASH, etc.)
  • Item validation against key schema:
    • missing PK/SK errors
    • type mismatch errors (S/N/B)
  • Deterministic encoding tests:
    • key codec round-trip
    • TLV item encode/decode round-trip (nested maps/lists/sets)
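
A key codec round-trip test could look roughly like the following. This is a Python sketch for illustration only. It assumes unsigned LEB128 varints as the length prefix, which may not match JormunDB's actual byte layout, and all function names are hypothetical.

```python
def encode_varint(n):
    """Unsigned LEB128: 7 bits per byte, high bit set on all but the last byte."""
    if n < 0:
        raise ValueError("varint must be non-negative")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def decode_varint(buf, pos):
    """Return (value, next_pos); raise ValueError on truncated input."""
    value, shift = 0, 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7

def encode_key(segments):
    """Concatenate segments, each prefixed with its varint-encoded length."""
    return b"".join(encode_varint(len(s)) + s for s in segments)

def decode_key(buf):
    """Split an encoded key back into its segments."""
    segments, pos = [], 0
    while pos < len(buf):
        length, pos = decode_varint(buf, pos)
        if pos + length > len(buf):
            raise ValueError("segment runs past end of key")
        segments.append(buf[pos:pos + length])
        pos += length
    return segments

def test_round_trip():
    cases = [[b"users", b"pk#1"], [b"", b"x" * 300], [b"\x00\xff", b"", b"sk"]]
    for segs in cases:
        encoded = encode_key(segs)
        assert decode_key(encoded) == segs
        assert encode_key(segs) == encoded  # same input, same bytes
```

The cases deliberately include empty segments, raw `0x00`/`0xFF` bytes, and a segment long enough to need a two-byte length prefix.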

3) Query/Scan pagination parity

  • Make pagination behavior match Zig version + AWS CLI expectations:
    • Limit
    • ExclusiveStartKey
    • LastEvaluatedKey generation (and correct key-type reconstruction)
  • Add “golden” pagination tests:
    • query w/ sort key ranges
    • scan limit + resume loop
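
The scan limit + resume loop can be modeled against an in-memory ordered key list before wiring it to the real storage engine. The Python sketch below is illustrative only and its names are hypothetical. It assumes a resume key is emitted whenever the limit is hit, even on the final page, so a client only learns it is done when no `LastEvaluatedKey` comes back.

```python
import bisect

def scan_page(sorted_keys, limit=None, exclusive_start_key=None):
    """Return (page, last_evaluated_key) over an ordered list of keys."""
    start = 0
    if exclusive_start_key is not None:
        # Resume strictly after the given key, whether or not it still exists.
        start = bisect.bisect_right(sorted_keys, exclusive_start_key)
    end = len(sorted_keys) if limit is None else min(len(sorted_keys), start + limit)
    page = sorted_keys[start:end]
    # Assumption: emit a resume key whenever the limit was hit, even on the final page.
    hit_limit = limit is not None and len(page) == limit
    last_key = page[-1] if hit_limit and page else None
    return page, last_key

def scan_all(sorted_keys, limit):
    """Client-side resume loop: keep paging until no resume key is returned."""
    seen, start_key = [], None
    while True:
        page, start_key = scan_page(sorted_keys, limit, start_key)
        seen.extend(page)
        if start_key is None:
            return seen
```

A golden test can then assert that `scan_all(keys, n)` returns every key exactly once and in order for several values of `n`, including ones that divide the key count evenly.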

4) Expression parsing reliability

  • Remove brittle string-scanning for KeyConditionExpression extraction:
    • Parse expression fields via JSON object lookup (handles whitespace/ordering safely)
  • Add validation + better errors for malformed expressions
  • Expand operator coverage as needed (BETWEEN/begins_with already planned)
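
The difference between string scanning and object lookup is easiest to see in a small example. The Python sketch below (illustration only; `extract_query_fields` is a hypothetical name) parses the request body once and reads the expression fields by key, so whitespace and field order in the JSON no longer matter.

```python
import json

def extract_query_fields(body):
    """Pull expression fields out of a Query request body by key, not by substring search."""
    try:
        req = json.loads(body)
    except json.JSONDecodeError as exc:
        raise ValueError(f"malformed JSON: {exc}") from exc
    if not isinstance(req, dict):
        raise ValueError("request body must be a JSON object")
    expr = req.get("KeyConditionExpression")
    if not isinstance(expr, str) or not expr.strip():
        raise ValueError("KeyConditionExpression must be a non-empty string")
    names = req.get("ExpressionAttributeNames", {})
    values = req.get("ExpressionAttributeValues", {})
    if not isinstance(names, dict) or not isinstance(values, dict):
        raise ValueError("ExpressionAttribute maps must be JSON objects")
    return expr, names, values
```

Each rejected shape maps naturally onto a validation error with a specific message, which covers the "better errors" item as well.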

Next (feature parity with Zig + API completeness)

5) UpdateItem / conditional logic groundwork

  • Implement UpdateItem (initially minimal: SET for scalar attrs)
  • Add ConditionExpression support for Put/Delete/Update (start with simple comparisons)
  • Define internal “update plan” representation (parsed ops → applied mutations)
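
As a starting point for the update-plan shape, here is a minimal Python sketch (illustration only; `SetOp`, `build_plan`, and `apply_plan` are hypothetical names). It handles only `SET attr = :placeholder` clauses on top-level attributes and uses naive splitting, which a real expression parser would replace.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SetOp:
    attr: str    # top-level attribute name
    value: dict  # DynamoDB-JSON value, e.g. {"S": "x"}

def build_plan(update_expr, attr_values):
    """Parse a minimal 'SET a = :v, b = :w' expression into a list of SetOp."""
    head, _, rest = update_expr.strip().partition(" ")
    if head.upper() != "SET" or not rest.strip():
        raise ValueError("only non-empty SET expressions are supported in this sketch")
    plan = []
    for clause in rest.split(","):
        name, sep, placeholder = (part.strip() for part in clause.partition("="))
        if not sep or not name or placeholder not in attr_values:
            raise ValueError(f"bad SET clause: {clause.strip()!r}")
        plan.append(SetOp(name, attr_values[placeholder]))
    return plan

def apply_plan(item, plan):
    """Return a new item with the plan applied; the input item is left untouched."""
    updated = dict(item)
    for op in plan:
        updated[op.attr] = op.value
    return updated
```

Keeping parsing and application separate means a condition check can run between the two steps, and the same plan can later drive `ReturnValues` computation.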

6) Response completeness / options

  • ReturnValues handling where relevant (NONE/ALL_OLD/UPDATED_NEW etc. — even partial support is useful)
  • ProjectionExpression (return subset of attributes)
  • FilterExpression (post-query filter for Scan/Query)

7) Test coverage / tooling

  • Add integration tests mirroring AWS CLI script flows:
    • create table → put → get → scan → query → delete
  • Add fuzz-ish tests for:
    • JSON parsing robustness
    • expression parsing robustness
    • TLV decode failure cases (corrupt bytes)
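
For the corrupt-bytes case, the property worth testing is that decoding either succeeds or fails with one well-defined error, and never crashes or reads out of bounds. The Python sketch below is illustrative only. The wire layout (1-byte tag, 4-byte big-endian length) and the tag values are assumptions, not JormunDB's actual TLV format.

```python
import random
import struct

TAG_STRING, TAG_NUMBER, TAG_BINARY = 0x01, 0x02, 0x03  # hypothetical tag values
KNOWN_TAGS = {TAG_STRING, TAG_NUMBER, TAG_BINARY}

def encode_tlv(fields):
    """Encode [(tag, value_bytes), ...] as tag + 4-byte length + value."""
    return b"".join(struct.pack(">BI", tag, len(val)) + val for tag, val in fields)

def decode_tlv(buf):
    """Decode to [(tag, value), ...]; raise ValueError on any malformed input."""
    fields, pos = [], 0
    while pos < len(buf):
        if len(buf) - pos < 5:
            raise ValueError("truncated TLV header")
        tag, length = struct.unpack_from(">BI", buf, pos)
        pos += 5
        if tag not in KNOWN_TAGS:
            raise ValueError(f"unknown tag 0x{tag:02x}")
        if length > len(buf) - pos:
            raise ValueError("value length exceeds remaining bytes")
        fields.append((tag, buf[pos:pos + length]))
        pos += length
    return fields

def fuzz_decode(rounds=1000, seed=0):
    """Overwrite one random byte per round; decoding must succeed or raise ValueError."""
    rng = random.Random(seed)
    good = encode_tlv([(TAG_STRING, b"hello"), (TAG_NUMBER, b"42")])
    rejected = 0
    for _ in range(rounds):
        corrupt = bytearray(good)
        corrupt[rng.randrange(len(corrupt))] = rng.randrange(256)
        try:
            decode_tlv(bytes(corrupt))
        except ValueError:
            rejected += 1
    return rejected
```

Any exception other than `ValueError` escaping `fuzz_decode` would indicate a decoder bug. A fixed seed keeps failures reproducible.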

Later (big features)

These align with the “Future Enhancements” list in ARCHITECTURE.md.

8) Secondary indexes

  • Global Secondary Indexes (GSI)
  • Local Secondary Indexes (LSI)
  • Index backfill + write-path maintenance

9) Batch + transactions

  • BatchWriteItem
  • BatchGetItem
  • Transactions (TransactWriteItems / TransactGetItems)

10) Performance / ops

  • Connection reuse / keep-alive tuning
  • Bloom filters / RocksDB options tuning for common patterns
  • Optional compression policy (LZ4/Zstd knobs)
  • Parallel scan (segment scanning)

Replication / WAL

(There is a C++ shim stubbed out for WAL iteration and applying write batches.)

  • Implement WAL iterator: latest_sequence, wal_iter_next returning writebatch blob
  • Implement apply-writebatch on follower
  • Add a minimal replication test harness (leader generates N ops → follower applies → compare)
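
The harness logic can be prototyped without RocksDB at all. The Python sketch below is an in-memory model for illustration only. `Node`, `batches_since`, and `apply_from` are hypothetical stand-ins, not the RocksDB or shim API. `latest_sequence` borrows its name from the item above but is likewise just a model.

```python
import random

class Node:
    """In-memory stand-in for a store with a write-ahead log (not the RocksDB API)."""
    def __init__(self):
        self.data = {}
        self.wal = []  # list of (sequence, batch); batch is a list of (op, key, value)

    def latest_sequence(self):
        return self.wal[-1][0] if self.wal else 0

    def write(self, batch):
        self.wal.append((self.latest_sequence() + 1, batch))
        self._apply(batch)

    def _apply(self, batch):
        for op, key, value in batch:
            if op == "put":
                self.data[key] = value
            else:  # "delete"
                self.data.pop(key, None)

    def batches_since(self, sequence):
        return [entry for entry in self.wal if entry[0] > sequence]

    def apply_from(self, leader):
        """Pull and apply every batch the follower has not seen yet."""
        for seq, batch in leader.batches_since(self.latest_sequence()):
            self.wal.append((seq, batch))
            self._apply(batch)

def run_harness(n_ops=200, seed=1):
    rng = random.Random(seed)
    leader, follower = Node(), Node()
    for i in range(n_ops):
        key = f"k{rng.randrange(20)}"
        op = rng.choice(["put", "delete"])
        leader.write([(op, key, f"v{i}")])
        if i % 50 == 0:
            follower.apply_from(leader)  # catch up partway through
    follower.apply_from(leader)
    return leader, follower
```

The final comparison is then `leader.data == follower.data` plus matching `latest_sequence()` values. The real harness would swap `Node` for two database instances and keep the same loop.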

Housekeeping

  • Fix TODO hygiene: keep this file short and “actionable”
  • Add a CONTRIBUTING quick checklist (allocator rules, formatting, tests)
  • Add “known limitations” section in README (unsupported DynamoDB features)