2026-02-15 15:30:24 -05:00

# JormunDB (Odin rewrite) — TODO
This tracks the rewrite from Zig (ZynamoDB) → Odin (JormunDB), and what's left to stabilize + extend.
## Status Snapshot
### ✅ Ported / Working (core)
- [x] Project layout + Makefile targets (build/run/test/fmt)
- [x] RocksDB bindings / integration
- [x] Core DynamoDB types (AttributeValue / Item / Key / TableDescription, etc.)
- [x] Binary key codec (varint length-prefixed segments)
- [x] Binary item codec (TLV encoding / decoding)
- [x] Storage engine: tables + CRUD + scan/query plumbing
- [x] Table-level RW locks (read ops shared / write ops exclusive)
- [x] HTTP server + request routing via `X-Amz-Target`
- [x] DynamoDB JSON (parse + serialize)
- [x] Expression parsing for Query key conditions (basic support)
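
For readers new to the key codec item above, here is a minimal Python sketch of varint length-prefixed segments. It is illustrative only: the real codec lives in the Odin sources, and the varint flavor (unsigned LEB128) and the names `encode_key` / `decode_key` are assumptions, not the project's API.

```python
from __future__ import annotations


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as an unsigned LEB128 varint (assumed flavor)."""
    if n < 0:
        raise ValueError("varint must be non-negative")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode a varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_key(segments: list[bytes]) -> bytes:
    """Concatenate segments, each prefixed with its varint-encoded length."""
    out = bytearray()
    for seg in segments:
        out += encode_varint(len(seg))
        out += seg
    return bytes(out)


def decode_key(buf: bytes) -> list[bytes]:
    """Split a buffer produced by encode_key back into its segments."""
    segments = []
    pos = 0
    while pos < len(buf):
        length, pos = decode_varint(buf, pos)
        end = pos + length
        if end > len(buf):
            raise ValueError("segment overruns buffer")
        segments.append(buf[pos:end])
        pos = end
    return segments
```

Length prefixes keep segment boundaries unambiguous even when a segment contains arbitrary bytes, which is why a round-trip test over empty and binary segments is a useful first check.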
---
## Now (MVP correctness + polish)
Goal: “AWS CLI works reliably for CreateTable/ListTables/PutItem/GetItem/DeleteItem/Scan/Query” with correct DynamoDB-ish responses.
### 1) HTTP + routing hardening
- [ ] Audit request parsing boundaries:
  - Max body size enforcement
  - Missing/invalid headers → correct DynamoDB error types
  - Content-Type handling (be permissive but consistent)
- [ ] Ensure **all request-scoped allocations** come from the request arena (no accidental long-lived allocs)
- [ ] Standardize error responses:
  - `__type` formatting
  - `message` field consistency
  - status code mapping per error type
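
As a reference point for the error-response item, the sketch below shows one way to centralize `__type`, `message`, and status mapping. It is written in Python for illustration, not Odin. The helper name `error_response` and the table contents are assumptions; the prefixes listed are the ones DynamoDB is commonly observed to return, and should be checked against real AWS responses before being relied on.

```python
import json

# Assumed mapping: error name -> (__type prefix, HTTP status).
# Only a handful of well-known DynamoDB errors are listed here.
ERRORS = {
    "ResourceNotFoundException": ("com.amazonaws.dynamodb.v20120810", 400),
    "ResourceInUseException": ("com.amazonaws.dynamodb.v20120810", 400),
    "ValidationException": ("com.amazon.coral.validate", 400),
    "InternalServerError": ("com.amazonaws.dynamodb.v20120810", 500),
}


def error_response(error_type, message):
    """Build (status, headers, body) for a DynamoDB-style JSON error."""
    if error_type not in ERRORS:
        # Unknown error names are reported as a generic server error.
        error_type = "InternalServerError"
    prefix, status = ERRORS[error_type]
    body = json.dumps({"__type": f"{prefix}#{error_type}", "message": message})
    headers = {"Content-Type": "application/x-amz-json-1.0"}
    return status, headers, body.encode("utf-8")
```

Routing every handler through a single helper like this makes it harder for individual operations to drift in field naming or status codes.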
### 2) Storage correctness edge cases
- [ ] Table metadata durability + validation:
  - reject duplicate tables
  - reject invalid key schema (no HASH, multiple HASH, etc.)
- [ ] Item validation against key schema:
  - missing PK/SK errors
  - type mismatch errors (S/N/B)
- [ ] Deterministic encoding tests:
  - key codec round-trip
  - TLV item encode/decode round-trip (nested maps/lists/sets)
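
The key schema and item validation rules above can be sketched as follows. This is an illustrative Python version, not the Odin implementation; the function names and error strings are hypothetical, and the input shapes follow the `KeySchema` / `AttributeDefinitions` fields of a CreateTable request.

```python
VALID_KEY_TYPES = {"HASH", "RANGE"}
VALID_ATTR_TYPES = {"S", "N", "B"}


def _attr_types(attribute_definitions):
    """Map attribute name -> declared type from AttributeDefinitions."""
    return {d.get("AttributeName"): d.get("AttributeType") for d in attribute_definitions}


def validate_key_schema(key_schema, attribute_definitions):
    """Return a list of problems; an empty list means the schema is acceptable."""
    problems = []
    types = _attr_types(attribute_definitions)
    roles = [k.get("KeyType") for k in key_schema]
    if roles.count("HASH") != 1:
        problems.append("key schema must contain exactly one HASH key")
    if roles.count("RANGE") > 1:
        problems.append("key schema may contain at most one RANGE key")
    for k in key_schema:
        name, role = k.get("AttributeName"), k.get("KeyType")
        if role not in VALID_KEY_TYPES:
            problems.append(f"unknown KeyType: {role!r}")
        if types.get(name) not in VALID_ATTR_TYPES:
            problems.append(f"key attribute {name!r} needs an S, N, or B definition")
    return problems


def validate_item_key(item, key_schema, attribute_definitions):
    """Check that an item carries every key attribute with the declared type."""
    problems = []
    types = _attr_types(attribute_definitions)
    for k in key_schema:
        name = k["AttributeName"]
        expected = types[name]
        value = item.get(name)
        if value is None:
            problems.append(f"missing key attribute {name!r}")
        elif list(value.keys()) != [expected]:
            problems.append(f"key attribute {name!r} must be of type {expected}")
    return problems
```

Returning a list of problems rather than raising on the first one makes it straightforward to assert on exact failures in tests.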
### 3) Query/Scan pagination parity
- [ ] Make pagination behavior match Zig version + AWS CLI expectations:
  - `Limit`
  - `ExclusiveStartKey`
  - `LastEvaluatedKey` generation (and correct key-type reconstruction)
- [ ] Add “golden” pagination tests:
  - query w/ sort key ranges
  - scan limit + resume loop
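
To make the "scan limit + resume loop" test concrete, here is a small Python model of the `Limit` / `ExclusiveStartKey` / `LastEvaluatedKey` contract over an in-memory sorted key list. The names `scan_page` and `scan_all` are hypothetical. The sketch assumes `LastEvaluatedKey` is emitted whenever a page fills to `Limit`, so the final page may be empty; whether JormunDB should behave exactly this way is part of the parity question above.

```python
from bisect import bisect_right


def scan_page(sorted_keys, limit=None, exclusive_start_key=None):
    """Return (page, last_evaluated_key) for one Scan-style page."""
    if limit is not None and limit < 1:
        raise ValueError("Limit must be at least 1")
    start = 0
    if exclusive_start_key is not None:
        # Resume strictly after the key the previous page ended on.
        start = bisect_right(sorted_keys, exclusive_start_key)
    end = len(sorted_keys) if limit is None else min(start + limit, len(sorted_keys))
    page = sorted_keys[start:end]
    # Assumption: a full page always reports a resume key.
    last = page[-1] if limit is not None and len(page) == limit else None
    return page, last


def scan_all(sorted_keys, limit):
    """Drive scan_page in a resume loop until no LastEvaluatedKey is returned."""
    collected = []
    start_key = None
    while True:
        page, start_key = scan_page(sorted_keys, limit, start_key)
        collected.extend(page)
        if start_key is None:
            return collected
```

A golden test can then assert that the resume loop returns every key exactly once for several `Limit` values, including one that divides the key count evenly.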
### 4) Expression parsing reliability
- [ ] Remove brittle string-scanning for `KeyConditionExpression` extraction:
  - Parse expression fields via JSON object lookup (handles whitespace/ordering safely)
- [ ] Add validation + better errors for malformed expressions
- [ ] Expand operator coverage as needed (BETWEEN/begins_with already planned)
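
The difference between string-scanning and JSON object lookup can be sketched in Python as below. The helper name `extract_query_fields` is hypothetical; the field names are the standard Query request fields. Because the body is parsed into an object first, key order and whitespace in the request no longer matter.

```python
import json


def extract_query_fields(body: bytes) -> dict:
    """Pull expression-related fields out of a Query request body."""
    try:
        req = json.loads(body)
    except ValueError as exc:  # covers JSONDecodeError and bad UTF-8
        raise ValueError(f"request body is not valid JSON: {exc}") from exc
    if not isinstance(req, dict):
        raise ValueError("request body must be a JSON object")
    expr = req.get("KeyConditionExpression")
    if not isinstance(expr, str) or not expr.strip():
        raise ValueError("KeyConditionExpression is missing or empty")
    names = req.get("ExpressionAttributeNames", {})
    values = req.get("ExpressionAttributeValues", {})
    if not isinstance(names, dict) or not isinstance(values, dict):
        raise ValueError("expression attribute maps must be JSON objects")
    return {"expression": expr.strip(), "names": names, "values": values}
```

Malformed input surfaces as a single `ValueError`, which a handler could then map to a validation-style error response.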
---
## Next (feature parity with Zig + API completeness)
### 5) UpdateItem / conditional logic groundwork
- [ ] Implement `UpdateItem` (initially minimal: SET for scalar attrs)
- [ ] Add `ConditionExpression` support for Put/Delete/Update (start with simple comparisons)
- [ ] Define internal “update plan” representation (parsed ops → applied mutations)
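
One possible shape for the "update plan" idea, sketched in Python: parse the expression into a list of operations first, then apply them to a copy of the item. Everything here is an assumption for illustration (`SetOp`, `parse_set_expression`, `apply_plan`), and the parser only handles the minimal `SET name = :placeholder` form, not the full UpdateExpression grammar.

```python
from dataclasses import dataclass


@dataclass
class SetOp:
    """One planned mutation: assign a DynamoDB-JSON value to an attribute."""
    attr: str
    value: dict


def parse_set_expression(expr, values):
    """Turn 'SET a = :x, b = :y' into a list of SetOp (minimal form only)."""
    head, _, rest = expr.strip().partition(" ")
    if head.upper() != "SET" or not rest.strip():
        raise ValueError("only non-empty SET expressions are supported")
    ops = []
    for clause in rest.split(","):
        name, sep, placeholder = clause.partition("=")
        name, placeholder = name.strip(), placeholder.strip()
        if not sep or not name or placeholder not in values:
            raise ValueError(f"bad SET clause: {clause!r}")
        ops.append(SetOp(name, values[placeholder]))
    return ops


def apply_plan(item, ops):
    """Apply ops to a copy of item; return (new_item, updated_new)."""
    new_item = dict(item)
    updated_new = {}
    for op in ops:
        new_item[op.attr] = op.value
        updated_new[op.attr] = op.value
    return new_item, updated_new
```

Separating parsing from application means a condition check can run between the two steps, and `updated_new` gives `ReturnValues` handling something to build on.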
### 6) Response completeness / options
- [ ] `ReturnValues` handling where relevant (NONE/ALL_OLD/UPDATED_NEW etc. — even partial support is useful)
- [ ] `ProjectionExpression` (return subset of attributes)
- [ ] `FilterExpression` (post-query filter for Scan/Query)
### 7) Test coverage / tooling
- [ ] Add integration tests mirroring AWS CLI script flows:
  - create table → put → get → scan → query → delete
- [ ] Add fuzz-ish tests for:
  - JSON parsing robustness
  - expression parsing robustness
  - TLV decode failure cases (corrupt bytes)
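
A fuzz-ish test for TLV decode failures might look like the Python sketch below. The wire layout used here (1-byte tag, 4-byte big-endian length, then the value) is an assumption for illustration and is not taken from JormunDB's item codec; the point is the test shape, where every truncation of a valid buffer must either decode cleanly or fail with one well-defined error.

```python
import struct

TAG_S, TAG_N, TAG_B = 0x01, 0x02, 0x03  # hypothetical tag values
KNOWN_TAGS = {TAG_S, TAG_N, TAG_B}
HEADER = struct.Struct(">BI")  # tag byte + 4-byte big-endian length


def encode_tlv(fields):
    """Encode (tag, value_bytes) pairs back to back."""
    out = bytearray()
    for tag, value in fields:
        out += HEADER.pack(tag, len(value)) + value
    return bytes(out)


def decode_tlv(buf):
    """Decode a buffer; raise ValueError on any malformed input."""
    fields = []
    pos = 0
    while pos < len(buf):
        if len(buf) - pos < HEADER.size:
            raise ValueError("truncated header")
        tag, length = HEADER.unpack_from(buf, pos)
        pos += HEADER.size
        if tag not in KNOWN_TAGS:
            raise ValueError(f"unknown tag {tag:#x}")
        if length > len(buf) - pos:
            raise ValueError("value overruns buffer")
        fields.append((tag, buf[pos:pos + length]))
        pos += length
    return fields


def count_rejected_truncations(encoded):
    """Decode every strict prefix; only ValueError is an acceptable failure."""
    rejected = 0
    for cut in range(len(encoded)):
        try:
            decode_tlv(encoded[:cut])
        except ValueError:
            rejected += 1
    return rejected
```

Any exception other than `ValueError` escaping `count_rejected_truncations` would indicate an unchecked read, which is the class of bug corrupt-byte tests are meant to catch.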
---
## Later (big features)
These align with the “Future Enhancements” list in ARCHITECTURE.md.
### 8) Secondary indexes
- [ ] Global Secondary Indexes (GSI)
- [ ] Local Secondary Indexes (LSI)
- [ ] Index backfill + write-path maintenance
### 9) Batch + transactions
- [ ] BatchWriteItem
- [ ] BatchGetItem
- [ ] Transactions (TransactWriteItems / TransactGetItems)
### 10) Performance / ops
- [ ] Connection reuse / keep-alive tuning
- [ ] Bloom filters / RocksDB options tuning for common patterns
- [ ] Optional compression policy (LZ4/Zstd knobs)
- [ ] Parallel scan (segment scanning)
---
## Replication / WAL
(There is a C++ shim stubbed out for WAL iteration and applying write batches.)
- [ ] Implement WAL iterator: `latest_sequence`, `wal_iter_next` returning writebatch blob
- [ ] Implement apply-writebatch on follower
- [ ] Add a minimal replication test harness (leader generates N ops → follower applies → compare)
---
## Housekeeping
- [ ] Fix TODO hygiene: keep this file short and “actionable”
- [ ] Add a CONTRIBUTING quick checklist (allocator rules, formatting, tests)
- [ ] Add “known limitations” section in README (unsupported DynamoDB features)