refresh todo

commit c1f72cee8b (parent 7a2f26b75d)
2026-02-15 15:30:24 -05:00

TODO.md
# JormunDB (Odin rewrite) — TODO

This tracks the rewrite from Zig (ZynamoDB) → Odin (JormunDB), and what's left to stabilize + extend.

## Status Snapshot

### ✅ Ported / Working (core)

- [x] Project layout + Makefile targets (build/run/test/fmt)
- [x] RocksDB bindings / integration
- [x] Core DynamoDB types (AttributeValue / Item / Key / TableDescription, etc.)
- [x] Binary key codec (varint length-prefixed segments)
- [x] Binary item codec (TLV encoding / decoding)
- [x] Storage engine: tables + CRUD + scan/query plumbing
- [x] Table-level RW locks (read ops shared / write ops exclusive)
- [x] HTTP server + request routing via `X-Amz-Target`
- [x] DynamoDB JSON (parse + serialize)
- [x] Expression parsing for Query key conditions (basic support)
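DynamoDB JSON wraps every attribute value in a one-key, type-tagged object (`{"S": "value"}`, `{"N": "42"}`; numbers travel as strings). A minimal sketch of the two directions, in Python purely for illustration (the project code is Odin), covering only a subset of the type tags:

```python
import json

# DynamoDB-JSON AttributeValue -> plain Python value.
# Covers S/N/BOOL/L/M; the real format also has B, SS, NS, BS, NULL.
def from_attr(av):
    tag, val = next(iter(av.items()))
    if tag == "S":    return val
    if tag == "N":    return float(val) if "." in val else int(val)  # numbers are strings on the wire
    if tag == "BOOL": return val
    if tag == "L":    return [from_attr(v) for v in val]
    if tag == "M":    return {k: from_attr(v) for k, v in val.items()}
    raise ValueError(f"unsupported type tag: {tag}")

# Plain Python value -> DynamoDB-JSON AttributeValue.
def to_attr(v):
    if isinstance(v, bool):         return {"BOOL": v}   # bool check must precede int
    if isinstance(v, str):          return {"S": v}
    if isinstance(v, (int, float)): return {"N": str(v)}
    if isinstance(v, list):         return {"L": [to_attr(x) for x in v]}
    if isinstance(v, dict):         return {"M": {k: to_attr(x) for k, x in v.items()}}
    raise ValueError("unsupported value")

item = json.loads('{"pk": {"S": "user#1"}, "age": {"N": "42"}}')
plain = {k: from_attr(v) for k, v in item.items()}
```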
---

## Now (MVP correctness + polish)

Goal: “aws cli works reliably for CreateTable/ListTables/PutItem/GetItem/DeleteItem/Scan/Query” with correct DynamoDB-ish responses.

### 1) HTTP + routing hardening

- [ ] Audit request parsing boundaries:
  - Max body size enforcement
  - Missing/invalid headers → correct DynamoDB error types
  - Content-Type handling (be permissive but consistent)
- [ ] Ensure **all request-scoped allocations** come from the request arena (no accidental long-lived allocs)
- [ ] Standardize error responses:
  - `__type` formatting
  - `message` field consistency
  - status code mapping per error type
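For the error-response standardization above: DynamoDB's convention is HTTP 400 for client errors, with a body carrying a namespaced `__type` plus a `message`. A sketch of the shape (the status table is an illustrative subset, not the full error catalog):

```python
import json

# Error type -> HTTP status, following DynamoDB's convention:
# client errors are 400, server faults 500. (Illustrative subset.)
ERROR_STATUS = {
    "ResourceNotFoundException": 400,
    "ValidationException": 400,
    "ConditionalCheckFailedException": 400,
    "InternalServerError": 500,
}
PREFIX = "com.amazonaws.dynamodb.v20120810#"

def error_response(err_type, message):
    status = ERROR_STATUS.get(err_type, 400)
    body = json.dumps({"__type": PREFIX + err_type, "message": message})
    return status, body

status, body = error_response("ResourceNotFoundException", "Requested resource not found")
```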
### 2) Storage correctness edge cases

- [ ] Table metadata durability + validation:
  - reject duplicate tables
  - reject invalid key schema (no HASH, multiple HASH, etc.)
- [ ] Item validation against key schema:
  - missing PK/SK errors
  - type mismatch errors (S/N/B)
- [ ] Deterministic encoding tests:
  - key codec round-trip
  - TLV item encode/decode round-trip (nested maps/lists/sets)
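The key-codec round-trip above can be sketched language-agnostically. This is the generic varint (LEB128-style) length-prefix scheme; the Odin codec's exact byte layout may differ:

```python
# Varint (LEB128) length-prefixed key segments: each segment is
# encoded as varint(len) ++ bytes, and segments are concatenated.
def encode_varint(n):
    out = bytearray()
    while True:
        b = n & 0x7F
        n >>= 7
        out.append(b | (0x80 if n else 0))
        if not n:
            return bytes(out)

def decode_varint(buf, i):
    n = shift = 0
    while True:
        b = buf[i]; i += 1
        n |= (b & 0x7F) << shift
        if not (b & 0x80):
            return n, i
        shift += 7

def encode_key(segments):
    out = bytearray()
    for seg in segments:
        out += encode_varint(len(seg)) + seg
    return bytes(out)

def decode_key(buf):
    segs, i = [], 0
    while i < len(buf):
        n, i = decode_varint(buf, i)
        segs.append(buf[i:i + n]); i += n
    return segs

key = encode_key([b"users", b"user#1", b"2024"])
```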
### 3) Query/Scan pagination parity

- [ ] Make pagination behavior match Zig version + AWS CLI expectations:
  - `Limit`
  - `ExclusiveStartKey`
  - `LastEvaluatedKey` generation (and correct key-type reconstruction)
- [ ] Add “golden” pagination tests:
  - query w/ sort key ranges
  - scan limit + resume loop
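The `Limit`/`ExclusiveStartKey`/`LastEvaluatedKey` contract can be pinned down with a tiny in-memory model, useful as a reference when writing the golden tests (resume is strictly *after* the last returned key):

```python
# In-memory stand-in for a table scan: page through sorted keys with
# Limit / ExclusiveStartKey / LastEvaluatedKey semantics.
def scan_page(sorted_keys, limit, exclusive_start_key=None):
    start = 0
    if exclusive_start_key is not None:
        # resume strictly after the last key the client saw
        start = next((i for i, k in enumerate(sorted_keys)
                      if k > exclusive_start_key), len(sorted_keys))
    page = sorted_keys[start:start + limit]
    # no LastEvaluatedKey once the page reaches the end of the data
    last = page[-1] if start + limit < len(sorted_keys) else None
    return page, last

keys = [f"k{i:02d}" for i in range(7)]
seen, cursor = [], None
while True:
    page, cursor = scan_page(keys, 3, cursor)
    seen += page
    if cursor is None:
        break
```

(The real service may also return a `LastEvaluatedKey` on a page that happens to end exactly at the last item, followed by one empty page; the tests should decide which behavior to match.)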
### 4) Expression parsing reliability

- [ ] Remove brittle string-scanning for `KeyConditionExpression` extraction:
  - Parse expression fields via JSON object lookup (handles whitespace/ordering safely)
- [ ] Add validation + better errors for malformed expressions
- [ ] Expand operator coverage as needed (BETWEEN/begins_with already planned)
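Sketch of the JSON-lookup approach: parse the body once, then read expression fields from the resulting object, so whitespace and field ordering stop mattering. The tokenizer regex is a toy illustration only:

```python
import json, re

body = ('{ "TableName": "users",\n'
        '  "KeyConditionExpression": "pk = :pk AND sk > :sk",\n'
        '  "ExpressionAttributeValues": {":pk": {"S": "user#1"}, ":sk": {"N": "100"}} }')

req = json.loads(body)               # field order / whitespace no longer matter
expr = req["KeyConditionExpression"]
values = req.get("ExpressionAttributeValues", {})

# Tiny tokenizer for `attr <op> :value [AND ...]` conditions (illustrative only).
TOKEN = re.compile(r"begins_with|BETWEEN|AND|[<>]=?|=|\(|\)|,|:\w+|[\w#]+")
tokens = TOKEN.findall(expr)
```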
---
## Next (feature parity with Zig + API completeness)

### 5) UpdateItem / conditional logic groundwork

- [ ] Implement `UpdateItem` (initially minimal: SET for scalar attrs)
- [ ] Full `UpdateExpression` parser later (SET / REMOVE / ADD / DELETE operations)
- [ ] Add `ConditionExpression` support for Put/Delete/Update (start with simple comparisons)
- [ ] Define internal “update plan” representation (parsed ops → applied mutations)

### 6) Response completeness / options

- [ ] `ReturnValues` handling where relevant (NONE/ALL_OLD/UPDATED_NEW, etc.; even partial support is useful)
- [ ] `ProjectionExpression` (return subset of attributes)
- [ ] `FilterExpression` (post-query filter for Scan/Query)

### 7) Test coverage / tooling

- [ ] Add integration tests mirroring AWS CLI script flows:
  - create table → put → get → scan → query → delete
- [ ] Add fuzz-ish tests for:
  - JSON parsing robustness
  - expression parsing robustness
  - TLV decode failure cases (corrupt bytes)
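For the TLV corrupt-bytes cases: the property the fuzz tests should pin down is "never read past the buffer; reject truncated headers and over-long length fields". A toy decoder showing both failure modes (the one-byte tag and one-byte length layout here is an assumption, not the project's actual item format):

```python
# Tiny TLV (tag, length, value) decoder that validates every length
# against the remaining buffer instead of trusting it.
def decode_tlv(buf):
    fields, i = [], 0
    while i < len(buf):
        if i + 2 > len(buf):
            raise ValueError("truncated TLV header")
        tag, length = buf[i], buf[i + 1]
        i += 2
        if i + length > len(buf):
            raise ValueError("length runs past end of buffer")
        fields.append((tag, buf[i:i + length]))
        i += length
    return fields

ok = bytes([0x01, 3]) + b"abc" + bytes([0x02, 1]) + b"\x07"
corrupt = bytes([0x01, 200]) + b"abc"   # claims 200 value bytes, only 3 present
```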
---

## Later (big features)

These align with the “Future Enhancements” list in ARCHITECTURE.md.

### 8) Secondary indexes

- [ ] Global Secondary Indexes (GSI)
- [ ] Local Secondary Indexes (LSI)
- [ ] Index backfill + write-path maintenance

### 9) Batch + transactions

- [ ] BatchWriteItem
- [ ] BatchGetItem
- [ ] Transactions (TransactWriteItems / TransactGetItems)

### 10) Performance / ops

- [ ] Connection reuse / keep-alive tuning
- [ ] Bloom filters / RocksDB options tuning for common patterns
- [ ] Optional compression policy (LZ4/Zstd knobs)
- [ ] Parallel scan (segment scanning)
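Parallel scan splits one scan into disjoint slices (DynamoDB exposes this as the `Segment`/`TotalSegments` parameters). One simple way to carve the keyspace, hashing each partition key to a segment; the hashing scheme is an assumption here, not DynamoDB's actual partitioning:

```python
import hashlib

# Assign each partition key to exactly one of TOTAL segments so that
# independent workers can scan their segments in parallel.
def segment_of(pk, total_segments):
    h = int.from_bytes(hashlib.md5(pk.encode()).digest()[:8], "big")
    return h % total_segments

keys = [f"user#{i}" for i in range(100)]
TOTAL = 4
segments = {s: [k for k in keys if segment_of(k, TOTAL) == s] for s in range(TOTAL)}
```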
---

## Replication / WAL

(There is a C++ shim stubbed out for WAL iteration and applying write batches.)

- [ ] Implement WAL iterator: `latest_sequence`, `wal_iter_next` returning a writebatch blob
- [ ] Implement apply-writebatch on the follower
- [ ] Add a minimal replication test harness (leader generates N ops → follower applies → compare)
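The test harness can be modeled before the C++ shim lands: a leader appends `(sequence, op)` records to a log, the follower replays everything past its last applied sequence, and the two states must converge. The op shapes are hypothetical, just to fix the structure of the test:

```python
# Apply one mutation to a key-value state. "put"/"delete" op kinds
# are hypothetical stand-ins for decoded RocksDB write batches.
def apply(state, op):
    kind, key, value = op
    if kind == "put":
        state[key] = value
    elif kind == "delete":
        state.pop(key, None)

leader_state, log = {}, []
ops = [("put", "a", 1), ("put", "b", 2), ("delete", "a", None), ("put", "c", 3)]
for seq, op in enumerate(ops, start=1):
    apply(leader_state, op)
    log.append((seq, op))           # leader's durable change log

follower_state, applied_seq = {}, 0
for seq, op in log:
    if seq > applied_seq:           # resume cursor: skip already-applied entries
        apply(follower_state, op)
        applied_seq = seq
```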
---

## Housekeeping

- [ ] Fix TODO hygiene: keep this file short and actionable
- [ ] Add a CONTRIBUTING quick checklist (allocator rules, formatting, tests)
- [ ] Add a “known limitations” section in README (unsupported DynamoDB features)
- [ ] Benchmark suite
## 🔧 Build & Tooling
- [ ] Verify Makefile works on macOS
- [ ] Verify Makefile works on Linux
- [ ] Add Docker support (optional)
- [ ] Add install script
## 📚 Documentation
- [ ] Code comments for public APIs
- [ ] Usage examples in README
- [ ] API compatibility matrix
- [ ] Performance tuning guide
## 🎯 Priority Order

1. **MVP correctness** - HTTP/routing hardening, storage edge cases, pagination parity
2. **Expression parsing** - Replace string scanning, widen operator coverage
3. **UpdateItem + conditions** - Remaining write-path parity with the Zig version
4. **Indexes, batch, transactions** - API completeness
5. **Replication / WAL** - Leader/follower via the C++ shim
## 📝 Notes
### Zig → Odin Translation Patterns
**Memory Management:**
```zig
// Zig
const item = try allocator.create(Item);
defer allocator.destroy(item);
```
```odin
// Odin
item := new(Item)  // allocates from context.allocator
// No per-item free/defer needed when the context allocator is an arena:
// free_all(context.allocator) releases everything at once
```
**Error Handling:**
```zig
// Zig
fn foo() !Result {
    return error.Failed;
}
const x = try foo();
```
```odin
// Odin
foo :: proc() -> (res: Result, ok: bool) {
    return {}, false
}
// `or_return` works inside a caller whose own final return value is a bool
x := foo() or_return
```
**Slices:**
```zig
// Zig
const slice: []const u8 = data;
```
```odin
// Odin
slice: []byte = data
```
**Maps:**
```zig
// Zig
var map = std.StringHashMap(Value).init(allocator);
defer map.deinit();
```
```odin
// Odin
m := make(map[string]Value)  // `map` is a reserved word in Odin; use another name
defer delete(m)
```
### Key Decisions
1. **Use `Maybe(T)` instead of `?T`** - Odin's optional type
2. **Use `or_return` instead of `try`** - Odin's error propagation
3. **Use `context.allocator`** - Implicit allocator from context
4. **Use `#partial switch`** - For union type checking
5. **Use `transmute`** - For zero-cost type conversions
### Reference Zig Files
When implementing, reference these Zig files:
- `src/dynamodb/json.zig` - 400 lines, DynamoDB JSON format
- `src/dynamodb/storage.zig` - 460 lines, storage engine
- `src/dynamodb/handler.zig` - 500+ lines, request handlers
- `src/item_codec.zig` - 350 lines, TLV encoding
- `src/http.zig` - 250 lines, HTTP server
### Quick Test Commands
```bash
# Build and test
make build
make test
# Run server
make run
# Test with AWS CLI
aws dynamodb list-tables --endpoint-url http://localhost:8002
```