refresh todo

2026-02-15 15:30:24 -05:00
parent 7a2f26b75d
commit c1f72cee8b

TODO.md

# JormunDB (Odin rewrite) — TODO
This tracks the rewrite from Zig (ZynamoDB) → Odin (JormunDB), and what's left to stabilize + extend.
## Status Snapshot
### ✅ Ported / Working (core)
- [x] Project layout + Makefile targets (build/run/test/fmt)
- [x] RocksDB bindings / integration
- [x] Core DynamoDB types (AttributeValue / Item / Key / TableDescription, etc.)
- [x] Binary key codec (varint length-prefixed segments)
- [x] Binary item codec (TLV encoding / decoding)
- [x] Storage engine: tables + CRUD + scan/query plumbing
- [x] Table-level RW locks (read ops shared / write ops exclusive)
- [x] HTTP server + request routing via `X-Amz-Target`
- [x] DynamoDB JSON (parse + serialize)
- [x] Expression parsing for Query key conditions (basic support)
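The binary key codec listed above (varint length-prefixed segments) can be illustrated language-agnostically. A minimal Python sketch — the real codec is Odin, and the LEB128-style varint flavor shown here is an assumption, not necessarily the encoding `key_codec.odin` uses:

```python
def encode_varint(n: int) -> bytes:
    # Base-128 varint, low 7 bits first, high bit = "more bytes follow".
    out = bytearray()
    while True:
        b = n & 0x7F
        n >>= 7
        if n:
            out.append(b | 0x80)
        else:
            out.append(b)
            return bytes(out)

def encode_key(segments: list) -> bytes:
    # Each segment is length-prefixed so the key parses unambiguously.
    out = bytearray()
    for seg in segments:
        out += encode_varint(len(seg)) + seg
    return bytes(out)

def decode_key(data: bytes) -> list:
    segs, i = [], 0
    while i < len(data):
        n, shift = 0, 0
        while True:
            b = data[i]; i += 1
            n |= (b & 0x7F) << shift
            shift += 7
            if not (b & 0x80):
                break
        segs.append(data[i:i + n]); i += n
    return segs
```

The round-trip property (`decode_key(encode_key(x)) == x`) is exactly what the "deterministic encoding tests" below should pin down.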
---
## Now (MVP correctness + polish)
Goal: “aws cli works reliably for CreateTable/ListTables/PutItem/GetItem/DeleteItem/Scan/Query” with correct DynamoDB-ish responses.
### 1) HTTP + routing hardening
- [ ] Audit request parsing boundaries:
- Max body size enforcement
- Missing/invalid headers → correct DynamoDB error types
- Content-Type handling (be permissive but consistent)
- [ ] Ensure **all request-scoped allocations** come from the request arena (no accidental long-lived allocs)
- [ ] Standardize error responses:
- `__type` formatting
- `message` field consistency
- status code mapping per error type
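The error-standardization items above could be centralized in one helper. A Python sketch: the `com.amazonaws.dynamodb.v20120810#` prefix matches the real low-level DynamoDB wire format, but the error catalog and status mapping below are an assumed subset (real DynamoDB is also inconsistent about `message` vs `Message` capitalization — worth picking one and sticking to it):

```python
import json

# Assumed subset of the DynamoDB low-level error catalog.
ERROR_STATUS = {
    "ResourceNotFoundException": 400,
    "ValidationException": 400,
    "ConditionalCheckFailedException": 400,
    "InternalServerError": 500,
}

def error_response(kind: str, message: str):
    """Build (status, body) the way the low-level DynamoDB API formats errors."""
    body = json.dumps({
        "__type": "com.amazonaws.dynamodb.v20120810#" + kind,
        "message": message,
    })
    return ERROR_STATUS.get(kind, 400), body
```

Routing every handler's failures through one function like this makes the `__type`/status mapping a single table instead of per-handler string formatting.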
### 2) Storage correctness edge cases
- [ ] Table metadata durability + validation:
- reject duplicate tables
- reject invalid key schema (no HASH, multiple HASH, etc.)
- [ ] Item validation against key schema:
- missing PK/SK errors
- type mismatch errors (S/N/B)
- [ ] Deterministic encoding tests:
- key codec round-trip
- TLV item encode/decode round-trip (nested maps/lists/sets)
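The TLV round-trip property above is the one worth testing first. A Python sketch of the Tag-Length-Value shape — the tag byte values here are invented for illustration; the real tags live in `item_codec.odin`:

```python
import struct

# Hypothetical tag bytes -- the actual Odin item_codec defines its own.
TAG_S, TAG_N = 0x01, 0x02

def encode_attr(tag: int, value: str) -> bytes:
    # Tag (1 byte), Length (u32 big-endian), Value.
    raw = value.encode()
    return struct.pack(">BI", tag, len(raw)) + raw

def decode_attr(data: bytes):
    # Returns (tag, value, remaining bytes) so attrs can be decoded in sequence.
    tag, length = struct.unpack_from(">BI", data)
    start = struct.calcsize(">BI")
    return tag, data[start:start + length].decode(), data[start + length:]
```

The "corrupt bytes" fuzz cases in the testing section below amount to feeding `decode_attr` truncated or over-long `Length` fields and asserting a clean error rather than a crash.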
### 3) Query/Scan pagination parity
- [ ] Make pagination behavior match Zig version + AWS CLI expectations:
- `Limit`
- `ExclusiveStartKey`
- `LastEvaluatedKey` generation (and correct key-type reconstruction)
- [ ] Add “golden” pagination tests:
- query w/ sort key ranges
- scan limit + resume loop
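The "scan limit + resume loop" golden test above is, from the client side, this loop (Python sketch; the field names mirror the DynamoDB API, but `scan_page`/`fake_scan` are hypothetical stand-ins, and the integer `LastEvaluatedKey` below stands in for the real key map):

```python
def scan_all(scan_page, limit=25):
    """Drain a paginated Scan: re-issue with ExclusiveStartKey until
    LastEvaluatedKey is absent from the response."""
    items, start_key = [], None
    while True:
        page = scan_page(Limit=limit, ExclusiveStartKey=start_key)
        items.extend(page["Items"])
        start_key = page.get("LastEvaluatedKey")
        if start_key is None:
            return items

def fake_scan(Limit, ExclusiveStartKey=None):
    # In-memory stand-in for one Scan request over 10 items.
    data = [{"pk": {"N": str(i)}} for i in range(10)]
    start = ExclusiveStartKey or 0
    page = {"Items": data[start:start + Limit]}
    if start + Limit < len(data):
        page["LastEvaluatedKey"] = start + Limit
    return page
```

The parity requirement is that the server's `LastEvaluatedKey` reconstructs to the same key types it was built from, so the client can pass it back verbatim as `ExclusiveStartKey`.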
### 4) Expression parsing reliability
- [ ] Remove brittle string-scanning for `KeyConditionExpression` extraction:
- Parse expression fields via JSON object lookup (handles whitespace/ordering safely)
- [ ] Add validation + better errors for malformed expressions
- [ ] Expand operator coverage as needed (BETWEEN/begins_with already planned)
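The "JSON object lookup instead of string scanning" point above can be sketched as follows (Python, illustrative only; the tiny `name op :value` grammar and the regex are assumptions — the real parser should grow BETWEEN/begins_with as planned):

```python
import json, re

def parse_key_condition(request_body: str):
    """Pull KeyConditionExpression out of the parsed request body (robust to
    whitespace/field ordering), then split on AND into (name, op, placeholder)
    triples. Deliberately tiny grammar: `name op :value`."""
    req = json.loads(request_body)
    expr = req["KeyConditionExpression"]
    names = req.get("ExpressionAttributeNames", {})
    conds = []
    for clause in re.split(r"\s+AND\s+", expr.strip()):
        m = re.fullmatch(r"(#?\w+)\s*(<=|>=|=|<|>)\s*(:\w+)", clause)
        if m is None:
            raise ValueError("malformed clause: " + repr(clause))
        name, op, placeholder = m.groups()
        # Resolve #aliases through ExpressionAttributeNames.
        conds.append((names.get(name, name), op, placeholder))
    return conds
```

Because the expression is fetched from the decoded JSON object, reordered fields or extra whitespace in the request body can no longer break extraction.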
---
## Next (feature parity with Zig + API completeness)
### 5) UpdateItem / conditional logic groundwork
- [ ] Implement `UpdateItem` (initially minimal: SET for scalar attrs)
- [ ] Add `ConditionExpression` support for Put/Delete/Update (start with simple comparisons)
- [ ] Define internal “update plan” representation (parsed ops → applied mutations)
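The "update plan" idea above — parsed ops as data, applied as mutations — might look like this (Python sketch; SET-only for scalar attrs, matching the minimal first cut, with all names here hypothetical):

```python
def plan_update(update_expression: str):
    """Parse a minimal `SET a = :v, b = :w` into ("SET", name, placeholder)
    ops -- the internal update plan, separate from applying it."""
    expr = update_expression.strip()
    if not expr.upper().startswith("SET "):
        raise ValueError("only SET is supported in this sketch")
    ops = []
    for assignment in expr[4:].split(","):
        name, _, placeholder = assignment.partition("=")
        ops.append(("SET", name.strip(), placeholder.strip()))
    return ops

def apply_plan(item: dict, ops, values: dict) -> dict:
    """Apply a plan against a stored item, resolving :placeholders through
    ExpressionAttributeValues. Returns a new item; caller persists it."""
    out = dict(item)
    for op, name, placeholder in ops:
        if op == "SET":
            out[name] = values[placeholder]
    return out
```

Keeping plan and application separate is also what makes `ConditionExpression` tractable later: evaluate the condition against the old item, then apply the already-parsed plan only if it passes.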
### 6) Response completeness / options
- [ ] `ReturnValues` handling where relevant (NONE/ALL_OLD/UPDATED_NEW etc. — even partial support is useful)
- [ ] `ProjectionExpression` (return subset of attributes)
- [ ] `FilterExpression` (post-query filter for Scan/Query)
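Of the three items above, `ProjectionExpression` is the simplest to stage first: it is a pure post-processing step on the result item. A Python sketch (top-level attribute names only; document paths like `a.b[0]` are deliberately out of scope here):

```python
def project(item: dict, projection_expression=None) -> dict:
    """Apply a comma-separated ProjectionExpression, returning only the
    requested top-level attributes. None means return everything."""
    if projection_expression is None:
        return item
    wanted = {p.strip() for p in projection_expression.split(",")}
    return {k: v for k, v in item.items() if k in wanted}
```

`FilterExpression` slots into the same place in the pipeline (after the key-condition query, before pagination accounting), which is a subtlety worth a test: in real DynamoDB, `Limit` counts items *before* the filter is applied.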
### 7) Test coverage / tooling
- [ ] Add integration tests mirroring AWS CLI script flows:
- create table → put → get → scan → query → delete
- [ ] Add fuzz-ish tests for:
- JSON parsing robustness
- expression parsing robustness
- TLV decode failure cases (corrupt bytes)
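The CLI-mirroring flow above reduces, on the wire, to a sequence of POSTs distinguished by `X-Amz-Target` (the `DynamoDB_20120810.` prefix and the `application/x-amz-json-1.0` content type are the real DynamoDB conventions). A Python sketch of the request construction — sending them against a running server (e.g. `--endpoint-url http://localhost:8002` with the AWS CLI) is left to the harness:

```python
import json

def build_request(operation: str, payload: dict):
    """Headers + body for one step of the integration flow."""
    headers = {
        "Content-Type": "application/x-amz-json-1.0",
        "X-Amz-Target": "DynamoDB_20120810." + operation,
    }
    return headers, json.dumps(payload).encode()

# The create -> put -> get -> scan -> delete flow as (operation, payload) pairs.
# (Minimal payloads; real CreateTable calls carry more fields.)
FLOW = [
    ("CreateTable", {"TableName": "t",
                     "KeySchema": [{"AttributeName": "pk", "KeyType": "HASH"}],
                     "AttributeDefinitions": [{"AttributeName": "pk",
                                               "AttributeType": "S"}]}),
    ("PutItem",    {"TableName": "t", "Item": {"pk": {"S": "a"}}}),
    ("GetItem",    {"TableName": "t", "Key":  {"pk": {"S": "a"}}}),
    ("Scan",       {"TableName": "t"}),
    ("DeleteItem", {"TableName": "t", "Key":  {"pk": {"S": "a"}}}),
]
```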
---
## Later (big features)
These align with the “Future Enhancements” list in ARCHITECTURE.md.
### 8) Secondary indexes
- [ ] Global Secondary Indexes (GSI)
- [ ] Local Secondary Indexes (LSI)
- [ ] Index backfill + write-path maintenance
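The "write-path maintenance" item above is the subtle part of indexes: every base-table put must also reconcile the index keyspace. A Python sketch of the mutation derivation (all names hypothetical; in JormunDB the index entries would presumably live in their own RocksDB keyspace built with the same key codec):

```python
def index_mutations(table_key, old_item, new_item, index_attr):
    """On PutItem: delete the stale index entry if the indexed attribute
    changed, then write the new one. Returns (deletes, puts) as lists of
    (index_attr, index_value, base_table_key) tuples."""
    deletes, puts = [], []
    old_v = old_item.get(index_attr) if old_item else None
    new_v = new_item.get(index_attr) if new_item else None
    if old_v is not None and old_v != new_v:
        deletes.append((index_attr, old_v, table_key))
    if new_v is not None and new_v != old_v:
        puts.append((index_attr, new_v, table_key))
    return deletes, puts
```

Backfill is then the same function run over a full scan with `old_item=None`, which is one argument for writing it this way.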
### 9) Batch + transactions
- [ ] BatchWriteItem
- [ ] BatchGetItem
- [ ] Transactions (TransactWriteItems / TransactGetItems)
### 10) Performance / ops
- [ ] Connection reuse / keep-alive tuning
- [ ] Bloom filters / RocksDB options tuning for common patterns
- [ ] Optional compression policy (LZ4/Zstd knobs)
- [ ] Parallel scan (segment scanning)
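For parallel scan, one deterministic way to make `Segment`/`TotalSegments` partition the keyspace is to hash keys into segments (Python sketch; real DynamoDB splits its internal hash keyspace into ranges rather than hashing per key, so this is an assumed simplification):

```python
import hashlib

def segment_of(key: bytes, total_segments: int) -> int:
    """Deterministically assign a key to one of N scan segments."""
    h = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    return h % total_segments

def scan_segment(keys, segment: int, total_segments: int):
    """One worker's view: only the keys that hash into its segment."""
    return [k for k in keys if segment_of(k, total_segments) == segment]
```

The invariant to test is that the segments are disjoint and their union is the full scan, so N workers can run concurrently without coordination.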
---
## Replication / WAL
(There is a C++ shim stubbed out for WAL iteration and applying write batches.)
- [ ] Implement WAL iterator: `latest_sequence`, `wal_iter_next` returning writebatch blob
- [ ] Implement apply-writebatch on follower
- [ ] Add a minimal replication test harness (leader generates N ops → follower applies → compare)
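The harness above can be shaped before the C++ shim lands by faking the WAL in memory (Python sketch; `latest_sequence`/`wal_iter_next` mirror the names in the checklist, and a "write batch" is reduced to a dict of puts — the real one is a RocksDB write-batch blob):

```python
class Leader:
    """In-memory stand-in for the RocksDB WAL: commits append (seq, batch)
    pairs and wal_iter_next serves them by sequence number."""
    def __init__(self):
        self.log = []
    def commit(self, batch):
        self.log.append((len(self.log) + 1, batch))
    def latest_sequence(self):
        return len(self.log)
    def wal_iter_next(self, after_seq):
        for seq, batch in self.log:
            if seq > after_seq:
                return seq, batch
        return None

class Follower:
    def __init__(self):
        self.state, self.applied_seq = {}, 0
    def apply(self, seq, batch):
        self.state.update(batch)
        self.applied_seq = seq

def catch_up(leader, follower):
    # Pull-and-apply until the follower reaches the leader's latest sequence.
    while (nxt := leader.wal_iter_next(follower.applied_seq)) is not None:
        follower.apply(*nxt)
```

The "leader generates N ops → follower applies → compare" test is then a state-equality assertion, and resuming from `applied_seq` after a restart falls out of the same loop.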
---
## Housekeeping
- [ ] Fix TODO hygiene: keep this file short and “actionable”
- [ ] Add a CONTRIBUTING quick checklist (allocator rules, formatting, tests)
- [ ] Add “known limitations” section in README (unsupported DynamoDB features)