|
|
f597247f56
|
Rename vm.vultrlabs.dev → victoriametrics.vultrlabs.dev
|
2026-04-09 19:33:58 +00:00 |
|
|
|
bf6d62b9a8
|
Add VictoriaMetrics for historical metrics (Mar 13+)
- Single-node VictoriaMetrics deployment with 200Gi NVMe, 2y retention
- Traefik IngressRoute at vm.vultrlabs.dev (TLS + basic auth)
- Backfill script: pulls vLLM/DCGM metrics from Mimir, writes to VictoriaMetrics
- Retain StorageClass so historical data survives PVC deletion
- README with deployment + Grafana mixed-datasource instructions
|
2026-04-09 19:29:18 +00:00 |
|
|
|
7ade5ecac8
|
Clean slate: 1h block sizes, remove backfill artifacts
- Changed all namespace block sizes to 1h (was 2h/12h/24h in manifests,
30d+ in the live cluster due to backfill-era bufferPast hacks)
- Deleted entire backfill/ directory (scripts, pods, runbooks)
- Removed stale 05-m3coordinator.yaml (had backfill namespaces)
- Added 05-m3coordinator-deployment.yaml to kustomization
- Fixed init job health check (/health instead of /api/v1/services/m3db/health)
- Updated .env.example (removed Mimir credentials)
- Added 'Why Backfill Doesn't Work' section to README
|
2026-04-09 19:00:08 +00:00 |
|
|
|
1af29e8f09
|
Tweaks to backfill and Grafana
|
2026-04-01 15:21:10 +00:00 |
|
|
|
a6c59d6a65
|
Replace LB with Traefik ingress for TLS + basic auth
- Remove m3coordinator LoadBalancer service (was using deprecated AutoSSL)
- Add Traefik ingress controller with Let's Encrypt ACME
- Add basic auth middleware for external access
- Update test scripts with auth support and fixed protobuf encoding
- Add multi-tenancy documentation (label-based isolation)
- Update README with Traefik deployment instructions
|
2026-04-01 05:19:14 +00:00 |
|
|
|
5eb58d1864
|
Update README with working m3db.vultrlabs.dev endpoint
|
2026-04-01 02:44:07 +00:00 |
|
|
|
5f4cd46bc3
|
Add backend-protocol annotation to m3coordinator-lb
LB now properly speaks HTTP to the coordinator backend
|
2026-04-01 02:43:41 +00:00 |
|
|
|
d35cd2d7d4
|
Update test scripts to accept full URL instead of LB_IP
- test-metrics.sh and test-metrics.py now take a full URL with port
- Supports both HTTP and HTTPS endpoints
- Updated README with new usage examples
|
2026-04-01 02:38:47 +00:00 |
|
|
|
a8469f79d7
|
Fix m3dbnode port conflict, update README, fix test script
- Remove duplicate db.metrics section (port 7203 conflict)
- Fix coordinator health endpoint (/health not /api/v1/services/m3db/health)
- Update README: remove NodePort references, always use LoadBalancer
- Add bootstrap instructions (workaround for init job chicken-and-egg)
- Fix test-metrics.sh: correct health endpoint and JSON parsing
|
2026-03-31 15:49:59 +00:00 |
|
|
|
ac13c30905
|
Initial commit
|
2026-03-31 08:28:16 -04:00 |
|