# M3DB on Vultr Kubernetes Engine
Drop-in Mimir replacement using M3DB for long-term Prometheus metrics storage, deployed on Vultr VKE with Vultr Block Storage CSI.
## Architecture
```
                        ┌───────────────────────────────────────────────────────────┐
                        │                     Vultr VKE Cluster                     │
                        │                                                           │
External Prometheus ────┼──remote_write──▶ Vultr LoadBalancer (m3coordinator-lb)    │
External Grafana ───────┼──PromQL query──▶ │   (managed, provisioned by CCM)        │
                        │                  │                                        │
In-cluster Prometheus ──┼──remote_write──▶ M3 Coordinator (Deployment, 2 replicas)  │
In-cluster Grafana ─────┼──PromQL query──▶ │                                        │
                        │                  │                                        │
                        │          ┌───────┴───────┐                                │
                        │          │  M3DB Nodes   │  (StatefulSet, 3 replicas)     │
                        │          │  Vultr Block  │  (100Gi NVMe per node)         │
                        │          │  Storage      │                                │
                        │          └───────┬───────┘                                │
                        │                  │                                        │
                        │                  etcd cluster (StatefulSet, 3 replicas)   │
                        └───────────────────────────────────────────────────────────┘
```
## Retention Tiers
| Namespace      | Resolution | Retention | Use Case                   |
|----------------|------------|-----------|----------------------------|
| `default`      | raw        | 48h       | Real-time queries          |
| `agg_10s_30d`  | 10s        | 30 days   | Recent dashboards          |
| `agg_1m_1y`    | 1m         | 1 year    | Long-term trends/capacity  |
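Once the cluster is bootstrapped (see Deployment below), you can confirm that all three tiers were registered by listing the namespaces the coordinator knows about. A minimal check, assuming `jq` is installed locally:
```bash
# Expect the three namespaces from the table above.
kubectl -n m3db exec m3dbnode-0 -- \
  curl -s http://localhost:7201/api/v1/services/m3db/namespace | jq '.registry.namespaces | keys'
```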
## Deployment
```bash
# 1. Apply everything
kubectl apply -k .
# 2. Wait for all pods to be Running
kubectl -n m3db get pods -w
# 3. Bootstrap the cluster (placement + namespaces)
# The init job waits for coordinator health, which requires m3db to be bootstrapped.
# Bootstrap directly via m3dbnode's embedded coordinator:
kubectl -n m3db exec m3dbnode-0 -- curl -s -X POST http://localhost:7201/api/v1/services/m3db/placement/init \
-H "Content-Type: application/json" -d '{
"num_shards": 64,
"replication_factor": 3,
"instances": [
{"id": "m3dbnode-0", "isolation_group": "zone-a", "zone": "embedded", "weight": 100, "endpoint": "m3dbnode-0.m3dbnode.m3db.svc.cluster.local:9000", "hostname": "m3dbnode-0", "port": 9000},
{"id": "m3dbnode-1", "isolation_group": "zone-b", "zone": "embedded", "weight": 100, "endpoint": "m3dbnode-1.m3dbnode.m3db.svc.cluster.local:9000", "hostname": "m3dbnode-1", "port": 9000},
{"id": "m3dbnode-2", "isolation_group": "zone-c", "zone": "embedded", "weight": 100, "endpoint": "m3dbnode-2.m3dbnode.m3db.svc.cluster.local:9000", "hostname": "m3dbnode-2", "port": 9000}
]
}'
kubectl -n m3db exec m3dbnode-0 -- curl -s -X POST http://localhost:7201/api/v1/services/m3db/namespace \
-H "Content-Type: application/json" -d '{"name":"default","options":{"bootstrapEnabled":true,"flushEnabled":true,"writesToCommitLog":true,"cleanupEnabled":true,"snapshotEnabled":true,"repairEnabled":false,"retentionOptions":{"retentionPeriodDuration":"48h","blockSizeDuration":"2h","bufferFutureDuration":"10m","bufferPastDuration":"10m"},"indexOptions":{"enabled":true,"blockSizeDuration":"2h"}}}'
kubectl -n m3db exec m3dbnode-0 -- curl -s -X POST http://localhost:7201/api/v1/services/m3db/namespace \
-H "Content-Type: application/json" -d '{"name":"agg_10s_30d","options":{"bootstrapEnabled":true,"flushEnabled":true,"writesToCommitLog":true,"cleanupEnabled":true,"snapshotEnabled":true,"retentionOptions":{"retentionPeriodDuration":"720h","blockSizeDuration":"12h","bufferFutureDuration":"10m","bufferPastDuration":"10m"},"indexOptions":{"enabled":true,"blockSizeDuration":"12h"},"aggregationOptions":{"aggregations":[{"aggregated":true,"attributes":{"resolutionDuration":"10s"}}]}}}'
kubectl -n m3db exec m3dbnode-0 -- curl -s -X POST http://localhost:7201/api/v1/services/m3db/namespace \
-H "Content-Type: application/json" -d '{"name":"agg_1m_1y","options":{"bootstrapEnabled":true,"flushEnabled":true,"writesToCommitLog":true,"cleanupEnabled":true,"snapshotEnabled":true,"retentionOptions":{"retentionPeriodDuration":"8760h","blockSizeDuration":"24h","bufferFutureDuration":"10m","bufferPastDuration":"10m"},"indexOptions":{"enabled":true,"blockSizeDuration":"24h"},"aggregationOptions":{"aggregations":[{"aggregated":true,"attributes":{"resolutionDuration":"1m"}}]}}}'
# 4. Wait for bootstrapping to complete ("bootstrapped": true in the health output; shards become AVAILABLE in the placement)
kubectl -n m3db exec m3dbnode-0 -- curl -s http://localhost:9002/health
# 5. Get the LoadBalancer IP
kubectl -n m3db get svc m3coordinator-lb
```
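Two optional follow-ups, depending on the M3 version baked into the manifests. Newer releases (roughly v1.0+) expect each namespace to be marked ready before the coordinator serves it, and shard state is easiest to inspect via the placement API. A hedged sketch:
```bash
# Mark namespaces ready (only needed on M3 versions that gate traffic on readiness).
for ns in default agg_10s_30d agg_1m_1y; do
  kubectl -n m3db exec m3dbnode-0 -- curl -s -X POST http://localhost:7201/api/v1/services/m3db/namespace/ready \
    -H "Content-Type: application/json" -d "{\"name\": \"$ns\"}"
done

# Count shards by state; everything should end up AVAILABLE (INITIALIZING means bootstrap is still running).
kubectl -n m3db exec m3dbnode-0 -- curl -s http://localhost:7201/api/v1/services/m3db/placement | \
  jq '[.placement.instances[].shards[].state] | group_by(.) | map({state: .[0], count: length})'
```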
## Testing
**Quick connectivity test:**
```bash
./test-metrics.sh <LB_IP>
```
This script verifies:
1. Coordinator health endpoint responds
2. Placement is configured with all 3 m3dbnode instances
3. All 3 namespaces are created (default, agg_10s_30d, agg_1m_1y)
4. PromQL queries work
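If the script isn't handy, rough manual equivalents against the LoadBalancer look like this (`<LB_IP>` comes from `kubectl -n m3db get svc m3coordinator-lb`):
```bash
LB_IP=<LB_IP>

# 1. Coordinator health
curl -s "http://${LB_IP}:7201/health"

# 2. Placement contains all three m3dbnode instances
curl -s "http://${LB_IP}:7201/api/v1/services/m3db/placement" | jq '.placement.instances | keys'

# 3. All three namespaces exist
curl -s "http://${LB_IP}:7201/api/v1/services/m3db/namespace" | jq '.registry.namespaces | keys'

# 4. PromQL answers queries
curl -s "http://${LB_IP}:7201/api/v1/query?query=up"
```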
**Full read/write test (Python):**
```bash
pip install requests python-snappy
python3 test-metrics.py <LB_IP>
```
Writes a test metric via Prometheus remote_write and reads it back.
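To see the round trip from outside the cluster, you can query the series the script wrote back over a recent window. The metric name below (`m3db_write_test`) is a stand-in; substitute whatever name the script actually writes:
```bash
# Read the last 10 minutes of the test series through the coordinator's PromQL range API.
end=$(date +%s); start=$((end - 600))
curl -s "http://<LB_IP>:7201/api/v1/query_range?query=m3db_write_test&start=${start}&end=${end}&step=15s"
```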
## Prometheus Configuration (Replacing Mimir)
Update your Prometheus config to point at M3 Coordinator.
**In-cluster (same VKE cluster):**
```yaml
# prometheus.yml
remote_write:
  - url: "http://m3coordinator.m3db.svc.cluster.local:7201/api/v1/prom/remote/write"
    queue_config:
      capacity: 10000
      max_shards: 30
      max_samples_per_send: 5000
      batch_send_deadline: 5s

remote_read:
  - url: "http://m3coordinator.m3db.svc.cluster.local:7201/api/v1/prom/remote/read"
    read_recent: true
```
**External (cross-region/cross-cluster):**
```yaml
# prometheus.yml
remote_write:
  - url: "http://<LB-IP>:7201/api/v1/prom/remote/write"
    queue_config:
      capacity: 10000
      max_shards: 30
      max_samples_per_send: 5000
      batch_send_deadline: 5s

remote_read:
  - url: "http://<LB-IP>:7201/api/v1/prom/remote/read"
    read_recent: true
```
Get the LoadBalancer IP:
```bash
kubectl -n m3db get svc m3coordinator-lb
```
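Once Prometheus is pointed at the coordinator, it's worth confirming that remote_write is keeping up. One way, assuming you can reach the Prometheus HTTP API (the URL below is a placeholder, and metric names vary slightly across Prometheus versions), is to watch its own remote-storage metrics:
```bash
PROM=http://<your-prometheus>:9090

# Failed samples should stay at or near zero once the queue settles.
curl -s -G "${PROM}/api/v1/query" --data-urlencode 'query=rate(prometheus_remote_storage_samples_failed_total[5m])'

# Shard count should sit well below the max_shards configured above.
curl -s -G "${PROM}/api/v1/query" --data-urlencode 'query=prometheus_remote_storage_shards'
```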
## Grafana Datasource
Add a **Prometheus** datasource in Grafana pointing to:
- **In-cluster:** `http://m3coordinator.m3db.svc.cluster.local:7201`
- **External:** `http://<LB-IP>:7201`
All existing PromQL dashboards will work without modification.
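If you provision datasources through Grafana's HTTP API instead of the UI, a minimal sketch looks like this (the Grafana host and credentials are placeholders):
```bash
# Create a Prometheus-type datasource pointing at the M3 Coordinator.
curl -s -X POST "http://admin:<password>@<grafana-host>:3000/api/datasources" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "M3DB",
    "type": "prometheus",
    "access": "proxy",
    "url": "http://m3coordinator.m3db.svc.cluster.local:7201",
    "isDefault": false
  }'
```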
## Migration from Mimir
1. **Dual-write phase**: Configure Prometheus to remote_write to both Mimir and M3DB simultaneously.
2. **Validation**: Compare query results between Mimir and M3DB for the same time ranges (see the sketch after this list).
3. **Cutover**: Once retention in M3DB covers your needs, remove the Mimir remote_write target.
4. **Cleanup**: Decommission Mimir components.
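For the validation step, one rough approach is to run the same instant query against both backends and diff the results; the Mimir URL and tenant header below are placeholders for whatever your current deployment uses:
```bash
QUERY='sum(rate(node_cpu_seconds_total{mode!="idle"}[5m]))'

# Same PromQL against M3 (via the coordinator LoadBalancer)...
curl -s -G "http://<LB-IP>:7201/api/v1/query" --data-urlencode "query=${QUERY}" | jq '.data.result'

# ...and against Mimir (path and X-Scope-OrgID depend on your setup).
curl -s -G "http://<mimir-host>/prometheus/api/v1/query" \
  -H "X-Scope-OrgID: <tenant>" \
  --data-urlencode "query=${QUERY}" | jq '.data.result'
```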
## Tuning for Vultr
- **Storage**: The `vultr-block-storage-m3db` StorageClass uses `disk_type: nvme` (NVMe SSD). Adjust `storage` in the VolumeClaimTemplates based on your cardinality and retention.
- **Node sizing**: M3DB is memory-hungry; use Vultr nodes with at least 8 GB of RAM. The manifest requests 4Gi per m3dbnode pod.
- **Shards**: The init job creates 64 shards across 3 nodes. For higher cardinality, increase to 128 or 256.
- **Volume expansion**: The StorageClass has `allowVolumeExpansion: true` — you can resize PVCs online via `kubectl edit pvc`.
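A minimal expansion sketch, assuming the PVC names follow the usual StatefulSet pattern (check `kubectl -n m3db get pvc` for the real names):
```bash
# Grow one data volume from 100Gi to 200Gi; the Vultr CSI driver resizes the block volume online.
kubectl -n m3db patch pvc m3dbnode-data-m3dbnode-0 \
  -p '{"spec":{"resources":{"requests":{"storage":"200Gi"}}}}'

# Watch the new capacity show up.
kubectl -n m3db get pvc m3dbnode-data-m3dbnode-0 -w
```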
## Useful Commands
```bash
# Get LoadBalancer IP
kubectl -n m3db get svc m3coordinator-lb
# Check cluster health (from inside cluster)
kubectl -n m3db exec m3dbnode-0 -- curl -s http://m3coordinator.m3db.svc.cluster.local:7201/health
# Check placement (from inside cluster)
kubectl -n m3db exec m3dbnode-0 -- curl -s http://m3coordinator.m3db.svc.cluster.local:7201/api/v1/services/m3db/placement | jq
# Check m3dbnode bootstrapped status
kubectl -n m3db exec m3dbnode-0 -- curl -s http://localhost:9002/health
# Query via PromQL (external)
curl "http://<LB-IP>:7201/api/v1/query?query=up"
# Delete the init job to re-run (if needed)
kubectl -n m3db delete job m3db-cluster-init
kubectl apply -f 06-init-and-pdb.yaml
```