# M3DB on Vultr Kubernetes Engine

Drop-in Mimir replacement using M3DB for long-term Prometheus metrics storage, deployed on Vultr VKE with the Vultr Block Storage CSI driver.

## Architecture
```
Prometheus ──remote_write──▶  M3 Coordinator (Deployment, 2 replicas)
Grafana ──PromQL query─────▶         │
                             ┌───────┴───────┐
                             │  M3DB Nodes   │ (StatefulSet, 3 replicas)
                             │  Vultr Block  │ (100Gi SSD per node)
                             │  Storage      │
                             └───────┬───────┘
                                     │
                          etcd cluster (StatefulSet, 3 replicas)
```
## Retention Tiers

| Namespace | Resolution | Retention | Use Case |
|---|---|---|---|
| `default` | raw | 48h | Real-time queries |
| `agg_10s_30d` | 10s | 30 days | Recent dashboards |
| `agg_1m_1y` | 1m | 1 year | Long-term trends/capacity |
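To get a feel for what each tier stores, a quick back-of-the-envelope sample count per series can be computed from resolution and retention. This is a rough sketch; the 15s scrape interval assumed for the raw tier is an assumption, not something taken from the manifests:

```python
# Rough per-series datapoint counts for each retention tier.
# NOTE: the 15s raw resolution is an assumed Prometheus scrape
# interval, not a value from this repo's manifests.
def samples_per_series(resolution_s: float, retention_s: float) -> int:
    """Datapoints one series contributes over the full retention window."""
    return int(retention_s // resolution_s)

HOUR, DAY = 3600, 86400
tiers = {
    "default (raw @15s)": samples_per_series(15, 48 * HOUR),
    "agg_10s_30d":        samples_per_series(10, 30 * DAY),
    "agg_1m_1y":          samples_per_series(60, 365 * DAY),
}
for name, count in tiers.items():
    print(f"{name}: {count:,} samples per series")
```

Multiply by your active series count to gauge how `storage` in the volumeClaimTemplates should scale.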
## Deployment

```bash
# 1. Apply everything (the init job won't succeed until the pods are up)
kubectl apply -k .

# 2. Wait for all pods to be Ready
kubectl -n m3db get pods -w

# 3. Once all m3dbnode and m3coordinator pods are Running, the init job
#    will bootstrap the cluster (placement + namespaces). Monitor it:
kubectl -n m3db logs -f job/m3db-cluster-init

# 4. Verify cluster health
kubectl -n m3db port-forward svc/m3coordinator 7201:7201
curl http://localhost:7201/api/v1/services/m3db/placement
curl http://localhost:7201/api/v1/services/m3db/namespace
```
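A cluster is healthy once every shard in the placement reaches `AVAILABLE`. A minimal sketch of that check, assuming the JSON shape returned by `GET /api/v1/services/m3db/placement` (field names may vary between M3 versions):

```python
# Check that every shard in an M3DB placement response is AVAILABLE.
# ASSUMPTION: the response shape {"placement": {"instances": {...}}}
# with per-instance "shards" lists, as returned by the placement API.
def all_shards_available(placement: dict) -> bool:
    instances = placement.get("placement", {}).get("instances", {})
    return all(
        shard.get("state") == "AVAILABLE"
        for inst in instances.values()
        for shard in inst.get("shards", [])
    )

# Stubbed two-node placement mid-bootstrap:
sample = {
    "placement": {
        "instances": {
            "m3dbnode-0": {"shards": [{"id": 0, "state": "AVAILABLE"}]},
            "m3dbnode-1": {"shards": [{"id": 1, "state": "INITIALIZING"}]},
        }
    }
}
print(all_shards_available(sample))  # False while a node is still bootstrapping
```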
## Prometheus Configuration (Replacing Mimir)

Update your Prometheus config to point at the M3 Coordinator instead of Mimir:
```yaml
# prometheus.yml
remote_write:
  - url: "http://m3coordinator.m3db.svc.cluster.local:7201/api/v1/prom/remote/write"
    queue_config:
      capacity: 10000
      max_shards: 30
      max_samples_per_send: 5000
      batch_send_deadline: 5s

remote_read:
  - url: "http://m3coordinator.m3db.svc.cluster.local:7201/api/v1/prom/remote/read"
    read_recent: true
```
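The `queue_config` above bounds how fast Prometheus can push samples. A simplified ceiling estimate, assuming each shard completes one batch per network round trip (a simplification; real throughput also depends on coordinator-side latency and backpressure):

```python
# Back-of-the-envelope remote_write throughput ceiling.
# ASSUMPTION: each shard sends one full batch per round trip; the
# 100ms round-trip time below is illustrative, not measured.
def max_samples_per_sec(max_shards: int, max_samples_per_send: int,
                        round_trip_s: float) -> float:
    return max_shards * max_samples_per_send / round_trip_s

# With the config above (30 shards, 5000 samples/batch) and a 100ms RTT:
print(max_samples_per_sec(30, 5000, 0.1))  # 1500000.0 samples/s upper bound
```

If your ingest rate approaches this ceiling, raise `max_shards` before raising `max_samples_per_send`.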
## Grafana Datasource

Add a Prometheus datasource in Grafana pointing to:

`http://m3coordinator.m3db.svc.cluster.local:7201`

All existing PromQL dashboards will work without modification.
## Migration from Mimir

1. **Dual-write phase**: Configure Prometheus to remote_write to both Mimir and M3DB simultaneously.
2. **Validation**: Compare query results between Mimir and M3DB for the same time ranges.
3. **Cutover**: Once retention in M3DB covers your needs, remove the Mimir remote_write target.
4. **Cleanup**: Decommission Mimir components.
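During the dual-write phase, Prometheus simply lists both targets. A sketch of the relevant `remote_write` section; the Mimir service name and `/api/v1/push` path shown are typical defaults and should be adjusted to your deployment:

```yaml
# prometheus.yml (dual-write phase)
remote_write:
  # Existing Mimir target -- service name and push path are assumptions;
  # use whatever your current Mimir setup exposes.
  - url: "http://mimir-gateway.mimir.svc.cluster.local/api/v1/push"
  # New M3DB target
  - url: "http://m3coordinator.m3db.svc.cluster.local:7201/api/v1/prom/remote/write"
```

Each `remote_write` entry gets its own queue, so a slow target does not block the other.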
## Tuning for Vultr

- **Storage**: The `vultr-block-storage-m3db` StorageClass uses `high_perf` (NVMe SSD). Adjust `storage` in the volumeClaimTemplates based on your cardinality and retention.
- **Node sizing**: M3DB is memory-hungry. Recommend at least 8GB RAM nodes on Vultr. The manifest requests 4Gi per m3dbnode pod.
- **Shards**: The init job creates 64 shards across 3 nodes. For higher cardinality, increase to 128 or 256.
- **Volume expansion**: The StorageClass has `allowVolumeExpansion: true`; you can resize PVCs online via `kubectl edit pvc`.
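When weighing the shard count, it helps to see how shard replicas land on nodes. A small sketch, assuming replication factor 3 (a common M3DB default; the actual RF is whatever the init job's placement request sets):

```python
# Shard replicas owned by each node: every shard is replicated rf
# times across the cluster.
# ASSUMPTION: rf=3 below; confirm against the init job's placement spec.
def shard_replicas_per_node(shards: int, rf: int, nodes: int) -> float:
    return shards * rf / nodes

for shards in (64, 128, 256):
    per_node = shard_replicas_per_node(shards, rf=3, nodes=3)
    print(f"{shards} shards -> {per_node:.0f} shard replicas per node")
```

More shards mean finer-grained rebalancing when nodes are added, at the cost of more per-shard overhead; this is why higher-cardinality clusters move to 128 or 256.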
## Useful Commands

```bash
# Check placement
curl http://localhost:7201/api/v1/services/m3db/placement | jq

# Check namespace readiness
curl http://localhost:7201/api/v1/services/m3db/namespace/ready \
  -d '{"name":"default"}'

# Write a test metric: the remote write endpoint expects a
# snappy-compressed protobuf payload, so raw curl is not practical;
# point a Prometheus remote_write at the coordinator instead.

# Query via PromQL
curl "http://localhost:7201/api/v1/query?query=up"

# Delete the init job to re-run it (if needed)
kubectl -n m3db delete job m3db-cluster-init
kubectl apply -f 06-init-and-pdb.yaml
```